Short Text Understanding Through Lexical-Semantic Analysis

  • Wen Hua
  • Zhongyuan Wang
  • Haixun Wang
  • Kai Zheng
  • Xiaofang Zhou

International Conference on Data Engineering (ICDE)

Best Paper Award

Understanding short texts is crucial to many applications, but challenges abound. First, short texts do not always observe the syntax of a written language. As a result, traditional natural language processing methods cannot be easily applied. Second, short texts usually do not contain sufficient statistical signals to support many state-of-the-art approaches for text processing such as topic modeling. Third, short texts are usually more ambiguous. We argue that knowledge is needed in order to better understand short texts. In this work, we use lexical-semantic knowledge provided by a well-known semantic network for short text understanding. Our knowledge-intensive approach disrupts traditional methods for tasks such as text segmentation, part-of-speech tagging, and concept labeling, in the sense that we focus on semantics in all these tasks. We conduct a comprehensive performance evaluation on real-life data. The results show that knowledge is indispensable for short text understanding, and our knowledge-intensive approaches are effective in harvesting semantics of short texts.

Thank you for your interest in this paper. Please also see our ACL 2016 tutorial on short text understanding, "Understanding Short Texts – ACL 2016 Tutorial", presented by Zhongyuan Wang and Haixun Wang.