ACM Transactions on Asian Language Information Processing (TALIP), Volume 9 Issue 2, June 2010

A Unified Character-Based Tagging Framework for Chinese Word Segmentation
Hai Zhao, Chang-Ning Huang, Mu Li, Bao-Liang Lu
Article No.: 5
DOI: 10.1145/1781134.1781135

Chinese word segmentation is an active area of Chinese language processing, although it suffers from disagreement about what precisely constitutes a word in Chinese. We base this study on a corpus-based segmentation standard. In detail, we...

A Linguistically Inspired Statistical Model for Chinese Punctuation Generation
Yuqing Guo, Haifeng Wang, Josef van Genabith
Article No.: 6
DOI: 10.1145/1781134.1781136

This article investigates a relatively underdeveloped subject in natural language processing: the generation of punctuation marks. From a theoretical perspective, we study 16 Chinese punctuation marks as defined in the Chinese national standard...

Topic-Dependent Language Model with Voting on Noun History
Welly Naptali, Masatoshi Tsuchiya, Seiichi Nakagawa
Article No.: 7
DOI: 10.1145/1781134.1781137

Language models (LMs) are an important field of study in automatic speech recognition (ASR) systems. An LM helps acoustic models find the word sequence corresponding to a given speech signal. Without it, ASR systems would not understand the language...