ACM Transactions on Asian Language Information Processing (TALIP), Volume 6 Issue 2, September 2007

A phonetic similarity model for automatic extraction of transliteration pairs
Jin-Shea Kuo, Haizhou Li, Ying-Kuei Yang
Article No.: 6
DOI: 10.1145/1282080.1282081

This article proposes an approach for the automatic extraction of transliteration pairs from Chinese Web corpora. In this approach, we formulate the machine transliteration process using a syllable-based phonetic similarity model which consists of...

The study of a nonstationary maximum entropy Markov model and its application on the pos-tagging task
Jinghui Xiao, Xiaolong Wang, Bingquan Liu
Article No.: 7
DOI: 10.1145/1282080.1282082

Sequence labeling is a core task in natural language processing. The maximum entropy Markov model (MEMM) is a powerful tool in performing this task. This article enhances the traditional MEMM by exploiting the positional information of language...

Interactive high-dimensional index for large Chinese calligraphic character databases
Yi Zhuang, Yueting Zhuang, Qing Li, Lei Chen
Article No.: 8
DOI: 10.1145/1282080.1282083

The large numbers of Chinese calligraphic scripts in existence are a valuable part of the Chinese cultural heritage. However, due to the shape complexity of these characters, it is hard to employ existing techniques to effectively retrieve and...