ACM Transactions on Asian Language Information Processing (TALIP), Volume 10 Issue 4, December 2011

Improved Chinese-English SMT with Chinese “DE” Construction Classification and Reordering
Jinhua Du, Andy Way
Article No.: 17
DOI: 10.1145/2025384.2025385

Syntactic reordering on the source side has been demonstrated to be helpful and effective for handling different word orders between source and target languages in SMT. In this article, we focus on the Chinese (DE) construction, which is flexible...

Language Modeling for Syntax-Based Machine Translation Using Tree Substitution Grammars: A Case Study on Chinese-English Translation
Tong Xiao, Jingbo Zhu, Muhua Zhu
Article No.: 18
DOI: 10.1145/2025384.2025386

The poor grammatical output of Machine Translation (MT) systems calls for syntax-based approaches within language modeling. However, previous studies showed that syntax-based language modeling using (Context-Free) Treebank Grammars was not very...

Mining English-Chinese Named Entity Pairs from Comparable Corpora
Lishuang Li, Peng Wang, Degen Huang, Lian Zhao
Article No.: 19
DOI: 10.1145/2025384.2025387

Bilingual Named Entity (NE) pairs are valuable resources for many NLP applications. Since comparable corpora are more accessible, abundant and up-to-date, recent research has concentrated on mining bilingual lexicons using comparable corpora....

User Behaviors in Related Word Retrieval and New Word Detection: A Collaborative Perspective
Zhiyuan Liu, Yabin Zheng, Lixing Xie, Maosong Sun, Liyun Ru, Yang Zhang
Article No.: 20
DOI: 10.1145/2025384.2025388

Nowadays, user behavior analysis and collaborative filtering have drawn a large body of research in the machine learning community. The goal is either to enhance the user experience or to discover useful information hidden in the data. In this...

Deep Learning Approaches to Semantic Relevance Modeling for Chinese Question-Answer Pairs
Baoxun Wang, Bingquan Liu, Xiaolong Wang, Chengjie Sun, Deyuan Zhang
Article No.: 21
DOI: 10.1145/2025384.2025389

The human-generated question-answer pairs in Web social communities are of great value for research on automatic question-answering techniques. Due to the large amount of noisy information in such corpora, it is still a problem to...