ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP)



ACM Transactions on Asian Language Information Processing (TALIP), Volume 8 Issue 4, December 2009

Introduction to the Special Issue on Arabic Natural Language Processing
K. Shaalan, A. Farghaly
Article No.: 13
DOI: 10.1145/1644879.1644880

Arabic Natural Language Processing: Challenges and Solutions
Ali Farghaly, Khaled Shaalan
Article No.: 14
DOI: 10.1145/1644879.1644881

The Arabic language presents researchers and developers of natural language processing (NLP) applications for Arabic text and speech with serious challenges. The purpose of this article is to describe some of these challenges and to present some...

Discriminative Phrase-Based Models for Arabic Machine Translation
Cristina España-Bonet, Jesús Giménez, Lluís Màrquez
Article No.: 15
DOI: 10.1145/1644879.1644882

A design for an Arabic-to-English translation system is presented. The core of the system implements a standard phrase-based statistical machine translation architecture, but it is extended by incorporating a local discriminative phrase selection...

Morphology-Based Segmentation Combination for Arabic Mention Detection
Yassine Benajiba, Imed Zitouni
Article No.: 16
DOI: 10.1145/1644879.1644883

The Arabic language has a very rich/complex morphology. Each Arabic word is composed of zero or more prefixes, one stem and zero or more suffixes. Consequently, the Arabic data is sparse compared to other languages such as...

Cross-Language Information Propagation for Arabic Mention Detection
Imed Zitouni, Radu Florian
Article No.: 17
DOI: 10.1145/1644879.1644884

In the last two decades, significant effort has been put into annotating linguistic resources in several languages. Despite this valiant effort, there are still many languages left that have only small amounts of such resources. The goal of this...

Automatic Speech-to-Text Transcription in Arabic
Lori Lamel, Abdelkhalek Messaoudi, Jean-Luc Gauvain
Article No.: 18
DOI: 10.1145/1644879.1644885

The Arabic language presents a number of challenges for speech recognition, arising in part from the significant differences in the spoken and written forms, in particular the conventional form of texts being non-vowelized. Being a highly...

Sura Length and Lexical Probability Estimation in Cluster Analysis of the Qur’an
Hermann Moisl
Article No.: 19
DOI: 10.1145/1644879.1644886

Thabet [2005] applied cluster analysis to the Qur’an in the hope of generating a classification of the suras that is useful for understanding its thematic structure. The result was positive, but variation in sura length was a problem...