Asian and Low-Resource Language Information Processing (TALLIP)


ACM Transactions on Asian Language Information Processing (TALIP), Volume 2 Issue 3, September 2003

Surprise! What's in a Cebuano or Hindi Name?
Jonathan May, Ada Brunstein, Prem Natarajan, Ralph Weischedel
Pages: 169-180
DOI: 10.1145/979872.979873
Empirical results are presented for creating training data and training a statistical name learning algorithm on Cebuano and Hindi in roughly three weeks' time. The empirical study compares performance in a compressed time frame against performance of...

Hindi-English cross-lingual question-answering system
Satoshi Sekine, Ralph Grishman
Pages: 181-192
DOI: 10.1145/979872.979874
We developed a cross-lingual, question-answering (CLQA) system for Hindi and English. It accepts questions in English, finds candidate answers in Hindi newspapers, and translates the answer candidates into English along with the context surrounding...

Adaptive Hindi OCR using generalized Hausdorff image comparison
Huanfeng Ma, David Doermann
Pages: 193-218
DOI: 10.1145/979872.979875
We present an adaptive Hindi OCR implemented as part of a rapidly retargetable language tool effort. The system includes: script identification, character segmentation, training sample creation, and character recognition. In script identification,...

Making MIRACLEs: Interactive translingual search for Cebuano and Hindi
Daqing He, Douglas W. Oard, Jianqiang Wang, Jun Luo, Dina Demner-Fushman, Kareem Darwish, Philip Resnik, Sanjeev Khudanpur, Michael Nossal, Michael Subotin, Anton Leuski
Pages: 219-244
DOI: 10.1145/979872.979876
Searching is inherently a user-centered process; people pose the questions for which machines seek answers, and ultimately people judge the degree to which retrieved documents meet their needs. Rapid development of interactive systems that use...

Cross-lingual C*ST*RD: English access to Hindi information
Anton Leuski, Chin-Yew Lin, Liang Zhou, Ulrich Germann, Franz Josef Och, Eduard Hovy
Pages: 245-269
DOI: 10.1145/979872.979877
We present C*ST*RD, a cross-language information delivery system that supports cross-language information retrieval, information space visualization and navigation, machine translation, and text summarization of single documents and clusters of...

Cross-language headline generation for Hindi
Bonnie Dorr, David Zajic, Richard Schwartz
Pages: 270-289
DOI: 10.1145/979872.979878
This paper presents new approaches to headline generation for English newspaper texts, with an eye toward the production of document surrogates for document selection in cross-language information retrieval. This task is difficult because the...

Rapid development of Hindi named entity recognition using conditional random fields and feature induction
Wei Li, Andrew McCallum
Pages: 290-294
DOI: 10.1145/979872.979879
This paper describes our application of conditional random fields with feature induction to a Hindi named entity recognition task. With only five days development time and little knowledge of this language, we automatically discover relevant features...

Rapid customization of an information extraction system for a surprise language
Diana Maynard, Valentin Tablan, Kalina Bontcheva, Hamish Cunningham
Pages: 295-300
DOI: 10.1145/979872.979880
This paper describes the rapid adaptation for surprise languages of a flexible and robust Information Extraction system based on GATE, a portable Natural Language Processing infrastructure. Our experiences show that even without a native speaker and...