The surprise language exercises
Douglas W. Oard
For ten days in March and twenty-nine days in June of 2003, sixteen teams in two nations sought to develop language technologies for two previously unanticipated languages: Cebuano and Hindi. This introduction to a pair of special issues explains the...
A month to topic detection and tracking in Hindi
James Allan, Victor Lavrenko, Margaret E. Connell
We describe the one-month (June 2003) effort to create a topic detection and tracking (TDT) system to support news stories in Hindi. The University of Massachusetts submitted results for three different TDT tasks in the DARPA surprise language...
Linguistic resource creation for research and technology development: A recent experiment
Stephanie Strassel, Mike Maxwell, Christopher Cieri
Advances in statistical machine learning encourage language-independent approaches to linguistic technology development. Experiments in "porting" technologies to handle new natural languages have revealed a great potential for multilingual computing,...
Rapid porting of DUSTer to Hindi
Bonnie J. Dorr, Necip Fazil Ayan, Nizar Habash, Nitin Madnani, Rebecca Hwa
The frequent occurrence of divergences (structural differences between languages) presents a great challenge for statistical word-level alignment and machine translation. This paper describes the adaptation of DUSTer, a divergence...
Extracting named entity translingual equivalence with limited resources
Fei Huang, Stephan Vogel, Alex Waibel
In this article we present an automatic approach to extracting Hindi-English (H-E) Named Entity (NE) translingual equivalences from bilingual parallel corpora. In the absence of a Hindi NE tagger or H-E translation dictionary, this approach adapts a...
Hindi CLIR in thirty days
Leah S. Larkey, Margaret E. Connell, Nasreen Abduljaleel
As participants in the TIDES Surprise language exercise, researchers at the University of Massachusetts helped collect Hindi-English resources and developed a cross-language information retrieval system. Components included normalization, stop-word...
Experiments with a Hindi-to-English transfer-based MT system under a miserly data scenario
Alon Lavie, Stephan Vogel, Lori Levin, Erik Peterson, Katharina Probst, Ariadna Font Llitjós, Rachel Reynolds, Jaime Carbonell, Richard Cohen
We describe an experiment designed to evaluate the capabilities of our trainable transfer-based (Xfer) machine translation approach, as applied to the task of Hindi-to-English translation, and trained under an extremely limited data scenario. We...
Cross-lingual retrieval for Hindi
Jinxi Xu, Ralph Weischedel
In this paper we describe the evaluation results of applying a cross-lingual retrieval model to retrieve Hindi documents relevant to an English query. Though the technique has been previously applied and evaluated for retrieving Chinese and Arabic...