BIKE: Bilingual Keyphrase Experiments

Nadeau, David and Barrière, Caroline and Foster, George (2005) BIKE: Bilingual Keyphrase Experiments. [Conference Paper]

This paper presents a novel strategy for translating lists of keyphrases. Typical keyphrase lists appear in scientific articles, information retrieval systems and web page meta-data. Our system combines a statistical translation model trained on a bilingual corpus of scientific papers with sense-focused look-up in a large bilingual terminological resource. For the latter, we developed a novel technique that benefits from viewing the keyphrase list as contextual help for sense disambiguation. The optimal combination of modules was discovered by a genetic algorithm. Our work applies to the French / English language pair.

Item Type: Conference Paper
Keywords: statistical machine translation, lexical resources, keyphrase list
Subjects: Computer Science > Language
ID Code: 4603
Deposited By: Nadeau, David
Deposited On: 12 Nov 2005
Last Modified: 11 Mar 2011 08:56

