Cogprints

Statistical Phrase-based Post-editing

Simard, Michel and Goutte, Cyril and Isabelle, Pierre (2007) Statistical Phrase-based Post-editing. [Conference Paper]

Abstract

We propose to use a statistical phrase-based machine translation system in a post-editing task: the system takes as input raw machine translation output (from a commercial rule-based MT system) and produces post-edited target-language text. We report on experiments performed on data collected in precisely such a setting: pairs of raw MT output and their manually post-edited versions. In our evaluation, the output of our automatic post-editing (APE) system is not only of better quality than the rule-based MT output (in terms of both the BLEU and TER metrics), but also better than the output of a state-of-the-art phrase-based MT system used in standalone translation mode. These results indicate that automatic post-editing constitutes a simple and efficient way of combining rule-based and statistical MT technologies.
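
To make the evaluation setting concrete, the following is a minimal sketch of a simplified edit-rate computation in Python. It is not the paper's evaluation code and not the full TER metric: TER also counts block shifts, which this sketch omits, so the score reduces to word-level edit distance divided by reference length. The function names and the three example sentences are hypothetical, chosen only to illustrate comparing raw MT output and APE output against a manually post-edited reference.

```python
def word_edit_distance(hyp: str, ref: str) -> int:
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    h, r = hyp.split(), ref.split()
    # prev[j] holds the distance between the hypothesis prefix seen so far
    # and the first j reference words.
    prev = list(range(len(r) + 1))
    for i, hw in enumerate(h, start=1):
        curr = [i]
        for j, rw in enumerate(r, start=1):
            cost = 0 if hw == rw else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def edit_rate(hyp: str, ref: str) -> float:
    """Simplified edit rate: edits / reference length (no shifts, unlike TER)."""
    ref_len = len(ref.split())
    if ref_len == 0:
        raise ValueError("reference must not be empty")
    return word_edit_distance(hyp, ref) / ref_len


# Hypothetical example data (not taken from the paper).
raw_mt = "the contract is ended at date of 31 March"
ape_out = "the contract ends on 31 March"
post_edited = "the contract ends on 31 March"

print(f"raw MT edit rate: {edit_rate(raw_mt, post_edited):.2f}")   # 0.83
print(f"APE edit rate:    {edit_rate(ape_out, post_edited):.2f}")  # 0.00
```

A lower score means fewer edits are needed to reach the post-edited reference, which is the sense in which an APE system can be said to improve on raw MT output under an edit-rate metric.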

Item Type: Conference Paper
Keywords: Machine Translation, Post-editing, Statistical MT, Phrase-based MT
Subjects: Linguistics > Computational Linguistics; Computer Science > Machine Learning; Computer Science > Artificial Intelligence
ID Code: 5627
Deposited By: Goutte, Dr. Cyril
Deposited On: 28 Jul 2007
Last Modified: 11 Mar 2011 08:56
