ML4HMT-12 Workshop Update: Thanks to all presenters, participants and the local organisers for an interesting workshop!

Workshop Purpose and Theme

The workshop and associated shared task are an effort to trigger a systematic investigation into improving state-of-the-art hybrid machine translation by making use of advanced machine-learning (ML) methodologies. It follows the ML4HMT-11 workshop, which took place in November 2011 in Barcelona. That first workshop also road-tested a shared task (and associated data set) and laid the basis for a broader reach in 2012.

Organisational Details

Please note: our workshop is scheduled to start at 9am as we have to fit a tight schedule. The workshop will take place at the Victor Menezes Convention Centre, Seminar Hall 3, 1st floor.

Programme

You can download the workshop programme (PDF format, 90 KB) here.

The full proceedings (PDF format, 1.4 MB) and the corresponding reference (BibTeX format) are available as well.

9:00 Josef van Genabith
Welcome and introductory remarks
Slides (PDF format, 439 KB)
9:15 Vassilina Nikoulina, Agnes Sandor, Marc Dymetman
Hybrid Adaptation of Named Entity Recognition for Statistical Machine Translation
Paper (PDF format, 193 KB) — Reference (BibTeX format) — Slides (PDF format, 562 KB)
9:40 Maoxi Li, Mingwen Wang, presented by Feifei Zhai
Confusion Network Based System Combination for Chinese Translation Output: Word-Level or Character-Level?
Paper (PDF format, 359 KB) — Reference (BibTeX format) — Slides (PDF format, 1.1 MB)
10:05 Kartik Asooja, Jorge Gracia, Nitish Aggarwal, Asunción Gómez-Pérez, presented by Mihael Arcan
Using Cross-Lingual Explicit Semantic Analysis for Improving Ontology Translation
Paper (PDF format, 139 KB) — Reference (BibTeX format) — Slides (PDF format, 945 KB)
10:30 Xiaofeng Wu, Tsuyoshi Okita, Josef van Genabith, Qun Liu
System Combination with Extra Alignment Information
Paper (PDF format, 149 KB) — Reference (BibTeX format) — Slides (PDF format, 372 KB)
10:50 Tsuyoshi Okita, Antonio Toral, Josef van Genabith
Topic Modeling-based Domain Adaptation for System Combination
Paper (PDF format, 127 KB) — Reference (BibTeX format) — Slides (PDF format, 920 KB)
11:10 Tsuyoshi Okita, Raphaël Rubino, Josef van Genabith, presented by John Judge
Sentence-Level Quality Estimation for MT System Combination
Paper (PDF format, 133 KB) — Reference (BibTeX format) — Slides (PDF format, 2.4 MB)
11:30 Tea break
11:45 Tsuyoshi Okita
Neural Probabilistic Language Model for System Combination
Paper (PDF format, 143 KB) — Reference (BibTeX format) — Slides (PDF format, 419 KB)
12:05 Christian Federmann
System Combination Using Joint, Binarised Feature Vectors
Paper (PDF format, 428 KB) — Reference (BibTeX format) — Slides (PDF format, 800 KB)
12:25 Christian Federmann, Tsuyoshi Okita, Maite Melero, Marta Ruiz Costa-Jussà, Toni Badia, Josef van Genabith
Results of the ML4HMT-12 Shared Task
Paper (PDF format, 111 KB) — Reference (BibTeX format) — Slides (PDF format, 434 KB)
12:30 Discussion Panel

Panelists: Jan Hajič, Qun Liu, Hans Uszkoreit, Josef van Genabith

Topics include:

  1. The Future of Hybrid MT: is there a single-paradigm winner?
  2. Will we see increasing usage of additional, potentially highly sparse, features?
  3. Will research efforts in Machine Translation and Machine Learning converge?
  4. How do we evaluate progress in terms of translation quality for Hybrid MT?
  5. What are the baselines? Can Human Judgment be integrated?
Slides (PDF format, 333 KB)
12:50 Jan Hajič · Institute of Formal and Applied Linguistics · Charles University in Prague
Invited talk: Deep Linguistic Information in Hybrid Machine Translation
Slides (PowerPoint format, 3.4 MB)
13:30 Lunch

Regular Papers ML4HMT-12

We are soliciting original papers on hybrid MT, including (but not limited to):

  • use of machine learning methods in hybrid MT;
  • system combination: parallel in multi-engine MT (MEMT) or sequential in statistical post-editing (SPMT);
  • combining phrases and translation units from different types of MT;
  • syntactic pre-/re-ordering;
  • using richer linguistic information in phrase-based or in hierarchical SMT;
  • learning resources (e.g., transfer rules, transduction grammars) for probabilistic rule-based MT.

Important Dates 2012

  • August 15th Shared Task training data release
  • August 23rd Shared Task test data release
  • October 22nd Workshop research paper submission
  • October 28th Shared Task translation results submission deadline
  • November 7th Workshop research paper accept/reject notification
  • November 12th Shared Task automatic evaluation results release
  • November 12th Shared Task system description paper submission
  • November 13th Workshop and Shared Task camera-ready papers due
  • December 9th COLING 2012 Pre-conference workshop, 9am

Shared Task ML4HMT-12

Participants are invited to build hybrid machine translation systems and/or system combinations by using the output of several MT systems of different types, as provided by the organisers (ML4HMT corpus).

Acknowledgments

The ML4HMT workshop is supported by META-NET.


ML4HMT-11 Workshop

The precursor workshop ML4HMT-11 was held on November 19th, 2011 in Barcelona, Spain. The original call for papers is available here; the workshop programme, including papers and presentations, can be found here.

About META Work Package 2

An overview of T4ME WP2 »Optimising the Division of Labour in Hybrid MT«, funded by DG INFSO through the Seventh Framework Programme (grant agreement no. 249119), is available here.