
Customizing Machine Translation

Agenda Workshop #3

25 April 2019, Hilton Brussels Grand Place, Carrefour de l'Europe, Brussels

Presentations are now available!

10:00 - 10:10 Welcome (Luc Meertens)
10:10 - 10:35 Objectives of DG Connect (Philippe Gelin, European Commission)
10:35 - 11:00 Update of Status eTranslation (Andreas Eisele, European Commission)
11:00 - 11:15 Coffee Break
11:15 - 11:40 Sharing public services data to obtain better MT quality (FOD Kanselarij & Joachim van den Bogaert)
11:40 - 12:05 MT Training Workflow (Sara Szoc, CrossLang)
12:05 - 12:30 Human and automatic evaluation of machine translation output (Lieve Macken, LT3)
12:30 - 13:30 Lunch Break
13:30 - 13:55 Terminology within MT (Andrejs Vasiljevs, Tilde)
13:55 - 14:20 Publicly available corpora (Khalid Choukri, ELDA)
14:20 - 14:45 Tools for Data Gathering - DIY (Tom Vanallemeersch, CrossLang)
14:45 - 15:00 Coffee Break
15:00 - 16:00 Round Table Discussion "How to proceed from here?" with participation of DSIs, moderated by Luc Meertens
16:00 End of the workshop

How can you participate?

Participation is free of charge, but registration is required. Click here to register now!

Any questions?

Feel free to contact us!

Contact

Have a look at the post-event article on the ELRC Website

Read the article

Further Details on the Presentations

Luc Meertens, Welcome

Alexandru Ceausu, Objectives of DG Connect

The overall objective is to contribute to the development of CEF Automated Translation as a “multilingualism enabler” for CEF DSIs, online services linked to CEF DSIs and other relevant public online services. The specific objectives are:
(i) to gather information on additional needs of the CEF DSIs and public services;
(ii) to analyze the range of services which could extend CEF Automated Translation; and
(iii) to support CEF DSIs and related systems with a view to maximizing their use of CEF Automated Translation services.

Andreas Eisele, Update of status eTranslation

FOD Kanselarij + Joachim van den Bogaert, Sharing public services data to obtain better MT quality

In its day-to-day activities, the translation department of the Belgian Chancellery of the Prime Minister covers a wide range of topics. For some of these topics, specialized MT systems would be helpful (for example, documents related to legal or policy matters), while for other topics, a broad-domain engine would be more suitable (for example, press releases). In this talk, we explore the differences between broad-domain and domain-specific translation from both the user and provider perspective. We discuss how public services can benefit from tailored MT solutions, and how they can contribute to the development of better MT systems at a European level.

Sara Szoc, MT Training Workflow 

This presentation deals with the typical workflow for training an MT system, i.e. incremental training. The MT system is retrained regularly as more parallel data become available (such as a growing translation memory). Each previous version of the MT system acts as a baseline that the retrained version should improve upon in terms of translation quality. The concept of a baseline also applies to domain-specific MT: the domain-specific system should improve upon the general-purpose MT system.
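
The baseline idea described above can be sketched in a few lines of Python. This is only an illustration, not the workflow shown in the talk: the function name `incremental_training`, the `min_gain` parameter and the version names and scores in the usage note are all hypothetical, and the score is assumed to be any quality measure where higher is better.

```python
def incremental_training(rounds, min_gain=0.0):
    """Track which system version serves as the baseline after each retraining.

    rounds   -- list of (version_name, quality_score) in training order;
                the first entry is the initial baseline (hypothetical input).
    min_gain -- minimum improvement a retrained system must show
                before it replaces the current baseline (an assumption).
    """
    baseline_name, baseline_score = rounds[0]
    history = [baseline_name]
    for name, score in rounds[1:]:
        # A retrained system only becomes the new baseline if it
        # improves on the current one by more than min_gain.
        if score > baseline_score + min_gain:
            baseline_name, baseline_score = name, score
        history.append(baseline_name)
    return baseline_name, history
```

For example, with made-up scores `[("v1", 0.30), ("v2", 0.34), ("v3", 0.33), ("v4", 0.36)]`, version v3 does not beat v2 and is therefore not adopted, while v4 becomes the new baseline. The same comparison applies when a domain-specific system is measured against a general-purpose one.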

Lieve Macken, Human and automatic evaluation of machine translation output

This talk compares human and automatic evaluation methods. Automatic evaluation metrics are typically used during the development of machine translation systems, for example to quickly compare successive versions of a single system with each other. Human evaluation of MT output is highly informative, but it is expensive in terms of time and expert human effort and may suffer from a lack of consistency.
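
To make the idea of an automatic metric concrete, here is a minimal sketch of clipped n-gram precision, one of the building blocks of BLEU-style metrics. It is not the full BLEU score and not necessarily a metric covered in the talk; the function names and the example sentences are illustrative assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hypothesis, reference, n=1):
    """Share of hypothesis n-grams that also occur in the reference.

    Each n-gram is credited at most as often as it occurs in the
    reference ("clipping"), so repeating a correct word does not help.
    """
    hyp = Counter(ngrams(hypothesis.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(hyp.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return matched / total


# Illustrative (made-up) outputs of two successive system versions.
reference = "the committee approved the proposal"
system_v1 = "the committee has approve proposal"
system_v2 = "the committee approved the proposal today"

score_v1 = clipped_precision(system_v1, reference)
score_v2 = clipped_precision(system_v2, reference)
print(f"v1: {score_v1:.2f}  v2: {score_v2:.2f}")
```

A score like this is cheap to recompute for every new system version, which is why such metrics suit development, but it only measures word overlap with one reference translation and says nothing about fluency or meaning.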

Andrejs Vasiljevs, Terminology within MT

Khalid Choukri, Publicly available corpora

Tom Vanallemeersch, Tools for Data Gathering - DIY

While parallel corpora are available to a limited extent, much more parallel information can be found in document archives or online resources, for instance inside multilingual web sites. This presentation discusses tools and procedures that allow for automatically detecting equivalent documents or web pages and for linking equivalent sentences within these pairs of documents or pages. Such equivalent sentences allow, for instance, for the creation of a domain-specific parallel corpus.
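
As a rough illustration of linking equivalent sentences, the sketch below aligns two sentence lists in order, using only sentence length as evidence. It is not one of the tools discussed in the presentation: the function `align_sentences`, the cost formula and the `skip_cost` value are assumptions, and real aligners use richer signals such as dictionaries or translation scores.

```python
def align_sentences(source, target, skip_cost=1.0):
    """Link sentences of two documents in order, based on length only.

    Returns a list of (source_index, target_index) pairs. Sentences
    without a good counterpart are skipped at a price of skip_cost.
    """
    def link_cost(s, t):
        # 0.0 for equal lengths, close to 1.0 for very different lengths.
        return abs(len(s) - len(t)) / max(len(s), len(t), 1)

    n, m = len(source), len(target)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    # Dynamic programming over prefixes of both documents.
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:  # link source[i-1] with target[j-1]
                c = cost[i - 1][j - 1] + link_cost(source[i - 1], target[j - 1])
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, (i - 1, j - 1)
            if i > 0:  # leave source[i-1] unlinked
                c = cost[i - 1][j] + skip_cost
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, (i - 1, j)
            if j > 0:  # leave target[j-1] unlinked
                c = cost[i][j - 1] + skip_cost
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, (i, j - 1)

    # Trace the cheapest path back and collect the linked pairs.
    links = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        if pi == i - 1 and pj == j - 1:
            links.append((i - 1, j - 1))
        i, j = pi, pj
    links.reverse()
    return links
```

The linked pairs returned by such a procedure are the raw material for a parallel corpus; in practice they would still be filtered, since length alone cannot tell whether two sentences are actual translations of each other.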

Luc Meertens, Round Table Discussion “How to proceed from here?”
