How does it work?


CEF eTranslation builds on the European Commission’s earlier machine translation service, MT@EC, which was developed by the Directorate-General for Translation (DGT) under the Interoperability Solutions for European Public Administrations (ISA) programme. MT@EC was based on the MOSES open-source translation toolkit, a Statistical Machine Translation (SMT) system developed with co-funding from EU research and innovation programmes, whereas eTranslation follows the field’s move to neural machine translation.

Both systems are trained using the vast Euramis translation memories, comprising over 1 billion sentences in the 24 official EU languages produced by the translators of the EU institutions over the past decades. As such, they are particularly suited for the needs of EU policy documents.

CEF eTranslation carries on the MT@EC functionalities and scales up the existing infrastructure, improving quality and performance and increasing the pool of available language resources.

Features of the eTranslation service

The eTranslation service provides the ability to translate formatted documents and plain text between any pair of EU official languages, as well as Norwegian (Bokmål) and Icelandic, while preserving to the greatest extent possible the structure and format of those documents.

  • Available in the 24 official EU languages, Icelandic and Norwegian (Bokmål)
  • Translates multiple documents to multiple languages in one go
  • Accepts several input formats: .txt, .doc, .docx, .odt, .ott, .rtf, .xls, .xlsx, .ods, .ots, .ppt, .pptx, .odp, .otp, .odg, .otg, .htm, .html, .xhtml, .h, .xml, .xlf, .xliff, .sdlxliff, .rdf, .tmx and .pdf
  • Source document format and formatting are maintained (except for PDF)
  • Specific output formats for computer-aided translation: TMX and XLIFF
  • Automatic language detection is supported for text snippets of at least 30 characters
  • Translations can be returned by email or stored on the interface
  • Monitors progress of translation requests
  • Feedback mechanism
  • The interface will support the selection of specific domains (subject areas) of translation
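
The input constraints in the list above can be sketched as a small client-side pre-check. This is only an illustrative sketch, not part of the eTranslation service or its API: the function names are hypothetical, and it assumes that the 30-character threshold counts every character of the snippet, including whitespace.

```python
from pathlib import Path

# Accepted input extensions, copied from the feature list above.
ACCEPTED_EXTENSIONS = frozenset({
    ".txt", ".doc", ".docx", ".odt", ".ott", ".rtf", ".xls", ".xlsx",
    ".ods", ".ots", ".ppt", ".pptx", ".odp", ".otp", ".odg", ".otg",
    ".htm", ".html", ".xhtml", ".h", ".xml", ".xlf", ".xliff",
    ".sdlxliff", ".rdf", ".tmx", ".pdf",
})

# Minimum snippet length for automatic language detection (from the list above).
MIN_DETECTION_CHARS = 30


def is_accepted_format(filename: str) -> bool:
    """Return True if the file extension is in the accepted-formats list."""
    return Path(filename).suffix.lower() in ACCEPTED_EXTENSIONS


def formatting_preserved(filename: str) -> bool:
    """Return True if source formatting is expected to be kept.

    Per the feature list, this holds for every accepted format except PDF.
    """
    return is_accepted_format(filename) and Path(filename).suffix.lower() != ".pdf"


def can_auto_detect_language(snippet: str) -> bool:
    """Return True if the snippet is long enough for language detection.

    Assumption: all characters count towards the threshold, whitespace included.
    """
    return len(snippet) >= MIN_DETECTION_CHARS
```

A caller might run `is_accepted_format("report.docx")` before uploading a file, and ask the user to choose the source language explicitly whenever `can_auto_detect_language` returns False.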

Two types of use

eTranslation can be used in two distinct ways:

1. One-off translations 

It provides a web user interface for direct use by individuals (human-to-machine use).

2. Integrated machine translation functionality

It provides machine translation capabilities to digital services through an application programming interface, or API (machine-to-machine use).

Language resource collection

To improve the quality and coverage of the service, CEF eTranslation requires a much broader range of language resources and translation data.

The European Commission has therefore launched a comprehensive European Language Resource Coordination (ELRC) effort to identify and gather language and translation data relevant to national public services, administrations and governmental institutions across all 30 European countries participating in the CEF programme.