Innovation and Networks Executive Agency


Multilingual Resources for CEF.AT in the legal domain
CEF Telecom
Call year:
Location of the Action:
Implementation schedule: October 2018 to March 2021
Maximum EU contribution: 
Total eligible costs: 
Percentage of EU support: 

Research Institute for Linguistics (Hungary)

Additional information: 
Last modified: September 2020


The overall objective of this Action is to provide automatic translation of the body of national legislation in seven countries: Bulgaria, Croatia, Hungary, Poland, Romania, Slovakia and Slovenia. At present, national legislation texts are not automatically available to CEF.AT, and current Machine Translation (MT) systems could be improved if they had access to these texts.

The Action aims to process two resources available in all the languages concerned: the multilingual ontology-based thesaurus EUROVOC, and the corpora of national legislation in the respective languages. The following deliverables will be produced:

  1. Seven large-scale, suitably pre-processed monolingual corpora of national legislation documents, classified by EUROVOC topics/descriptors and annotated with the EUROVOC and IATE terms identified in them.
  2. A comparable corpus of the seven languages, aligned at the level of the topic domains identified by EUROVOC descriptors.
  3. A Croatian-English parallel corpus consisting of 1,800 legislative documents.
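
As an illustration of what topic-level alignment in the second deliverable could look like, the sketch below groups documents from different languages under the descriptors they share. It is a minimal sketch under stated assumptions, not the Action's actual pipeline: the record layout, the function names and the descriptor ids (D1 to D4) are hypothetical placeholders, not real EUROVOC identifiers.

```python
from collections import defaultdict

# Hypothetical records: (language code, document id, descriptor ids).
# "D1".."D4" are made-up placeholders, not real EUROVOC identifiers.
documents = [
    ("hu", "hu-001", {"D1", "D2"}),
    ("hr", "hr-001", {"D1"}),
    ("pl", "pl-001", {"D2", "D3"}),
    ("hr", "hr-002", {"D3"}),
    ("sk", "sk-001", {"D4"}),
]


def align_by_descriptor(docs):
    """Group document ids by descriptor, then by language."""
    aligned = defaultdict(lambda: defaultdict(list))
    for lang, doc_id, descriptors in docs:
        for descriptor in descriptors:
            aligned[descriptor][lang].append(doc_id)
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {d: dict(by_lang) for d, by_lang in aligned.items()}


def comparable_topics(aligned, min_languages=2):
    """Keep only descriptors covered by at least `min_languages` languages."""
    return {
        d: by_lang
        for d, by_lang in aligned.items()
        if len(by_lang) >= min_languages
    }


aligned = align_by_descriptor(documents)
comparable = comparable_topics(aligned)
```

In this toy data, descriptor D4 occurs in only one language, so it is dropped from `comparable`, while D1, D2 and D3 each link documents from two languages. Unlike a parallel corpus, nothing here pairs individual sentences or documents as translations of one another; documents are only grouped by shared topic.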

The Action will also have an impact on both the e-Justice and the Online Dispute Resolution Digital Service Infrastructures (DSIs), as the resources focus on national legislation, which is of direct relevance to both DSIs.