Research Institute for Linguistics of the Hungarian Academy of Sciences (Hungary)
The overall objective of this Action is to provide automatic translation of the body of national legislation in seven countries: Bulgaria, Croatia, Hungary, Poland, Romania, Slovakia and Slovenia. At present, national legislative texts are not automatically available to CEF.AT, and current Machine Translation (MT) systems could be improved if they had access to them.
The Action aims to process two resources available in all of the languages concerned: the multilingual, ontology-based thesaurus EUROVOC and the corpora of national legislation in the respective languages. The following deliverables will be produced:
1) Seven large-scale, suitably pre-processed monolingual corpora of national legislation documents, classified by EUROVOC topics/descriptors and annotated with the EUROVOC and IATE terms identified in them.
2) A comparable corpus in the seven languages, aligned at the level of the topic domains identified by EUROVOC descriptors.
3) A Croatian-English parallel corpus consisting of 1,800 legislative documents.
The Action will also have an impact on both the e-justice and the Online Dispute Resolution Digital Service Infrastructures (DSIs), as the resources focus on national legislation, which is of direct relevance to both.