Data collection and curation


About the service

The European Commission has launched a comprehensive European Language Resource Coordination (ELRC) effort to identify and gather language and translation data relevant to national public services, administrations and governmental institutions across all 30 European countries participating in the CEF programme. These resources are needed to improve the quality and coverage of the machine translation engines in eTranslation. All data gathered through the ELRC initiative will therefore be used to develop a high-quality machine translation service.

Ready to get started?


Learn more about how to contribute language resources, or visit ELRC-SHARE directly to browse and access language resources.


Users

  • Data providers: providers of language and translation data relevant to national public services, administrations and governmental institutions across all 30 European countries participating in the CEF programme. The data may take the form of large general-domain corpora, whether monolingual or multilingual parallel corpora, as well as domain-specific corpora and terminology resources (e.g. lexica and dictionaries) in fields such as consumer rights, culture, law, social security, health and public procurement.

Benefits

  • Increased language coverage: collection of resources to cover the languages of the 30 European countries participating in the CEF programme, with additional coverage for languages that currently have fewer resources.
  • Increased domain coverage: collection of domain-specific corpora and terminology resources (e.g. lexica and dictionaries) in fields such as consumer rights, culture, law, social security, health and public procurement, allowing the training of domain-specific engines for the eTranslation service.
  • The technical processing services guarantee that the provided language resources will lead to higher-quality automated translation systems.




