A new software tool developed by JRC's language technology experts can automatically categorise parliamentary documents in 22 official EU languages according to EuroVoc, the EU's multilingual thesaurus. The tool, called JEX ("JRC EuroVoc Indexer"), can make the work of national parliaments' libraries and documentation centres easier and, in turn, facilitate citizens' access to legislation across EU borders.
To be able to retrieve relevant documents efficiently – even if written in a different language – libraries need to categorise their documents using a closed set of subject domain classes, i.e. a controlled vocabulary from a thesaurus. EuroVoc is the standard thesaurus used in most EU institutions and also in many EU Member States. It contains over 6,700 classes covering the activities of the EU, in particular those of the European Parliament. The EuroVoc labels have been translated one-to-one into all EU languages.
Currently, most parliamentary libraries manually assign EuroVoc subject domain labels to their documents, which is a slow and expensive process. JEX can categorise documents automatically or semi-automatically according to the thousands of EuroVoc classes in 22 official EU languages, significantly improving the speed and efficiency of the work while ensuring consistent classification.
JRC's language tools are highly multilingual: they cover a wide range of languages beyond the most commonly used ones and contribute to the European Commission’s general effort to support multilingualism.
The Multilingual Europe Technology Alliance (META) has recognised JRC's Activity on Language Technology with the META Prize 2012 at the META-FORUM 2012 in Brussels on June 20. The META Prize is awarded to outstanding products, services and organisations that actively contribute to the European Multilingual Information Society.