EU Science Hub

JRC-Names

What is JRC-Names?

JRC-Names is a highly multilingual named entity resource covering persons and organisations (referred to as 'entities'). It consists of large lists of names and their many spelling variants (up to hundreds for a single person), including variants across scripts (Latin, Greek, Arabic, Cyrillic, Japanese, Chinese, etc.). Since March 2016, JRC-Names has also been available as linked data, which adds information such as frequencies per language, titles found with the entities, and date ranges.

What can JRC-Names be used for?

JRC-Names is a technical resource that can be used to find names even when they are spelled differently. It is also a useful building block for IT systems that process text, e.g. for text mining.
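
As a rough illustration of the variant lookup described above, the following Python sketch maps each known spelling to an entity identifier and then scans a text for any of those spellings. It rests on several assumptions: the record layout (an entity id paired with one spelling), the sample names, and the helper names `normalise`, `build_index` and `find_entities` are all hypothetical and are not taken from the JRC-Names files or tools. The naive substring search stands in for whatever matching a real system would use.

```python
import unicodedata

# Hypothetical sample records: (entity_id, spelling variant).
# Illustrative only; this is NOT the actual JRC-Names file layout or data.
SAMPLE_VARIANTS = [
    (1, "Angela Merkel"),
    (1, "Ангела Меркель"),
    (1, "Angela Merkelová"),
    (2, "Kofi Annan"),
    (2, "Кофи Аннан"),
]


def normalise(name: str) -> str:
    """Apply Unicode NFKC, case-fold, and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFKC", name).casefold().split())


def build_index(records):
    """Map every normalised spelling variant to its entity id."""
    return {normalise(variant): entity_id for entity_id, variant in records}


def find_entities(text, index):
    """Return the ids of all entities with a known variant occurring in text.

    Uses plain substring matching, which is a simplification: it ignores
    word boundaries and would be slow for a very large variant list.
    """
    haystack = normalise(text)
    return {eid for variant, eid in index.items() if variant in haystack}


index = build_index(SAMPLE_VARIANTS)
found = find_entities("Yesterday ANGELA  MERKEL met Кофи Аннан.", index)
```

Because every variant of an entity points to the same identifier, a mention in Latin script and a mention in Cyrillic script resolve to the same entity, which is the property that makes such a resource useful for multilingual text mining.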

How was JRC-Names produced?

JRC-Names is a by-product of the analysis of about 220,000 news reports per day by the Europe Media Monitor (EMM) family of applications.

Statistics on JRC-Names

JRC-Names contains the most important names of the EMM name database, i.e. names that were found frequently, verified manually, or found on Wikipedia.

Related information

A description of JRC-Names (version 1) is given in the first publication below. Information on the Linked Data version of JRC-Names can be found in the second paper. Please cite these publications when you refer to JRC-Names:

Usage conditions

By downloading and/or using JRC-Names, you agree to the usage conditions formulated in the licence, which is available at http://optima.jrc.it/Resources/LICENCE-EULA_JRC-Names_2011.pdf.

Privacy statement

JRC-Names is subject to a privacy statement.

Download JRC-Names

Depending on your needs, you may want to download part or all of the following components: