Horizon 2020
The EU Framework Programme for Research and Innovation

Launch of the READ project for the automated recognition and enrichment of documents

The overall objective of the EU-funded READ project is to implement a Virtual Research Environment where archivists, humanities scholars, computer scientists and volunteers collaborate, with the ultimate goal of boosting research, innovation, development and usage of cutting-edge technology for the automated recognition, transcription, indexing and enrichment of handwritten archival documents.

This ICT-based e-infrastructure will address the Societal Challenges mentioned in "Europe in a Changing World", namely the 'transmission of European cultural heritage' and the 'uses of the past', as one of the core requirements of a reflective society.

Specific measures of the READ project will be:

  • Run an Open Platform where users are able to upload documents and to use the technology (software as a service), to train Handwritten Text Recognition and similar technologies for their own purposes, but also to benefit from synergies (e.g. already trained models from other users). A prototype is already available;
  • Carry out collaborative research in the domains of Digital Humanities, Pattern Recognition, Layout Analysis, Natural Language Processing, etc.;
  • Develop several innovative applications, such as e-learning components or applications for mobile devices;
  • Organise research competitions to boost research in the relevant domains based on large datasets covering "real world data";
  • Launch specific actions to make the technology available to a larger audience (Citizen Science);
  • Involve archives, libraries, etc. with their handwritten/archival collections.

Based on research and innovation enabled by the READ Virtual Research Environment, we will be able to explore and access hundreds of kilometres of archival documents via full-text search, and thereby open up one of the last hidden treasures of Europe's rich cultural heritage.

READ is managed by a multidisciplinary consortium of 13 partners working in Computer Science, Pattern Recognition, Machine Learning, Image Processing and Humanities. The project builds on the results from the FP7 Project TranScriptorium.

READ will run for 3.5 years, with a total budget exceeding 8 million euros.

Project information

Name: Recognition and Enrichment of Archival Documents
Acronym: READ