Expanding horizons in digital humanities research
Access to digitised texts, recordings and other documents from repositories across Europe offers a major resource for humanities research. An EU-funded project has contributed to developing a pan-European digital infrastructure that provides wider access to such information, boosting research in the field.
Language is the prime source material for research in the humanities and related fields, including linguistics, the social sciences, history and psychology. Words and their organisation are the major carriers of human knowledge, emotion and culture. Language resources can be written or spoken and are stored across Europe in, for example, libraries, universities, and public and private record collections and archives.
Many European countries have centres of expertise for language resources and have made great efforts to digitise these and make them more widely accessible online. However, these resources are dispersed and often use incompatible systems for cataloguing, storing, accessing and searching the information available.
This lack of common standards is a major barrier to pan-European humanities research based on written and oral records. In response, the EU-funded CLARIN network was established in 2012 to make digital language resources and tools from all over Europe and beyond accessible to researchers in the humanities and social sciences.
The EU-funded CLARIN-PLUS project has opened this research infrastructure to the wider world. Earlier work built a pan-European federation of language resource centres and set out common digital standards. This work was primarily internal and preparatory.
CLARIN-PLUS centred on the externalisation of this CLARIN infrastructure. It has extended the infrastructure partnership to 40 centres across the Member States. A new state-of-the-art online central platform offers much improved accessibility for researchers and more visibility to the wider world.
The project has improved the structure and content of the CLARIN portal, and has implemented the latest web technologies and shared them with the consortium members.
'CLARIN-PLUS sets out a value proposition for researchers in the humanities,' says project coordinator Franciska de Jong of the Utrecht Institute of Linguistics. 'By reaching out to other research communities, organisations and countries, explaining what they need to do to join us, we are growing the CLARIN network.'
De Jong is executive director of CLARIN ERIC, the governing body of the CLARIN infrastructure.
Building an open infrastructure
The project has also reached out to build partnerships. For example, work was conducted with sister infrastructure DARIAH on content and tools for digital humanities research, and with the Europeana initiative to make its digital cultural collections visible through the CLARIN platform.
There are also exchanges with EUDAT on the design and implementation of a common infrastructure for research data, and with the EU-funded EOSC project, which is building a digital infrastructure for scientific research at large.
The project also provided support for the digital curation of language resources, which includes selection, preservation, maintenance, collection and archiving. To provide open access to these resources, the project contributed to establishing common definitions and standards between digital cultural collections. This work included setting common standards for metadata, which describe the information held in a specific database, much as a library catalogue does.
The language-based research queries made of digital humanities databases can be very diverse. Examples include: 'How was American consumer culture depicted in Europe throughout the 20th century?'; 'Who was the real author of the Dutch national anthem?'; 'What changing concepts are associated with war in newspapers?'; and 'What do historical documents tell us about the relation between gender and work?'
Language resources can help answer these questions, and widely accessible and available digital resources can answer more of them, faster.
Social media is a new resource for the humanities, where information is already digitised and growing exponentially. 'Researchers are interested in social media as a hot topic as it reflects what is happening in society today,' explains De Jong.
'Social media data sets are a rich source for investigation,' she says. For example, tracking changes in language use might help detect changes in attitude towards topics such as religion, the EU or colonial history. Interview data can reveal the onset of dementia, the perspective of minorities on historical events, or the influence of gender on conversational style, and is therefore also a rich source for researchers from multiple domains.