12 months of LDS Country Workshops

Since April 2024, nine LDS Country workshops have been held across Europe: in Finland, Sweden, Luxembourg, Hungary, Ireland, Slovakia, Czechia, Belgium (jointly with the Netherlands) and Denmark. Their mission has been to disseminate information on the LDS and to build collaboration with, and support from, local stakeholders.

date:  09/04/2025

The events have had an awareness-raising and investigative character. They allowed national and regional participants to exchange views on the state of Speech and Language Technologies, on the work done on LLMs and Generative AI, and on the use and provision of data in their respective countries.

Challenges and opportunities

The participating representatives from industry, academia and public sector agreed that concerns surrounding language data in the context of Generative AI are indeed significant, but they also present opportunities for improvement and innovation. Key issues include:

  • Disinformation and bias: While initial excitement around Generative AI is tempered by concerns over disinformation and biases – especially for less-resourced languages – LDS helps by promoting more inclusive data sharing, ensuring balanced, representative datasets that minimise bias and improve model accuracy across languages.
  • Data quality vs. quantity: Industrial stakeholders highlight the gap between big data and high-quality, domain-relevant language data, which is, again, particularly critical for less-resourced languages. LDS fosters collaboration to source tailored, relevant data, especially for underrepresented languages, improving model effectiveness.
  • Legal and IP challenges: Legal aspects, particularly data protection laws as well as intellectual property concerns, have made data sourcing a complex process. LDS addresses this by promoting secure, transparent data sharing in compliance with current regulations, safeguarding privacy and intellectual property rights.

For private companies too, some of the elements mentioned above are often obstacles to data sharing, in addition to the costs associated with data management, possibly restrictive third-party agreements and a natural hesitation to share critical business data. However, LDS provides a secure framework that enables controlled data sharing without compromising the confidentiality of corporate data. By promoting trust and transparency, LDS creates an environment that favours cooperation while taking into account the competitive nature of companies.

Synergies and collaborations

LDS Country workshops have also revealed a variety of frameworks in the language data industry, with national contexts that sometimes allow LT developers to obtain their data through national collaborations with, for instance, national libraries or broadcasting media institutions. Such partnerships are established under conditions of common interest and trust, and fostering such alliances at a European level is part of the LDS mission.

In this context, the LDS is currently collecting requirements from its potential users while establishing synergies with such institutions to encourage the broader sharing of their language data under legally compliant and transparent conditions of trust.

The importance of data spaces has also come to light throughout the workshops. Awareness has increased in recent events, which emphasised the key role of data spaces in ensuring the availability of data for the different communities and in guaranteeing that data holders and producers remain in control of their data. LDS plays a crucial role with respect to both data users and providers and is highly active within the data space ecosystem.

Takeaways and insights

The past few months have confirmed that knowledge about the LDS is increasing: the initiative's work and its first prototype are better known at this stage, and stakeholders are confident that the LDS will offer a trustworthy and legally compliant marketplace for data users and providers.

In addition, these workshops have witnessed lively discussions around a wide range of topics, including the use of synthetic data; the importance of preserving European cultural identity through data; the need for awareness-raising work to convince and support companies in sharing their data; and the need for clear legal support and best practices to help share data in a compliant manner, also considering anonymisation and privacy-preserving approaches. All of these will certainly be discussed further in the series of upcoming country workshops.

A big thank you to all country workshop organisers for the great work they have done!