Knowledge for policy

Data: a cornerstone for AI – Toward a Common European Data Space

Good quality shared data is essential to develop socially responsive AI

For an application of artificial intelligence (AI) to be ready for market entry, it has to learn from training data. Once in use on the market, it should generate a sufficient amount of data as part of its use.

Training data need to exist before market entry. They may be available within the company developing the application, but this is not always the case. Where they are not, access to relevant training data may pose a barrier to market entry. At the same time, generating high-quality annotated training data requires an investment that needs to be recovered. Ideally, access to relevant training data held by another company is granted on market terms, on the basis of private contracts.

Particularly relevant is the case of several operators in the same industry that could jointly develop an AI application beneficial to all of them, but only after assembling the required amount of training data by pooling their data assets. 

Additionally, an AI application may need complementary data not generated as part of its usage on a continuous basis, for example environmental information or mapping information. 

With its actions, the EC seeks to support wider availability of relevant data. One line of action is to improve the supply of data held by the public sector and of data generated in the course of publicly funded research. In addition, it seeks to encourage companies to disclose what relevant data they hold, as a precondition for other companies to enter into contractual negotiations on their use. It also seeks to ensure fair competition with respect to data holdings.