Standardisation in the area of innovation and technological development, notably in the field of Text and Data Mining

  • Martine Grosjean
    29 April 2015 - updated 4 years ago
European Commission

Text and data mining (TDM) is an important technique for analysing and extracting new insights and knowledge from the exponentially increasing store of digital data (‘Big Data’). It is important to understand the extent to which the EU’s current legal framework encourages or obstructs this new form of research and to assess the scale of the economic issues at stake.
TDM is useful to researchers of all kinds, from historians to medical experts, and its methods are relevant to organisations throughout the public and private sectors. Because TDM research technology is not prohibitively expensive, it is readily available to lone entrepreneurs, individual post-graduate students, start-ups and small firms. It is also amenable to playful and highly speculative uses, enabling research connections between previously unconnected fields. There is growing recognition that we are at the threshold of a mass automation of service industries (automation of thinking), comparable with the robotic automation of manufacturing production lines (automation of muscle) in an earlier era. TDM will be widely used to provide insights in the redesign of this digital services economy.
When it comes to the deployment of TDM, there are worrying signs that European researchers may be falling behind their counterparts elsewhere, especially those in the United States. Researchers in Europe believe that this results, at least in part, from the nature of Europe’s laws on copyright, database protection and, perhaps increasingly, data privacy. In the United States, the ‘fair use’ defence against copyright infringement appears to offer greater reassurance to researchers than the comparable copyright framework in Europe, which relies upon a closed set of statutory exceptions. Recent court decisions, for example in the ten-year-old ‘Google Books’ case, appear to confirm this. The US has no equivalent of Europe’s database protection laws.
In Europe, there are signs of a response among publishers to encourage wider use of TDM. Scientific publishers have recently proposed licensing terms designed to make TDM of their own archives easier, but many researchers dismiss these efforts as insufficient, arguing that ‘the right to read is the right to mine’ and that effective research demands freedom to mine all public domain databases without restriction. These pressures from researchers have increased as a result of a growing move to ‘Open Access’ scientific publishing in Europe and elsewhere. The UK and Ireland have already committed themselves to more permissive copyright rules with regard to TDM.