EU Science Hub

Semantic Text Analysis tool: SeTA

The number and length of documents keep growing, legislation covers more topics in greater depth, and new phrases appear while the meaning of existing ones slowly shifts. All of these factors make policy analysis increasingly complex. As a consequence, human policy analysts and policy developers face growing entanglement at both the content and the semantic level. To overcome several of these issues, the JRC has developed a central pilot tool called AI-KEAPA to support policy analysis and development in any domain.

Recent developments in big data, machine learning and especially natural language processing make it possible to convert the otherwise unmanageable complexity of many hundreds of thousands of documents into a normalised high-dimensional vector space that preserves the knowledge they contain. Unstructured text in document corpora and big data sources, until recently considered just an archive, is quickly becoming a core source of analytical information, with text mining methods used to extract qualitative and quantitative data. Semantic analysis allows us to extract better information for policy analysis from metadata titles and abstracts than from the structured, human-entered descriptions.

This digital assistant supports document search and extraction across many different sources, as well as discovery of the meaning of phrases, their context and their development over time. It can recommend the most relevant documents, including their semantic and temporal interdependencies. Most importantly, it helps to burst knowledge bubbles and to learn new domains quickly. In this way we hope to mainstream artificial intelligence into policy support.

The tool is now fit for purpose. It was thoroughly tested in real-life conditions for about two years, mainly in the area of legislative impact assessments for policy formulation, and also in other domains such as large data infrastructure analysis, agri-environmental measures and natural disasters, some of which are detailed in this document.

This approach strengthens the JRC's strategic focus on the application of scientific analysis and development. The service adds to the JRC's competence and central position in semantic reasoning for policy analysis, active information recommendation, and inferred knowledge in policy design and development.
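
The idea of mapping documents into a normalised vector space and then searching it can be illustrated with a small sketch. This is not the tool's actual pipeline, which the text does not describe in detail. As a stand-in it uses plain TF-IDF weighting with L2 normalisation and cosine similarity, and the sample documents, the function names and the query are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus; a real corpus would hold many thousands of documents.
DOCS = [
    "flood risk assessment for river basins",
    "agri-environmental measures and subsidies for farmers",
    "impact assessment of data infrastructure legislation",
]


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def normalise(vec):
    """Scale a sparse vector (term -> weight) to unit L2 length."""
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}


def build_vectors(docs):
    """Return one unit-length TF-IDF vector per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency (an assumption of this sketch).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(normalise({t: tf[t] * idf[t] for t in tf}))
    return vectors, idf


def rank(query, docs):
    """Rank documents by cosine similarity to the query, best match first."""
    vectors, idf = build_vectors(docs)
    # Terms that never occur in the corpus carry no weight and are ignored.
    tf = Counter(t for t in tokenize(query) if t in idf)
    q = normalise({t: tf[t] * idf[t] for t in tf})
    # For unit-length vectors the dot product equals the cosine similarity.
    scores = [sum(q.get(t, 0.0) * w for t, w in v.items()) for v in vectors]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return order, scores


if __name__ == "__main__":
    order, scores = rank("flood risk", DOCS)
    for i in order:
        print(f"{scores[i]:.3f}  {DOCS[i]}")
```

Because every vector has unit length, ranking reduces to a dot product. TF-IDF only matches shared words, so a system that needs to capture phrase meaning and context, as described above, would presumably rely on learned embeddings instead.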