Digital Agenda for Europe
A Europe 2020 Initiative

Project factsheets - Language Technologies

The language technologies portfolio includes projects from the 7th Framework Programme (FP7) and the Competitiveness and Innovation Framework Programme (CIP) ICT Policy Support Programme.

 The topics covered are:

  • automated translation,
  • multilingual content authoring and management,
  • speech technology and interactive services,
  • content analytics,
  • language resources, and
  • collaborative platforms.
EU Investments
  • A European Community of SMEs built on Environmental Digital Content and Languages

    The INSPIRE Directive 2007/2/EC establishes an Infrastructure for Spatial Information in Europe and requires large amounts of environmental digital content to be made accessible across Europe. The resulting data pool is expected to be of huge value for a myriad of value-added applications...
  • A service platform for aggregation, processing and analysis of urban and regional planning data

    Urban and regional planning data sets have not been aggregated so far, and it is thus very difficult to use them for any purpose other than printing or simple publishing by the authorities that created them...
  • Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of Machine Translation

    The lack of sufficient linguistic resources for many languages and domains is currently one of the major obstacles to further advancement of automated translation...
  • Annotation Resource Marketplace in the Cloud

    ANNOMARKET aims to revolutionise the text annotation market by delivering an affordable, open marketplace for pay-as-you-go, cloud-based extraction resources and services in multiple languages. This project is driven by a commercially dominated consortium from 3 EU countries, with 43% of the budget assigned to SMEs. The key differentiating feature of ANNOMARKET is its open marketplace concept...
  • Automated Community Content Editing PorTal

    The use of machine translation (MT) is becoming much more pervasive. At the same time, Web 2.0 paradigms are democratising content creation - stressing the value of communities of users creating content for each other. However, right now these two trends are fairly incompatible. MT engines, even statistical engines, cannot produce acceptable results for community content due to the extreme variability within the content...
  • Bridges Across the Language Divide

    Today Europe is facing larger and more critical language challenges than ever before. The production of multilingual content now far outpaces our ability to translate it by human effort and we must turn to automatic methods to cope. Thus, effective and innovative alternatives must be provided to Europe's citizens and businesses...
  • Cognitive Analysis and Statistical Methods for Advanced Computer Aided Translation

    The CASMACAT project will build the next generation translator's workbench to improve productivity, quality, and work practices in the translation industry. We will carry out cognitive studies of actual unaltered translator behaviour based on key logging and eye tracking...
  • Commercially empowered Linked Open Data Ecosystems in Research

    Linked Open Data (LOD) shows enormous potential in becoming the next big evolutionary step of the WWW. However, this potential remains largely untapped due to missing usage and monetisation strategies. CODE's vision is to establish the foundation for a web-based, commercially oriented ecosystem for Linked Open Data. This ecosystem establishes a sustainable and commercial value-creation-chain among traditional (e.g. data provider and consumer) and non-traditional (e.g...
  • Computing Veracity Across Media, Languages, and Social Networks

    Social media poses three major computational challenges, dubbed by Gartner the 3Vs of big data: volume, velocity, and variety. Content analytics methods have faced additional difficulties, arising from the short, noisy, and strongly contextualised nature of social media. In order to address the 3Vs of social media, new language technologies have emerged, e.g...
  • Cross-lingual Knowledge Extraction

    The goal of the X-LIKE project is to develop technology to monitor and aggregate knowledge that is currently spread across global mainstream and social media, and to enable cross-lingual services for publishers, media monitoring and business intelligence. In terms of research contributions, the aim is to combine scientific insights from several scientific areas to contribute in the area of cross-lingual text understanding...
  • Data Supply Chains for Pools, Services and Analytics in Economics and Finance

    DOPA will enable European SMEs to become key players in the global data economy, as the impacts of the DOPA RTD activities will likely materialize on both the supply and the demand side of B2B vertical market segments of data-related services. DOPA has been designed to provide solutions to gaps observed in the EU environment, pointing out the need to develop and implement a source and exploitation platform for economic and financial information in Europe that would altogether: - Have th..
  • Democratizing Fleet Management

    GPS positioning devices are becoming a commodity sensor platform with the emergence and popularity of smartphones and ubiquitous networking. While the positioning capability has been exploited in location-based services, its spatiotemporal cousin, tracking, has so far only been considered in costly and complex fleet management applications...
  • Distant-speech Interaction for Robust Home Applications

    The DIRHA project addresses the development of voice-enabled automated home environments based on distant-speech interaction in different languages. A distributed microphone network is installed in the rooms of a house in order to selectively monitor acoustic and speech activities observable inside any space, and eventually to run a spoken dialogue session with a given user in order to implement a service or to provide access to appliances and other devices...
  • Educational curriculum for the usage of Linked Data

    Linked Data has established itself as the de facto means for the publication of structured data over the Web, enjoying amazing growth in terms of the number of organizations committing to use its core principles for exposing and interlinking data sets for seamless exchange, integration, and reuse. More and more ICT ventures offer innovative data management services on top of Linked (Open) Data, creating a demand for data practitioners possessing skills and detailed knowledge in this area...
  • EUMSSI - Event Understanding through Multimodal Social Stream Interpretation

    The main objective of EUMSSI is developing technologies for identifying and aggregating data presented as unstructured information in sources of very different nature (video, image, audio, speech, text and social context), including both online (e.g., YouTube) and traditional media (e.g. audiovisual repositories), and for dealing with information of very different degrees of granularity...
  • EXploring Customer Interactions through Textual EntailMENT

    Identifying semantic inference relations between texts is a major underlying language processing task, needed in practically all text understanding applications. For example, Question Answering and Information Extraction systems should verify that extracted answers and relations are indeed inferred from the text passages; multi-document text summarization needs to infer that one sentence entails another in order to avoid redundantly including both in a summary; and so on...
  • Feedback Analysis for User adaptive Statistical Translation

    The FAUST project will develop machine translation (MT) systems which respond rapidly and intelligently to user feedback. Current web-based MT systems provide high-volume translation without real-time. Most systems provide no opportunity for users to offer opinions or corrections for translation results. Other systems ask users for feedback on translation; however, the user does not see any benefit to providing feedback, because the translation does not change in response to it...
  • Fusing and pooling information for product/service development and research (Fusepool)

    Fusepool develops a user-adaptive «Living Knowledge Pool» for product development and research. Compared to existing search and knowledge management solutions, Fusepool provides two core benefits: the automated transformation of content from web-harvesting and participating organizations into structured Linked Open Data format, and the automated group-specific optimization of knowledge finding and matching based on transfer learning from individual users...
  • Get Home Safe: Extended Multimodal Search and Communication Systems for Safe In-Car Application

    The aim of the proposed project is to develop a system for safe information access (search, navigation, point of interest) and communication (texting) while driving. In order to reach that goal, we approach the problem from a holistic view, investigating the underlying "driving forces", studying the goals underlying searching and texting. Special attention will be paid to task and context factors such as multi-tasking and the associated cognitive load for the driver...
  • GNSS DAta Pool for PerFormances PredIction and SimuLation of New AppLications for DevelopERs

    With the increasing criticality of GPS applications, product manufacturers and developers are now requesting a relevant means to predict the performance and reliability of their future applications...