Research & Innovation - Participant Portal


TOPIC : Managing, preserving and computing with big research data

Topic identifier: EINFRA-1-2014
Publication date: 11 December 2013

Types of action: RIA Research and Innovation action
Opening date: 11 December 2013
Deadline: 02 September 2014 17:00:00 (Brussels time)

Horizon 2020
Topic Description

Specific challenge: Development and deployment of integrated, secure, permanent, on-demand service-driven, privacy-compliant and sustainable e-infrastructures incorporating advanced computing resources and software are essential in order to increase the capacity to manage, store and analyse extremely large, heterogeneous and complex datasets[1], including text mining of large corpora. These e-infrastructures need to provide services cutting across a wide range of scientific communities and addressing a diversity of computational requirements, legal constraints and requirements, system and service architectures, formats, types, vocabularies and legacy practices of scientific communities that generate, analyse and use the data.

Scope: Proposals should address at least one of the first five (5) activities, or activities 6, 7 or 8 individually. Proposers are encouraged to build on prior work on open prototype services and to use discoverable service catalogues, common APIs, service-level agreements (SLAs) and transparent billing.

(1) Establishing a federated pan-European data e-infrastructure to provide cost-effective and interoperable solutions for data management and long-term preservation. The needs for data access, storage, replication, annotation, search, compute, analysis and reuse of information across disciplines should be accommodated in different research and education contexts. All these functions should expose standard interfaces for interoperation with other data sources to aggregate them or to be aggregated, considering also ethical and regulatory requirements for sensitive data (e.g. patient data). Sustainability is of paramount importance, therefore robust business models should be proposed to encourage investment from all stakeholders. Foreseen challenges are technical, legal and organisational, including engaging e-infrastructure operators and other service providers (such as those receiving support under topics EINFRA-2-2014, EINFRA-3-2014, and EINFRA-7-2014);

(2) Services to ensure the quality and reliability of the e-infrastructure, including certification mechanisms for repositories and certification services to test and benchmark capabilities in terms of resilience and service continuity of e-infrastructures;

(3) Federating institutional and, if possible, private data management and curation tools and services used across, or at some point in, the full data lifecycle, including approaches for identification of open data sources and data collected with sensitive or restricted access features. Services and tools should be federated on the basis of an open architecture and should offer or coordinate support to the development of Data Management Plans, in particular for Horizon 2020 project participants;

(4) Large-scale virtualisation of data/compute centre resources to achieve on-demand compute capacities, improve flexibility for data analysis and avoid unnecessary, costly transfers of large data volumes;

(5) Development and adoption of a standards-based computing platform (with open software stack) that can be deployed on different hardware and e-infrastructures (such as clouds providing infrastructure-as-a-service (IaaS), HPC, grid infrastructures…) to abstract application development and execution from available (possibly remote) computing systems. This platform should be capable of federating multiple commercial and/or public cloud resources or services and of delivering Platform-as-a-Service (PaaS) adapted to the scientific community with a short learning curve. Adequate coordination and interoperability with existing e-infrastructures (including GÉANT, EGI, PRACE and others) is recommended;

(6) Support to the evolution of EGI (European Grid Infrastructure) towards a flexible compute/data infrastructure capable of federating and enabling the sharing of resources of any kind (public or private, grid or cloud, etc.) in order to offer computing and storage services to the whole European scientific community. The proposal will address operations for supplying services (IaaS, PaaS, SaaS) at European level, engagement of new user communities and tailoring of services to them, and dissemination activities.

(7) Proof of concept and prototypes of data infrastructure-enabling software (e.g. for databases and data mining) for extremely large or highly heterogeneous data sets scaling to zettabytes and trillions of objects. Clean-slate approaches to data management targeting 2020+ 'data factory' requirements of research communities and large-scale facilities (e.g. ESFRI projects) are encouraged.

(8) Enable the creation of a platform and infrastructure for mining text aggregated from different sources/publishers that responds to the needs of users (researchers). This includes the definition of technical requirements (e.g. on interoperability, metadata standards and aggregation of new services) as well as addressing legal and contractual issues to serve the needs of text mining communities. The project should also provide consulting and counselling services to solve problems related to the legal framework and permissions to text mine collections, and to advise researchers on the benefits and practice of text mining. The development of the proposed platform and services should be informed by the studies on policy and licensing issues associated with Text and Data Mining that will be funded from the Call for "Developing governance for the advancement of Responsible Research and Innovation" in the "Science with and for Society" Work Programme (topic GARRI.3.2014 - Scientific Information in the Digital Age: Text and Data Mining). Therefore, the successful proposals in these two calls are expected to engage in a mutual dialogue and establish synergies in their work.

A maximum of EUR 8 million of the total budget for this topic is foreseen for activity (6).  

This topic is complementary to topic INFRADEV-4-2014/2015, as it addresses services that are potentially transversal and generic, whereas INFRADEV-4-2014/2015 addresses interoperability of services and common solutions for clusters of ESFRI and other research infrastructure initiatives in thematic areas.

Expected impact

  • Increased availability of scientific data for scientific communities, regardless of whether they have already embraced e-science; this will be measured by cross-border data traffic over the research networks in Europe as a proxy.

  • Better optimisation of the use of IT equipment for research.

  • Avoiding lock-in to particular hardware or software platforms in the development of science.

  • Scientific communities embrace storage and computing infrastructures as state-of-the-art services become available and the learning curve for their use becomes less steep; this will be measured by the storage capacity available for pan-European use as well as by the number of users of EGI and other production e-infrastructures in this area.

  • Through the development of large pooled and interoperable text mining infrastructures, efficiencies of scale will reduce the overall costs, and more open licensing schemes will spread the use of such licenses and boost the exchange of text mining resources and practices.

Type of action: Research and innovation actions  

[1] Research data include large datasets collected, developed or generated for/by research, integration of small distributed datasets, as well as data not originally collected for research, which may include environmental, social and humanities data.

Cross-cutting Priorities:

Socio-economic science and humanities

Topic conditions and documents

Please read carefully all provisions below before the preparation of your application.

1. List of countries and applicable rules for funding: described in part A of the General Annexes of the General Work Programme.

2. Eligibility and admissibility conditions: described in parts B and C of the General Annexes of the General Work Programme.

3. Evaluation

3.1 Evaluation criteria and procedure, scoring and threshold: described in part H of the General Annexes of the General Work Programme, with the following exceptions:
For the criterion Excellence, in addition to its standard sub-criteria, the following aspects will also be taken into account:
- The extent to which the Networking Activities will foster a culture of co-operation between the participants and other relevant stakeholders;
- The extent to which the Service activities will offer access to state-of-the-art infrastructures, high quality services, and will enable users to conduct excellent research;
- The extent to which the Joint Research Activities will contribute to quantitative and qualitative improvements of the services provided by the infrastructures.

3.2 Guide to the submission and evaluation process

3.3 Specific arrangements for the evaluation of Call H2020-EINFRA-2014-2:
See Part E of the Specific features for Research Infrastructures of the Horizon 2020 European Research Infrastructures (including e-Infrastructures) Work Programme 2014-2015 for the specific proposal conditions.

4. Proposal page limits and layout: Please refer to the specific proposal template Part B.

5. Indicative timetable for evaluation and grant agreement:
Information on the outcome of evaluation: maximum 5 months from the final date for submission.
Signature of grant agreements: maximum 3 months from the date of informing applicants they have been successful.

6. Provisions, proposal templates and evaluation forms for the type(s) of action(s) under this topic:
Research and Innovation Action:

Specific provisions and funding rates
Specific proposal template
Specific evaluation form
Annotated Model Grant Agreement

7. Additional provisions:
Horizon 2020 budget flexibility

8. Open access must be granted to all scientific publications resulting from Horizon 2020 actions, and proposals must refer to measures envisaged. Where relevant, proposals should also provide information on how the participants will manage the research data generated and/or collected during the project, such as details on what types of data the project will generate, whether and how this data will be exploited or made accessible for verification and re-use, and how it will be curated and preserved.

Additional documents

  • H2020-EINFRA-2014-2 - flash call info en
  • Legal basis - Framework Programme H2020 en
  • Legal basis - Specific Programme H2020 en
  • WP H2020 - 1. Introduction en
  • WP H2020 - 4. Research Infrastructures (including e-Infrastructures) en
  • WP H2020 - 19. General Annexes en

Submission Service

No submission system is open for this topic.

Get support

National Contact Points (NCP) – contact your NCP for further assistance.

Enterprise Europe Network – contact your EEN national contact point for advice to businesses with special focus on SMEs. The support includes guidance on the EU research funding.

Research Enquiry Service – ask questions about any aspect of European research in general and the EU Research Framework Programmes in particular.

IT Helpdesk – contact the Participant Portal IT helpdesk for questions such as forgotten passwords, access rights and roles, technical aspects of submission of proposals, etc.

Ethics – to ensure compliance with ethical issues, further information is available on the Participant Portal and on the Science and Society Portal.

European IPR Helpdesk assists you on intellectual property issues.

The European Charter for Researchers and the Code of Conduct for their recruitment

CEN and CENELEC, the European Standards Organisations, advise you how to tackle standardisation in your project proposal. Contact CEN-CENELEC Research Helpdesk at

Partner Search Services help you find a partner organisation for your proposal.

H2020 Online Manual – your online guide on the procedures from proposal submission to managing your grant.