Too much information: better ways to manage data

Humanity is generating ever-increasing amounts of data with genome sequencing and internet use, faster than our computers can handle. An EU-funded project is designing storage and analysis solutions which can help optimise transport networks and advance research into diseases and personalised medicine.

Published: 20 May 2019  
Related theme(s) and subtheme(s)
Information society: Information technology  |  Internet
Innovation
International cooperation
Research policy: Horizon 2020
Science in society: Future science & technology
Countries involved in the project described in the article
Australia  |  Chile  |  Finland  |  Japan  |  Portugal  |  Spain

© Sikov #222912660, 2019 source: stock.adobe.com

The emergence of cheaper, better technology in the field of molecular biology is enabling the sequencing of the genomes of millions of individual humans and other animal species. This generates enormous volumes of data, and the ability to analyse it is critical to understanding biological organisms and treating diseases.

Similarly, the volume of information on the web is increasing as, consciously or not, we generate it in our daily lives through clicks, likes, searches, downloads, uploads, and even our mere connection to the web. The sheer volume of all this data poses challenges for current computational storage, management and indexing systems.

The EU-funded BIRDS project is working on the problem by designing new compressed data structures, indexes and algorithms that provide better storage, processing and querying of large volumes of data. One approach takes advantage of the repetitiveness of, and patterns in, the data.
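To illustrate why repetitiveness matters, the sketch below applies a textbook technique, the Burrows-Wheeler transform followed by run-length encoding, to a toy repetitive sequence. It is not taken from the BIRDS project; the function names and the sample string are assumptions chosen for illustration, and the rotation-sorting approach is only practical for short inputs.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform via sorted rotations (toy version for short strings)."""
    s = text + sentinel  # the sentinel is assumed not to occur in the text
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def run_length_encode(s: str) -> list[tuple[str, int]]:
    """Collapse consecutive equal characters into (character, count) pairs."""
    runs: list[tuple[str, int]] = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


# Hypothetical sample: a short motif repeated many times, standing in for
# the kind of repetition found in collections of similar genomes.
repetitive = "GATTACA" * 20
transformed = bwt(repetitive)
runs = run_length_encode(transformed)

print(f"{len(transformed)} characters stored as {len(runs)} runs")
```

The transform groups characters that precede similar contexts, so a repetitive input turns into a few long runs that can be stored compactly. Indexes for highly repetitive texts build on this kind of property, although real implementations are considerably more involved.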

Some members of the project team have obtained important results for indexing highly repetitive texts, such as genomic databases, which contain information on an organism’s DNA.

‘We expect these results will have a great impact on current bioinformatic software, such as Bowtie – a software package that can align and analyse sequences in bioinformatics – and on building indexes for large genomic databases efficiently,’ says project coordinator Susana Ladra of the University of A Coruña, Spain.

‘This can revolutionise the field of bioinformatics and help, for example, with the discovery of rare diseases.’

In the case of the analysis of biological sequences, new algorithms can be used to identify mutations and gene rearrangements present in cancer genomes, which can be essential for understanding the disease and developing targeted therapeutics, says Ladra.

The project is focusing on three lines of research: algorithms for sequence analysis, compression and indexing techniques for repetitive data, and data structures and algorithms for network analysis.

More efficient transport

Researchers from several partner institutions are working together on storing large amounts of data with spatio-temporal information – such as the positions of moving objects like boats and planes – and then locating a specific object within it. These challenges can be addressed using structures that compress the data while keeping an index, so that information can be accessed without having to decompress everything.
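The idea of reading compressed data without decompressing all of it can be sketched with a simple, generic scheme: store only the differences between consecutive values, plus an absolute value at regular intervals. This is a minimal illustration under assumed names (`SampledDeltaSequence`, `sample_every`) and made-up coordinates, not the structure used by the project. Plain Python lists do not actually save memory; a real implementation would pack the small differences into a few bits each.

```python
class SampledDeltaSequence:
    """Integers stored as differences, with periodic absolute checkpoints."""

    def __init__(self, values: list[int], sample_every: int = 4) -> None:
        self.sample_every = sample_every
        self.length = len(values)
        # One absolute value per block acts as the index.
        self.checkpoints = values[::sample_every]
        # Within a block, keep only the change from the previous value.
        self.deltas = [
            values[i] - values[i - 1] if i % sample_every else 0
            for i in range(len(values))
        ]

    def get(self, i: int) -> int:
        """Recover the i-th value by decoding at most one block."""
        if not 0 <= i < self.length:
            raise IndexError(i)
        block = i // self.sample_every
        value = self.checkpoints[block]
        for j in range(block * self.sample_every + 1, i + 1):
            value += self.deltas[j]
        return value


# Hypothetical x-coordinates of a boat sampled at regular time steps.
positions = [100, 102, 103, 107, 110, 111, 115, 120, 121]
seq = SampledDeltaSequence(positions, sample_every=4)
print(seq.get(6))
```

A lookup touches one checkpoint and at most `sample_every - 1` differences, so the cost of a query does not grow with the length of the whole sequence. Choosing a larger `sample_every` saves space at the price of slower access.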

Another project team is working on real transportation problems, with a focus on the public transport systems in Santiago, Chile and Madrid, Spain. Using compressed representations for journeys over previously known networks, they are seeking solutions for vehicle route planning, addressing the needs expressed by some private companies.

Commercialising results

The idea of using compact data structures arose when researchers noted the similarity between most object or trajectory movements in transportation systems, making this information highly compressible.

Having studied the different kinds of queries that can be solved with classic spatio-temporal indexes, they designed compact data structures that answer these queries efficiently while using less space. Extra information and new functions can be added to these data structures, depending on the requirements.
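As a rough illustration of why similar trajectories compress well, the sketch below stores each distinct route only once and represents every trip as a reference to its route. The stop names, function names and the query are hypothetical and greatly simplified compared with a real compact data structure.

```python
from collections import Counter


def compress_trips(trips: list[list[str]]) -> tuple[list[tuple[str, ...]], list[int]]:
    """Store each distinct stop sequence once; trips become route references."""
    route_ids: dict[tuple[str, ...], int] = {}
    routes: list[tuple[str, ...]] = []
    trip_refs: list[int] = []
    for trip in trips:
        key = tuple(trip)
        if key not in route_ids:
            route_ids[key] = len(routes)
            routes.append(key)
        trip_refs.append(route_ids[key])
    return routes, trip_refs


def count_trips_through(stop: str, routes: list[tuple[str, ...]], trip_refs: list[int]) -> int:
    """Count trips visiting a stop by checking each distinct route only once."""
    trips_per_route = Counter(trip_refs)
    return sum(n for route_id, n in trips_per_route.items() if stop in routes[route_id])


# Hypothetical journeys over a small, previously known network of stops.
trips = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "D", "C"],
    ["A", "B", "C"],
]
routes, trip_refs = compress_trips(trips)
print(len(routes), "distinct routes for", len(trips), "trips")
print(count_trips_through("B", routes, trip_refs), "trips pass through stop B")
```

Because the query runs over the distinct routes rather than over every trip, both the space used and the work done shrink as more trips share the same route.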

‘Results on trajectories can be commercialised and used by airlines, for example, to know how they can optimise routes, where they can save fuel, or which flights had problems on their routes,’ says Ladra.

Information on vehicle trajectories can also be used to track ships, detect fishing in a prohibited area, determine which routes are more popular, or pinpoint those that can be improved.

The project was funded through the Marie Skłodowska-Curie Research and Innovation Staff Exchange (RISE).

One project goal is to increase the number of new researchers attracted to this field on an international scale and improve the education of PhD candidates and postdoctoral researchers. ‘We expect better research will be carried out in Europe thanks to RISE funding,’ concludes Ladra.

The BIRDS project involves seven research institutions: the University of Melbourne, Australia; the University of Chile and the University of Concepción, Chile; the University of Helsinki, Finland; Kyushu University, Japan; Instituto de Engenharia de Sistemas e Computadores, Investigação e Desenvolvimento em Lisboa, Portugal; and the University of A Coruña, Spain, alongside the Spanish SME Enxenio SL.

Project details

  • Project acronym: BIRDS
  • Participants: Spain (Coordinator), Finland, Portugal, Chile, Austria, Japan
  • Project N°: 690941
  • Total costs: € 648 000
  • EU contribution: € 648 000
  • Duration: January 2016 to December 2019
