Sharing and comparing to defeat emerging diseases
When the nature of a disease is unknown, it is difficult to be prepared. Think Ebola, avian influenza or SARS – news reports of their spread were regularly accompanied by updates on the frantic hunt for vaccines and treatments. EU-funded researchers hope to stay one step ahead from now on with a new platform for detecting and analysing outbreaks.
Evidence suggests that the rapid exchange and comparison of data is essential to identifying and tackling both emerging and re-emerging diseases, as well as foodborne disease outbreaks. As the number of global outbreaks continues to climb – infectious diseases are currently the cause of 23% of all deaths worldwide – a team of EU-funded researchers is seeking to overcome old barriers to sharing information, and to gather all data (pathogenic as well as clinical and epidemiological) in one place.
“Until now, we have been writing history,” explains COMPARE project coordinator Frank Aarestrup of the Technical University of Denmark. “We have been describing outbreaks, but not preventing human cases.”
Obstacles to a common platform are physical, procedural, technical, legal and even linguistic. And when Aarestrup refers to language, he’s not just talking about different mother tongues or different terminology between scientific disciplines: “You wouldn’t believe the number of ways people spell salmonella!”
The arrival of the sequencer, which reads the exact order of bases within a DNA molecule, has overcome some language problems – “DNA is all As, Cs, Ts and Gs,” says Aarestrup.
But sequencers also strengthen the case for greater collaboration – the data they supply need interpreting, and it makes no sense for every individual laboratory to recruit a specialised bioinformatician capable of doing this.
The COMPARE solution is a global service that not only interprets data but also shares them with the authorities when relevant, so that they can take action to forestall or tackle a disease outbreak.
Storing information centrally also becomes more practical in the era of big data – individual laboratories cannot afford the processing power to deal with hundreds of thousands of sequences, or indeed the space to store them.
A question of privacy
Other challenges include the different ways in which national reference laboratories and clinical labs are used to working, and identifying a data-sharing approach that protects privacy but provides enough information – and fast enough – to be helpful. Should extensive information be sent out to a fairly closed network, perhaps on a weekly basis? Or should more limited information be made public in real time? “We are trying to find a flexible solution that is somewhere between the two approaches,” explains Aarestrup.
Setting boundaries on confidentiality may also be a concern for companies. If a foodborne outbreak is traced back to a product for which a company provided sequencing data 10 years ago, the company’s reputation could be damaged, even if the company is no longer using the same supplier.
“This is a true and right concern, and if we need information from these types of data providers, we need to respect these issues,” says Aarestrup.
Against this background, the project also foresees different sites for private, shared and public data, and will define which data sets should go where.
Step forward biomathematicians
In addition to data, the platform is conceived as an access point for tools. COMPARE is identifying the most effective tools for each stage of the workflow, and it will also be possible for any biomathematician to upload his or her own. “We don’t believe we’re the Masters of the Universe,” says Aarestrup by way of explanation. “Why not let others upload their tools? If theirs are better, we will throw out what we have done. We are taking advantage of the global research community.”
Less than one year into the project, the team has already defined sampling strategies and protocols, while some tools are already online. “There are 1 000 users daily, so we can see a clear need for these tools,” says Aarestrup.
The tools will stay in place after the project ends in 2019, becoming part of the EMBL’s European Nucleotide Archive.