Bioinformatics provides researchers with efficient computer-based tools for storing, retrieving and interpreting the vast quantities of data being generated by research into functional genomics. By making these data available to the academic and industrial research communities in an accessible and usable form, bioinformatics ensures that the potential for genomics research to benefit human health is maximised.
In their rawest form, genomic data have little real value. For the data to become biologically meaningful, researchers need to be able to define and interpret the features of a particular genome and store this information in a form which can be shared and understood by other members of the research community – a process called genome annotation.
Annotation combines computational tools with biological knowledge derived from experiments. It involves a detailed analysis of DNA samples: comparing new DNA sequences with known sequences, identifying common characteristics, and assigning known or potential functions to sections of the DNA sequence and its products.
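As a rough illustration of the comparison step, the sketch below scores a new DNA sequence against a handful of already annotated sequences and borrows the function label of the closest match. It is a toy example under stated assumptions, not how production annotation pipelines work: those typically rely on dedicated alignment and gene-prediction software. The similarity measure (shared short subsequences, or k-mers), the threshold, the function names and the example labels are all hypothetical choices made for this illustration.

```python
def kmers(seq, k=4):
    """Return the set of all overlapping k-length substrings of a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def annotate(new_seq, known, k=4, threshold=0.5):
    """Return (label, score) for the most similar known sequence.

    `known` maps a hypothetical function label to a reference sequence.
    The label is None when no reference reaches `threshold`, meaning the
    new sequence is left unannotated rather than given a weak guess.
    """
    query = kmers(new_seq, k)
    best_label, best_score = None, 0.0
    for label, ref_seq in known.items():
        score = jaccard(query, kmers(ref_seq, k))
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Made-up reference sequences and labels, for illustration only.
known_sequences = {
    "kinase-like": "ATGGCGTACGTTAGC",
    "transporter-like": "TTTTCCCCGGGGAAAA",
}

label, score = annotate("ATGGCGTACGTTAGC", known_sequences)
```

A sequence identical to a reference receives that reference's label with a score of 1.0, while a sequence sharing no k-mers with any reference is returned without a label. Any function assigned this way is only a prediction and would still need experimental validation.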
Data from genomics research are becoming overwhelming both in their quantity and their complexity. New techniques using microarrays, proteomics and structural genomics approaches are generating data with multiple dimensions representing the complex interactions that occur between different molecular elements in the cell, both in time and space. Just a single microarray experiment can yield information on the expression patterns of thousands of genes.
To ensure that the huge potential of new genomic technologies, and the data they generate, is harnessed efficiently, bioinformaticians are developing new ‘data mining’ tools which can sift through the mountains of raw data, extract relevant information, identify patterns and generate explanatory and predictive models – for gene function or protein structure, for example. As these tools, in turn, produce their own predictions which need to be tested and validated experimentally, an important part of bioinformatics is feedback of validated results into the system to constantly improve the methods being used.
The frantic activity of the first years of genomic research led to many different genomics databases – each dedicated to a particular area of genomics, or a particular organism – springing up all over the world. As a result, bioinformatics research has become somewhat fragmented. New bioinformatics tools will ensure that these different sources are fully integrated and searchable.
Powerful data mining tools combined with enhanced genome annotation, incorporating a range of concepts and observations from different technologies and different database sources, will enable researchers to ask increasingly complex questions about data at many different levels, and to make important connections between initially disparate observations and information, and between organisms.
Future developments in bioinformatics research are likely to bring molecular biologists involved in genome experiments and bioinformaticians closer together, as bioinformatics moves on from being a way of storing information to being a fundamental research tool in its own right.
The ability of computers to model or simulate the fundamental biological processes that keep us alive, or the interaction of the many genetic and environmental factors underlying complex diseases, such as asthma and diabetes, will be vital in the discovery and development of the next generation of drugs, medical diagnostics, and therapies. Making these computer-based technologies accessible to the widest possible scientific community, along with guidelines on their most appropriate and efficient use, will open up new avenues of research and ensure that no opportunities are missed to transform genomic data into essential knowledge for human health.