GENOMICS
Maritime secrets added to biological repository
A multinational biological
information consortium, UniProt, has added a new database repository to
its family of protein sequence databases. Such protein sequence
databases are a crucial resource for molecular biologists. The
information accumulated in them is central to biological research
because of the functions that proteins carry out in cells and their
crucial roles in disease processes.
Nucleosome.
Proteins' basic amino acids (left, blue)
bind to the acidic phosphate groups on DNA (right, red).
Proteomics research, the large-scale study of proteins and their
interactions, has accelerated in recent years because of technological
advances in protein science and the large amounts of genomic data
pouring out of the Human Genome Project (HGP). The UniProt consortium
aims to support biological research by maintaining a high-quality
database that serves as a stable, comprehensive, fully classified,
richly and accurately annotated protein sequence knowledge base, with
extensive cross-references and querying interfaces freely accessible to
the scientific community.
In a major leap forward for researchers everywhere, UniProt has added a
new database repository for metagenomic and environmental data to its
family of protein sequence databases. Metagenomics is the large-scale
genomic analysis of microbes recovered from environmental samples, as
opposed to laboratory-grown organisms, which represent only a small
proportion of the microbial world.
The UniProt Consortium comprises the European Molecular Biology
Laboratory’s European Bioinformatics Institute (EMBL-EBI), the
Swiss Institute of Bioinformatics (SIB), and the Protein Information
Resource (PIR) hosted by the National Biomedical Research Foundation
(NBRF) at the Georgetown University Medical Center in Washington, D.C.,
USA.
The UniProt Metagenomic and Environmental Sequences (UniMES) database
currently contains the data from the Global Ocean Sampling Expedition
(GOS), which was originally submitted to the International Nucleotide
Sequence Database Collaboration (INSDC).
The initial GOS dataset comprises 28 million DNA sequences from
oceanic microbes, from which nearly 6 million proteins are predicted. By
combining the predicted protein sequences with automatic classification
by InterPro, the EBI’s integrated resource for protein families,
domains and functional sites, UniMES uniquely provides free access to
the array of genomic information gathered from sampling expeditions,
enhanced by links to further analytical resources. Genomics holds the
key to understanding a significant part of the world around us, and the
metagenomic and environmental data represent a step forward in further
charting genomic diversity.
With the increasing volume and variety of protein sequences and
functional information that has become available, UniProt effectively
serves as the central database of protein sequence and function. It has
become a cornerstone for a wide range of scientists active in modern
biological research, especially in the field of proteomics. Researchers
working at the PIR site have also made great strides in automating the
computational analysis of proteins.
As a publicly funded project, UniProt makes all of its data freely
accessible and releases it in a timely manner. The UniProt website
serves as the access point for these data.

More information:
The Universal Protein Resource
European Molecular Biology Laboratory
National Human Genome Research Institute
Integrated Protein Informatics Resource for Genome and Proteomic Research