
Description of a few important databases



PubMed Central (Bibliographic databases)


PubMed Central (PMC) is a digital archive that provides access to more than 400,000 full-text articles. It is a free archive of peer-reviewed biomedical and life sciences journal literature at the U.S. National Institutes of Health's National Library of Medicine (NIH/NLM).

PubMed comprises more than 22 million citations for biomedical literature from MEDLINE, life science journals, and online books. Citations may include links to full-text content from PubMed Central and publisher web sites.

PMC is not a publisher and does not publish journal articles itself. Instead, it is a repository for journal literature deposited by participating publishers, as well as for author manuscripts that have been submitted in compliance with the Public Access Policy mandated by NIH.

The value of PMC lies not in its capacity to store data, but in its capacity to cross-reference data from diverse sources using a common format within a single repository. In addition, a user can quickly search the entire collection of full-text articles and locate all relevant material using NCBI's Entrez search system.
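As a rough illustration of searching PMC through Entrez, the sketch below builds a query URL for NCBI's public E-utilities `esearch` endpoint. The function name `build_pmc_search_url` and the example search term are made up for this post, and the code only constructs the URL; it does not send the request or parse a response.

```python
from urllib.parse import urlencode

# Base address of NCBI's Entrez E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_pmc_search_url(term: str, max_results: int = 20) -> str:
    """Build an esearch URL that queries the PMC database (hypothetical helper)."""
    params = {
        "db": "pmc",            # search PubMed Central rather than PubMed
        "term": term,           # free-text query
        "retmax": max_results,  # maximum number of IDs to return
        "retmode": "json",      # ask for a JSON response
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example query; the search term is arbitrary.
url = build_pmc_search_url("protein folding", max_results=5)
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return a list of matching PMC record IDs.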

PMC also allows for the integration of its literature with a variety of other information resources that can enhance the research and knowledge fields of scientists, clinicians and others.



PDB Database (Structural database)
The Protein Data Bank (PDB) archive is the single worldwide repository of information about the 3D structures of large biological molecules, including proteins and nucleic acids.
These molecules come from a wide range of organisms, including bacteria, yeast, plants, flies, other animals, and humans.

The PDB was established in 1971 at Brookhaven National Laboratory and originally contained 7 structures. In 1998, the Research Collaboratory for Structural Bioinformatics (RCSB) became responsible for the management of the PDB. In 2003, the wwPDB was formed to maintain a single PDB archive of macromolecular structural data that is freely and publicly available to the global community. It consists of organizations that act as deposition, data processing and distribution centers for PDB data.

The PDB archive is freely available and is updated each week at the target time of Wednesday 00:00 UTC (Coordinated Universal Time). The most recent release is time stamped and linked on every page in the top right header.
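The weekly schedule can be turned into a small calculation. The sketch below computes the next Wednesday 00:00 UTC after a given moment; the function name `next_pdb_release` is invented for this post, and since 00:00 UTC is only a target time, the result is an estimate rather than a guaranteed release time.

```python
from datetime import datetime, timedelta, timezone

WEDNESDAY = 2  # datetime.weekday() counts Monday as 0


def next_pdb_release(now: datetime) -> datetime:
    """Return the next Wednesday 00:00 UTC strictly after `now`.

    `now` should be timezone-aware; it is converted to UTC first.
    """
    now_utc = now.astimezone(timezone.utc)
    midnight = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    days_ahead = (WEDNESDAY - midnight.weekday()) % 7
    candidate = midnight + timedelta(days=days_ahead)
    if candidate <= now_utc:
        # It is already Wednesday (at or past 00:00), so move to next week.
        candidate += timedelta(days=7)
    return candidate


# Example: a Monday at noon UTC.
example = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(next_pdb_release(example).isoformat())
```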

The main aim of the PDB database is to provide protein structural knowledge that can be used to help deduce a structure's role in human health and disease, and in drug development, since understanding the shape of a molecule helps to understand how it works. The structures in the database range from tiny proteins and bits of DNA to complex molecular machines like the ribosome.

The data, i.e. 3-D structural data of large biological molecules such as proteins and nucleic acids, are typically obtained by X-ray crystallography or NMR spectroscopy and submitted by biologists and biochemists from around the world.

The contents of the PDB are regarded as primary data, and hundreds of derived (i.e., secondary) databases categorize the data differently. For example, both SCOP and CATH categorize structures according to type of structure and assumed evolutionary relations.

The PDB can be accessed through the website www.rcsb.org/pdb. The RCSB (Research Collaboratory for Structural Bioinformatics) is one of several sister organizations, alongside PDBj and PDBe, that provide and maintain access to the PDB database.

The main aim of the RCSB is to improve our understanding of the function of biological systems through the study of the 3-D structure of biological macromolecules.

KEGG PATHWAY Database:
The KEGG (Kyoto Encyclopedia of Genes and Genomes) PATHWAY database is publicly accessible at http://www.genome.ad.jp/kegg/.

KEGG was initiated by the Japanese human genome programme in 1995. Originally, the KEGG database was a collection of manually drawn pathway maps representing our knowledge of molecular interactions and reaction networks.

Today, KEGG has evolved into a suite of databases and associated software for understanding and simulating higher-order functional behaviors of the cell or the organism from its genome information. Its developers consider KEGG to be a "computer representation" of the biological system.

Mode of working:
  • First, KEGG computerizes data and knowledge on protein interaction networks (PATHWAY database) and chemical reactions (LIGAND database) that are responsible for various cellular processes.
  • Second, KEGG attempts to reconstruct protein interaction networks for all organisms whose genomes are completely sequenced (GENES and SSDB databases).
  • Third, KEGG serves as a reference knowledge repository for functional genomics (EXPRESSION database) and proteomics (BRITE database) experiments.
 
MBGD (Microbial Genomic Database):
MBGD is a database for comparative analysis of completely sequenced microbial genomes and is available at http://mbgd.genome.ad.jp/.

The aim of MBGD is to facilitate comparative genomics from various points of view, such as ortholog identification, paralog clustering, motif analysis and gene order comparison. The main function of MBGD is therefore to create an orthologous or homologous gene cluster table. It not only provides comprehensive orthologous groups among the latest genomic data available, but also allows users to create their own ortholog groups on the fly using a specified set of organisms.

The database now contains well over 1000 published genomes including 27 eukaryotic microbes and 4 multicellular organisms.
  
Mode of working: 
  • First, similarities between all genes are precomputed and stored in the database, in addition to gene annotations such as the function categories assigned by the original authors and the motifs found in the translated sequence.
  • Using these homology data, MBGD dynamically creates an orthologous gene cluster table.
  • Users can further analyze multiple genomes from various points of view with functions such as global map comparison, local map comparison, multiple sequence alignment and phylogenetic tree construction.
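To make the idea of building a cluster table on the fly from precomputed similarities more concrete, here is a toy sketch. It is not MBGD's actual clustering method: it simply links any two genes whose precomputed score passes a threshold (single-linkage grouping with union-find), restricted to the organisms the user selects. The gene identifiers, organism codes, scores and the `min_score` threshold are all invented for illustration.

```python
from collections import defaultdict


def organism(gene: str) -> str:
    """Organism prefix of a gene id such as 'eco:geneA' (hypothetical format)."""
    return gene.split(":", 1)[0]


def cluster_orthologs(hits, selected_organisms, min_score=100.0):
    """Group genes into clusters from precomputed (gene_a, gene_b, score) hits.

    Only hits between selected organisms with score >= min_score are used.
    Returns a list of rows; each row maps organism -> sorted gene ids.
    """
    parent = {}

    def find(gene):
        # Union-find lookup with path halving.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for a, b, score in hits:
        if score < min_score:
            continue
        if organism(a) not in selected_organisms or organism(b) not in selected_organisms:
            continue
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(set)
    for gene in list(parent):
        clusters[find(gene)].add(gene)

    table = [
        {org: sorted(g for g in members if organism(g) == org)
         for org in sorted(selected_organisms)}
        for members in clusters.values()
    ]
    # Order rows by their alphabetically smallest gene id for stable output.
    table.sort(key=lambda row: min(g for genes in row.values() for g in genes))
    return table


# Invented similarity data standing in for the precomputed homology scores.
hits = [
    ("eco:geneA", "bsu:geneX", 250.0),
    ("bsu:geneX", "hin:geneP", 180.0),
    ("eco:geneB", "bsu:geneY", 90.0),   # below the threshold, ignored
    ("eco:geneC", "hin:geneQ", 300.0),
]

for row in cluster_orthologs(hits, {"eco", "bsu", "hin"}):
    print(row)
```

Changing the set of selected organisms changes the table without recomputing any similarities, which is the point of storing them in advance.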
