Protein sequence & structure resources
Protein Sequence Databases
Entrez Protein (NCBI)
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Protein
The content of the Entrez Protein database is derived from several protein
databases (Swiss-Prot, PIR, PDB) and translations of the GenBank nucleotide
sequence records. Records link to related files for nucleotide and genome sequences
and, where available, structural information.
IPI International Protein Index (EMBL-EBI, European Bioinformatics
Institute)
http://www.ebi.ac.uk/IPI/IPIhelp.html
IPI provides access to the major databases covering the proteomes of higher eukaryotic
organisms, along with a database of cross-references between the primary data
sources. IPI is assembled from protein sequence information taken from UniProtKB/Swiss-Prot,
UniProtKB/TrEMBL, RefSeq, Ensembl, TAIR, Vega, and H-InvDB. IPI creates
a complete, minimally redundant set of proteins for each featured species.
PIR-PSD (Protein Information Resource-International Protein Sequence
Database)
http://pir.georgetown.edu/pirwww/dbinfo/pir_psd.shtml
PIR-PSD was the world's first database of classified and functionally annotated
protein sequences. The database is nonredundant and annotated by experts. The
final version of PIR-PSD was released in December 2004. PIR is one of the
co-founders of the UniProt consortium. PIR-PSD sequences and annotations have been
integrated into the UniProt Knowledgebase, with bi-directional cross-references
between the two databases. PIR also provides several other protein-related databases.
RefSeq (The Reference Sequence Collection; NCBI)
http://www.ncbi.nlm.nih.gov/RefSeq/
RefSeq aims to provide a comprehensive, non-redundant set of protein sequences
for major research organisms; genomic DNA and transcript (RNA) sequences are also included.
RefSeq standards provide a stable reference for expression studies and comparative
analyses.
Swiss-Prot (Swiss Institute of Bioinformatics and the European Bioinformatics
Institute)
http://www.ebi.ac.uk/swissprot/
Swiss-Prot is a manually curated and extensively annotated protein sequence database with
a minimal level of redundancy. Annotations may include descriptions of protein
function, domains, structure, and post-translational modifications. Swiss-Prot
is also integrated with other databases. Swiss-Prot is part of the UniProt
consortium, and its contents have been incorporated into the UniProt Knowledgebase.
TrEMBL (Swiss Institute of Bioinformatics and the European Bioinformatics
Institute)
http://www.ebi.ac.uk/trembl/
TrEMBL (Translated EMBL) is an automatically annotated supplement to Swiss-Prot.
The database contains the translations of EMBL/GenBank/DDBJ nucleotide sequences.
TrEMBL is part of the UniProt consortium, and its contents have been incorporated
into the UniProt Knowledgebase.
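To make "translations of nucleotide sequences" concrete, here is a minimal sketch of translating a coding sequence with the standard genetic code. It is an illustration only, not TrEMBL's actual pipeline: the function name `translate` is hypothetical, the input is assumed to be an in-frame coding sequence, and organism-specific genetic codes are ignored.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame coding sequence (hypothetical helper).

    Stops at the first stop codon; unrecognized codons become 'X'.
    Trailing bases that do not form a full codon are ignored.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA as well as DNA
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCTAA"))
```

A real pipeline would typically take the coding region boundaries and the genetic code from the annotation on each nucleotide record rather than assuming them.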
UniProt (Universal Protein Resource)
http://www.ebi.uniprot.org
UniProt provides access to extensive and expertly curated protein information, including
function, classification, and cross-references. UniProt was created by joining
the information contained in Swiss-Prot, TrEMBL, and PIR-PSD.
UniProt has three sections:
- The UniProt Knowledgebase (UniProtKB) is the central
database containing extensive and expertly annotated protein sequences
from Swiss-Prot, TrEMBL, and PIR-PSD.
- The UniProt Reference Clusters (UniRef) are based on the UniProt Knowledgebase. UniRef provides non-redundant reference
data by combining closely related sequences into a single record.
- The UniProt Archive (UniParc) is a comprehensive repository that stores the
complete body of publicly available protein sequence data. Each unique
sequence has a single record with cross-references to its
source databases. Source databases include, in addition to the UniProt
databases, translations from the EMBL-Bank/DDBJ/GenBank nucleotide sequence
databases, the International Protein Index (IPI), the Protein Data Bank
(PDB), NCBI's Reference Sequence Collection (RefSeq), and several other
databases.
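The UniParc idea of one record per unique sequence, carrying cross-references to every source, can be sketched as a dictionary keyed by the sequence itself. This is only an illustration under stated assumptions: `build_archive`, the database names, and the accessions are hypothetical, and it does not describe how UniParc is actually implemented.

```python
from collections import defaultdict


def build_archive(records):
    """Collapse (source_db, accession, sequence) tuples into one entry
    per unique sequence, keeping every cross-reference (hypothetical helper)."""
    archive = defaultdict(set)
    for source_db, accession, sequence in records:
        # Normalize case and whitespace so identical sequences compare equal.
        key = "".join(sequence.split()).upper()
        archive[key].add((source_db, accession))
    return dict(archive)


# Made-up records: two sources hold the same sequence under different IDs.
records = [
    ("DB_A", "A0001", "MKTAYIAKQR"),
    ("DB_B", "B0042", "mktayiakqr"),
    ("DB_A", "A0002", "MSLLTEVETY"),
]

archive = build_archive(records)
for sequence, xrefs in archive.items():
    print(sequence, sorted(xrefs))
```

Keying on the exact sequence merges only identical entries. Grouping closely related (non-identical) sequences, as UniRef does, would need a similarity-based clustering step that this sketch does not attempt.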
Protein Structure Databases
Entrez Structures (The Molecular Modeling Database (MMDB) at NCBI)
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Structure
MMDB contains 3D protein structures obtained primarily from X-ray crystallography
and NMR spectroscopy. The data is a subset of the PDB database. Structure information
is linked to the rest of the NCBI databases, including sequences, citations,
and taxonomic classifications.
Enzyme Structures Database (EC-PDB)
http://www.ebi.ac.uk/thornton-srv/databases/enzymes/
The database contains known enzyme structures that have been deposited in
the Protein Data Bank (PDB). Enzyme structures are classified by their E.C.
number and the database is searchable by E.C. number or keywords.
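Because E.C. numbers are hierarchical (up to four dot-separated levels), a search by E.C. number can be treated as a prefix match on those levels. The sketch below illustrates that idea only; the function names and the structure identifiers are hypothetical placeholders, and nothing here reflects how the EC-PDB site itself is implemented.

```python
def ec_levels(ec):
    """Split an E.C. number such as '3.4.21.4' into a tuple of its levels."""
    parts = ec.strip().split(".")
    if not 1 <= len(parts) <= 4 or not all(parts):
        raise ValueError(f"not a valid E.C. number: {ec!r}")
    return tuple(parts)


def falls_under(ec, query):
    """True if `ec` belongs to the class named by `query`.

    `query` may be a partial E.C. number, e.g. '3.4' for all peptidases.
    """
    q = ec_levels(query)
    return ec_levels(ec)[:len(q)] == q


def search_by_ec(index, query):
    """Return the identifiers in `index` whose E.C. number falls under `query`."""
    return sorted(sid for sid, ec in index.items() if falls_under(ec, query))


# Placeholder identifiers mapped to E.C. numbers (illustrative data only).
structures = {
    "struct_a": "3.4.21.4",
    "struct_b": "3.4.24.27",
    "struct_c": "1.1.1.1",
}

print(search_by_ec(structures, "3.4"))
```

Comparing level by level, rather than with a plain string prefix, avoids false matches such as '3.4.2' matching '3.4.21.4'.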
Macromolecular Structure Database (MSD)
http://www.ebi.ac.uk/msd
MSD is a member of the wwPDB consortium. The wwPDB is entrusted with the creation
of a central repository for the collection, management, and distribution of
information about macromolecular structures. MSD is involved in several
projects in which it helps develop tools and systems for ensuring the quality
of structure data deposited in databases. Examples include work
with the Electron Microscopy Database (EMDB) and the Collaborative
Computing Project for the NMR community (CCPN).
ModBase (UCSF)
http://modbase.compbio.ucsf.edu/modbase-cgi-new/search_form.cgi
ModBase is a database of computationally derived protein structure models calculated by
comparative modeling; structure-related information is included for
all sequences related to a known structure. In addition to protein structure
models, ModBase contains information about putative ligand binding sites,
protein-protein interactions, and SNP annotation.
Protein Data Bank
http://www.rcsb.org/pdb/Welcome.do
The PDB is the primary archive of experimentally determined three-dimensional structures
of proteins and protein complexes. Each structure is presented in
two forms: atomic coordinates and annotations. Entries include links to
sequence databases. PDB is a member of the wwPDB, whose mission is to maintain
a single, publicly available Protein Data Bank Archive of macromolecular structural
data.