Nucleic Acid Sequences:
DNA and RNA
Note: These web sites are primarily large collections
or major repositories. For a listing of genome and sequence information for
specific organisms or groups, such as families, please go to the Organisms page.
Other sequence resources are available on the Genomes page.
Entrez Nucleotides (GenBank; NCBI)
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Nucleotide
Entrez
Nucleotides is a collection of sequences from several sources, including GenBank,
RefSeq, and PDB. GenBank is an annotated collection of all publicly available
nucleotide sequences and their protein translations. GenBank contains direct
submissions from individual laboratories and bulk submissions from large-scale
sequencing centers.
- GenBank at NCBI is part of an international collaboration
with the European Molecular Biology Laboratory (EMBL) Data Library (http://www.ebi.ac.uk/embl/)
from the European Bioinformatics Institute (EBI) and the DNA Data Bank of
Japan (DDBJ; http://www.ddbj.nig.ac.jp/).
Each group collects a part of the total sequence data reported worldwide
and all new and updated entries are exchanged between the groups
daily.
- International Nucleotide Sequence Database Collaboration
http://www.ncbi.nlm.nih.gov/projects/collab/
GenBank,
along with partners DDBJ and EMBL, has launched www.insdc.org,
a site with information about the aims, policies, and projects of this collaboration.
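As noted above, GenBank pairs nucleotide sequences with their protein translations. The following is a minimal sketch of what such a translation involves, assuming the standard genetic code and a reading frame that starts at the first base. The names `CODON_TABLE` and `translate` are hypothetical and are not part of any NCBI tool. Real records specify the coding region and translation table explicitly.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence):
    """Translate a DNA or RNA string in frame 0, stopping at the first stop codon.

    Codons containing ambiguous bases are reported as 'X'.
    """
    seq = sequence.upper().replace("U", "T")  # treat RNA as DNA
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":  # stop codon ends the translation
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCTAA"))
```

The sketch ignores alternative genetic codes, alternative start codons, and partial coding regions, all of which matter for real database records.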
EMBL Nucleotide Sequence Database
http://www.ebi.ac.uk/embl/index.html
EMBL
is Europe's primary nucleotide sequence resource and is one of three partners
maintaining comprehensive publicly available sequence databases. The main sources
for DNA and RNA sequences are direct submissions from individual researchers,
genome sequencing projects, and patent applications.
RefSeq (The Reference Sequence Collection; NCBI)
http://www.ncbi.nlm.nih.gov/RefSeq/
RefSeq provides a biologically non-redundant collection of DNA, RNA, and
protein sequences. Each sequence represents a single molecule from a specific
organism. Reference Sequences are manually curated and annotated to provide a synthesis
of information (and sequence records), similar to a review article in the literature.
Each molecule is annotated with the organism name, gene symbol, informative protein
names when possible, and links to relevant information in other NCBI databases.
DoTS (Database Of Transcribed Sequences)
http://www.allgenes.org/index.html
DoTS
is a sequence index created from all publicly available human and mouse transcript
sequences. The project’s purpose is to integrate various types
of data (e.g., EST sequences, genomic sequence, expression data, functional
annotation) in a structured manner to facilitate access. Input sequences are
clustered and assembled to form DoTS Consensus Transcripts. The Transcripts
are extensively annotated and a significant number have been manually curated.
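To give a rough feel for the clustering step described above, here is a toy sketch that groups sequences sharing at least one exact k-mer, using a union-find structure. This is an illustrative assumption only and is not the DoTS method. The function name `cluster_by_shared_kmers` and the parameter `k` are hypothetical, and real transcript clustering also handles reverse complements, sequencing errors, and repeats.

```python
def cluster_by_shared_kmers(sequences, k=8):
    """Group sequence indices that share at least one exact k-mer (toy example)."""
    parent = list(range(len(sequences)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # k-mer -> index of the first sequence containing it
    for idx, seq in enumerate(sequences):
        for pos in range(len(seq) - k + 1):
            kmer = seq[pos:pos + k]
            if kmer in first_seen:
                root_a, root_b = find(idx), find(first_seen[kmer])
                if root_a != root_b:
                    parent[root_a] = root_b  # merge the two clusters
            else:
                first_seen[kmer] = idx

    clusters = {}
    for idx in range(len(sequences)):
        clusters.setdefault(find(idx), []).append(idx)
    return sorted(clusters.values())


reads = ["ACGTACGTAA", "GTACGTAATT", "CCCCCCCCCC"]
print(cluster_by_shared_kmers(reads, k=6))
```

In this example the first two reads share the 6-mer `GTACGT` and fall into one cluster, while the third stands alone. An assembly step would then merge each cluster into a consensus sequence, which this sketch does not attempt.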