NCSU Libraries
BIT 410: Manipulation of Recombinant DNA

Searching GenBank and PubMed

The Databases | Basic Searching | Results Page Features | Improving Searches | GenBank Record | Resources & Assistance

Both PubMed and GenBank are searched using Entrez, the National Library of Medicine’s integrated cross-database search system. Because searching the two databases is so similar, this guide addresses them together. Where they differ, a table compares the two systems.

1. Find the Databases:

Start all your searches at the NCBI website (part of the NLM): http://www.ncbi.nlm.nih.gov/

GenBank:

Choose Nucleotide from the drop-down Search menu.
  • Using Entrez Nucleotide will limit your search to several sequence databases, including GenBank, its international partners EMBL and DDBJ, and a few other sources, such as patents.
  • Generally, for this assignment, searching Entrez Nucleotide is fine, but you can use limits (described below) so that you are ONLY searching GenBank, or an even more focused database, such as RefSeq.

PubMed:

Choose PubMed from the Search menu (this is the default).
  • PubMed indexes journal articles in the medical and allied health literature, including preclinical sciences.
  • PubMed uses a sophisticated controlled vocabulary called MeSH. You can search without using MeSH, but if you are having problems getting your search to be specific enough, please contact me for help, or explore MeSH yourself at: http://www.nlm.nih.gov/mesh/meshhome.html
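
For readers who want to script the same choice of database, the following is a minimal sketch, not part of the original assignment. It assumes NCBI's E-utilities "esearch" endpoint and the database names "nuccore" (for Nucleotide) and "pubmed"; neither is described in this guide, and the function name esearch_url is hypothetical. The code only builds a URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumption: the NCBI E-utilities "esearch" endpoint (not covered in this guide).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Assumed E-utilities database names for the two Search-menu choices.
ENTREZ_DB = {"Nucleotide": "nuccore", "PubMed": "pubmed"}


def esearch_url(menu_choice, term, retmax=20):
    """Build (but do not send) a search URL for one Search-menu choice."""
    params = {"db": ENTREZ_DB[menu_choice], "term": term, "retmax": retmax}
    # urlencode escapes spaces and quotes so the term is safe inside a URL.
    return ESEARCH_URL + "?" + urlencode(params)


print(esearch_url("Nucleotide", "clock gene"))
print(esearch_url("PubMed", "ovarian cancer"))
```

Pasting either printed URL into a browser should return the matching record IDs as XML, assuming the endpoint behaves as described above.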


Note: You can also search all the Entrez databases at once, and then look at the results only in your databases of interest. From the PubMed home page, click on “all databases” on the left side of the black bar near the top of the page. Enter your search in the box, and the number of results will be shown next to each database. Click on the name of a database to see its results.


2. Ways to Search the Databases

To start searching, simply type in a term or combination of terms, such as clock gene or ovarian cancer. The database automatically inserts an “AND” between your terms unless you use quotes to designate a specific phrase, e.g., “ovarian cancer”.

A very specific way to search is to use the Gene Symbol—if you know it.

Numerous databases are available online for looking up gene symbols. These are usually arranged by organism, e.g., human, rat, mouse, etc.
To locate a database, just go to Google (www.google.com) and type in: gene symbol database
HUGO is a human gene symbol database: http://www.gene.ucl.ac.uk/nomenclature/

Other Important Search Information:

  • Boolean operators (AND, OR, NOT)—must be in uppercase
  • The truncation symbol is an asterisk (*)—used for variant word endings, and singulars and plurals
  • Use “quotes” for an exact phrase
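
The three rules above can be sketched as small helper functions that assemble a search string to paste into the search box. This is an illustration only: the helper names are hypothetical, the example terms (mutat, BRCA1) are made up, and the use of parentheses to group terms is an assumption that this guide does not cover. Note that the search box expects plain straight quotes.

```python
def phrase(text):
    """Wrap a multi-word phrase in double quotes so it is matched exactly."""
    return '"' + text + '"'


def truncate(stem):
    """Append the asterisk truncation symbol to a word stem."""
    return stem + "*"


def combine(operator, *terms):
    """Join terms with a Boolean operator, forcing the operator to uppercase."""
    op = operator.upper()
    if op not in {"AND", "OR", "NOT"}:
        raise ValueError("operator must be AND, OR, or NOT")
    # Assumption: parentheses group terms so the operators apply as intended.
    return "(" + f" {op} ".join(terms) + ")"


query = combine(
    "and",
    phrase("ovarian cancer"),
    combine("or", truncate("mutat"), "BRCA1"),
)
print(query)  # ("ovarian cancer" AND (mutat* OR BRCA1))
```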


3. Looking at your Results I: Special Features

Tabs—If you look at the gray bar immediately above your search results, you will see some tabs. This is a new Entrez feature and lets you select a predefined category of results to look at.

Tabs in Entrez Nucleotide:
  • All (all sequences in your results)
  • Bacteria
  • mRNA
  • RefSeq

Tabs in PubMed:
  • All (all articles in your results)
  • Review (for review articles)

Links—there is also a “links” hyperlink associated with each record—you can access this from the results list, or from the page with the individual record. Links is on the right side of the page and lets you link to related information within the Entrez system.

  • Click on Links to see what related information is available for your record. The number of available links varies quite a bit.
  • Some potentially useful links include OMIM (Online Mendelian Inheritance in Man), PubMed (articles relating to the gene), and several others.

Links in Entrez Nucleotide:
  • Gene
  • Protein
  • PubMed
  • OMIM
  • Related Sequences
  • SNPs
  • Taxonomy
  • Many others when available

Links in PubMed:
  • Nucleotide
  • Protein
  • OMIM
  • Others when available

However, there seem to be far fewer links from PubMed to Nucleotide than the reverse.

It is highly recommended to start your search in Nucleotide and use the Links feature to go to PubMed records.


4. Looking at your Results II: Focusing your Search

If you have too many results, or too many irrelevant results, you can focus your search in several ways. Begin by clicking on the first tab, “Limits”, located just below the search box.

Field Searching

Each GenBank and PubMed record has many sections that contain specific types of information. These are called fields (examples of fields are author and abstract in PubMed; gene name and sequence in GenBank).

One way to focus your search is to look for relevant terms only in certain fields. The Entrez limits page has a drop-down menu for selecting a field. You can also put the field you want to search in square brackets following your search term, e.g., Marriott[AUTH] for all research authored by Marriott. This is especially useful if you want to search several fields.

Fields available in both GenBank and PubMed:

  • accession number[ACCN], gene name[GENE], organism[ORGN], properties[PROP], protein name[PROT]
  • fields related to a publication, including author[AUTH], journal name[JOUR], and publication date[PDAT]
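
As a sketch of the bracket syntax described above, the helper below attaches one of the listed field tags to a search term and joins several tagged terms with AND. The function names and the example terms (clock, Mus musculus) are hypothetical, and quoting multi-word terms before the tag is an assumption rather than something this guide states.

```python
# Field tags listed in this guide, keyed by the guide's own descriptions.
FIELD_TAGS = {
    "accession number": "ACCN",
    "gene name": "GENE",
    "organism": "ORGN",
    "properties": "PROP",
    "protein name": "PROT",
    "author": "AUTH",
    "journal name": "JOUR",
    "publication date": "PDAT",
}


def tagged(term, field):
    """Return the term followed by its field tag in square brackets."""
    tag = FIELD_TAGS[field]
    if " " in term:
        # Assumption: multi-word terms are quoted so they stay together.
        term = f'"{term}"'
    return f"{term}[{tag}]"


def fielded_query(*pairs):
    """Combine several (term, field) pairs with uppercase AND."""
    return " AND ".join(tagged(term, field) for term, field in pairs)


print(tagged("Marriott", "author"))  # Marriott[AUTH]
print(fielded_query(("clock", "gene name"), ("Mus musculus", "organism")))
# clock[GENE] AND "Mus musculus"[ORGN]
```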

Fields in GenBank: the title[TITL] field of a record can be a very useful field to search as it includes the organism, product, gene symbol, molecule type, and indicates if the record contains a partial or complete CDS (coding sequence).

Fields in PubMed: the specialized subject terminology that PubMed uses (MeSH, for Medical Subject Headings) can also help you make your search very specific. For more information about MeSH, see: http://www.nlm.nih.gov/mesh/meshhome.html

Exclusions (in GenBank only)

Just below the Fields dropdown menu is a selection of data types. By checking the appropriate box(es), you can exclude any or all of these data types, such as patents or ESTs, from your GenBank search.

Search Limits

Several dropdown menus are next on the limits page. They let you limit your search, for example, to a specific type of molecule (genomic DNA, mRNA, protein) or a specific type of document (clinical trial, review, etc.).
Many of the limits differ between GenBank and PubMed—some of the more useful limits for each database are listed in this table:

Limits in Entrez Nucleotide:
  • Type of molecule: genomic DNA, mRNA, protein
  • Gene location: nucleus, mitochondria, chloroplast
  • Database you want to search (‘only from’): GenBank, RefSeq, etc.

Limits in PubMed:
  • Publication type: review, meta-analysis, clinical trial
  • Human or animal studies: either one or both
  • Language
  • Gender/age group (for human studies)
  • Publication date

For more information on Fields and Limits in GenBank and PubMed, see this Entrez Help Document: http://www.ncbi.nlm.nih.gov/entrez/query/static/help/Summary_Matrices.html


5. Interpreting a GenBank Record

Sometimes a GenBank record can be confusing. The NCBI web site has a sample GenBank record with hyperlinks that take you to a description of each element. Find it at: http://www.ncbi.nih.gov/Sitemap/samplerecord.html


6. Resources and Assistance

This guide has only introduced the basics of Entrez searching. There are numerous other useful features available. Many helpful guides and resources are available on the NCBI web site; these links are good places to start:
