Databases available for MAST search

MAST -- Motif Alignment and Search Tool



Sequence databases available for MAST search


The sequence databases that MAST can search are grouped into three categories:


Assorted Databases

     Various peptide and nucleotide databases, including those searchable by NCBI BLAST.

alu
Select Alu repeats from REPBASE, suitable for masking Alu repeats from query sequences. It is available by anonymous FTP from ncbi.nlm.nih.gov (under the /pub/jmc/alu directory). See "Alu alert" by Claverie and Makalowski, Nature vol. 371, page 752 (1994).

C. elegans (coding)
Wormpep: predicted proteins from the Caenorhabditis elegans genome sequencing project; see http://www.sanger.ac.uk/Projects/C_elegans/wormpep for more information.

C. elegans - coding
The DNA sequences from which the Wormpep protein sequences are derived (effectively the cDNA sequences, but with no UTRs); see http://www.sanger.ac.uk/Projects/C_elegans/wormpep for more information.

D. discoideum
Protein and nucleotide Dictyostelium discoideum databases

Drosophila
Drosophila genome proteins and nucleotides provided by Celera and the Berkeley Drosophila Genome Project (BDGP).

E. coli
Complete E. coli genome proteins and nucleotides from NCBI BLAST databases.

epd
Eucaryotic Promoter Database found on the web at http://www.genome.ad.jp/dbget-bin/www_bfind?epd

est
Non-redundant Database of GenBank+EMBL+DDBJ EST Divisions

mouse ESTs

human ESTs

other ESTs

genpept
GENPEPT peptide database

gss
Genome Survey Sequence, includes single-pass genomic data, exon-trapped sequences, and Alu PCR sequences.

htgs
High Throughput Genomic Sequences

kabat
Kabat's database of peptide and nucleotide sequences of immunological interest.

mito
Database of mitochondrial sequences

month
Peptide: All new or revised GenBank CDS translations+PDB+SwissProt+PIR released in the last 30 days.
Nucleotide: All new or revised GenBank+EMBL+DDBJ+PDB sequences released in the last 30 days.

nr
Peptide: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR
Nucleotide: All non-redundant GenBank+EMBL+DDBJ+PDB sequences (but no ESTs or STSs)

pdb
Sequences (peptide and nucleotide) derived from the 3-dimensional structures in the Protein Data Bank.

sts
Non-redundant Database of GenBank+EMBL+DDBJ STS Divisions

S. cerevisiae (yeast)
Yeast (Saccharomyces cerevisiae) peptide and nucleotide sequences.

swissprot
The last major release of the SWISS-PROT peptide sequence database (no updates).

vector
Vector subset of GenBank(R), NCBI, in ftp://ncbi.nlm.nih.gov/blast/db/.

Genbank Single Organism Databases

     Single-organism peptide (*.faa files) and nucleotide (*.fna files) sequences from Genbank.

Aeropyrum pernix K1
Peptide and nucleotide sequences from Genbank for Aeropyrum pernix K1.

Archaeoglobus fulgidus
Peptide and nucleotide sequences from Genbank for Archaeoglobus fulgidus.

Aquifex aeolicus
Peptide and nucleotide sequences from Genbank for Aquifex aeolicus.

Aquifex aeolicus ece1
Peptide and nucleotide sequences from Genbank for Aquifex aeolicus ece1.

Borrelia burgdorferi
Peptide and nucleotide sequences from Genbank for Borrelia burgdorferi.

Borrelia burgdorferi 11 plasmids
Peptide and nucleotide sequences from Genbank for Borrelia burgdorferi 11 plasmids.

Bacillus subtilis
Peptide and nucleotide sequences from Genbank for Bacillus subtilis.

Chlamydia trachomatis
Peptide and nucleotide sequences from Genbank for Chlamydia trachomatis.

Chlamydia muridarum
Peptide and nucleotide sequences from Genbank for Chlamydia muridarum.

Chlamydia pneumoniae
Peptide and nucleotide sequences from Genbank for Chlamydia pneumoniae.

Chlamydophila pneumoniae AR39
Peptide and nucleotide sequences from Genbank for Chlamydophila pneumoniae AR39.

Chlamydophila pneumoniae J138
Peptide and nucleotide sequences from Genbank for Chlamydophila pneumoniae J138.

Deinococcus radiodurans R1 chromosome 1
Peptide and nucleotide sequences from Genbank for Deinococcus radiodurans R1 chromosome 1.

Escherichia coli
Peptide and nucleotide sequences from Genbank for Escherichia coli.

Haemophilus influenzae Rd
Peptide and nucleotide sequences from Genbank for Haemophilus influenzae Rd.

Helicobacter pylori 26695
Peptide and nucleotide sequences from Genbank for Helicobacter pylori 26695.

Helicobacter pylori strain J99
Peptide and nucleotide sequences from Genbank for Helicobacter pylori strain J99.

Mycoplasma genitalium
Peptide and nucleotide sequences from Genbank for Mycoplasma genitalium.

Methanococcus jannaschii
Peptide and nucleotide sequences from Genbank for Methanococcus jannaschii.

Methanococcus jannaschii large extrachromosomal element
Peptide and nucleotide sequences from Genbank for Methanococcus jannaschii large extrachromosomal element.

Methanococcus jannaschii small extrachromosomal element
Peptide and nucleotide sequences from Genbank for Methanococcus jannaschii small extrachromosomal element.

Mycoplasma pneumoniae
Peptide and nucleotide sequences from Genbank for Mycoplasma pneumoniae.

Methanobacterium thermoautotrophicum
Peptide and nucleotide sequences from Genbank for Methanobacterium thermoautotrophicum.

Mycobacterium tuberculosis H37Rv
Peptide and nucleotide sequences from Genbank for Mycobacterium tuberculosis H37Rv.

Neisseria meningitidis serogroup B strain MC58
Peptide and nucleotide sequences from Genbank for Neisseria meningitidis serogroup B strain MC58.

Neisseria meningitidis serogroup A strain Z2491
Peptide and nucleotide sequences from Genbank for Neisseria meningitidis serogroup A strain Z2491.

Pyrococcus abyssi
Peptide and nucleotide sequences from Genbank for Pyrococcus abyssi.

Pyrococcus horikoshii
Peptide and nucleotide sequences from Genbank for Pyrococcus horikoshii.

Rhizobium sp. NGR234 complete plasmid sequence
Peptide and nucleotide sequences from Genbank for Rhizobium sp. NGR234 complete plasmid sequence.

Rickettsia prowazekii strain Madrid E
Peptide and nucleotide sequences from Genbank for Rickettsia prowazekii strain Madrid E.

Synechocystis PCC6803
Peptide and nucleotide sequences from Genbank for Synechocystis PCC6803.

Thermotoga maritima
Peptide and nucleotide sequences from Genbank for Thermotoga maritima.

Treponema pallidum
Peptide and nucleotide sequences from Genbank for Treponema pallidum.

Ureaplasma urealyticum
Peptide and nucleotide sequences from Genbank for Ureaplasma urealyticum.

Xylella fastidiosa
Peptide and nucleotide sequences from Genbank for Xylella fastidiosa.

Upstream Sequence Databases

     Nucleotide sequences located upstream from the coding region of a gene. Each database contains one sequence for each known gene in a particular organism. The origin is at the start codon. The given range of sequence was extracted using the "retrieve sequence" tool. Negative numbers refer to base pairs upstream of the origin, positive numbers to base pairs downstream from the origin.

B. subtilis (upstream)
Sequence in the range of -500 to +50 relative to the start codon of each gene.

E. coli (upstream)
Sequence in the range of -500 to +50 relative to the start codon of each gene.

E. coli (upstream)
Sequence in the range of -950 to +50 relative to the start codon of each gene.
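
The coordinate convention used by the upstream databases can be illustrated with a short Python sketch. This is not the "retrieve sequence" tool itself; the function name extract_window is hypothetical, and the sketch rests on assumptions that this page does not state: the gene lies on the forward strand, there is no position 0 (so -1 is the base just before the start codon and +1 is the first base of the start codon), and a window that runs past either end of the sequence is simply clipped.

```python
def extract_window(genome: str, start_codon_index: int,
                   first: int = -500, last: int = 50) -> str:
    """Return the bases from `first` to `last` relative to a start codon.

    Assumptions (not taken from the MAST documentation):
      * `start_codon_index` is the 0-based index of the first base of the
        start codon on the forward strand;
      * there is no position 0: -1 is the base just upstream of the start
        codon and +1 is the first base of the start codon;
      * the window is clipped if it runs off either end of the sequence.
    """
    if first >= 0 or last <= 0:
        raise ValueError("expected a negative `first` and a positive `last`")
    # Negative offsets step back from the start codon; clip at the sequence start.
    begin = max(0, start_codon_index + first)
    # Positive offsets count bases starting at the start codon; clip at the end.
    end = min(len(genome), start_codon_index + last)
    return genome[begin:end]


if __name__ == "__main__":
    # Toy sequence: 10 bases, then the start codon ATG at index 10, then GGG.
    toy = "AAAAAACCCCATGGGG"
    # 4 bases upstream plus the 3 bases of the start codon.
    print(extract_window(toy, 10, first=-4, last=3))
```

Under these assumptions, a range of -500 to +50 yields at most 550 bases per gene and a range of -950 to +50 yields at most 1000, with shorter sequences for genes that lie close to the start of the source sequence.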