Fasta format
Sequences in FASTA-formatted files are preceded by a line starting with >. The first word on this line is the name of the sequence, and its first character must be a digit or a letter. The rest of the line is a description of the sequence. The remaining lines contain the sequence itself. Blank lines in a FASTA file are ignored, and so are spaces and other gap symbols (dashes, underscores, periods) in a sequence.
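The rules above can be sketched as a small parser. This is a minimal illustration, not the server's own reader: the function name `parse_fasta` and the choice to raise `ValueError` on an invalid name are assumptions made for the example.

```python
def parse_fasta(text):
    """Return a list of (name, description, sequence) tuples.

    Hypothetical helper illustrating the FASTA rules described above.
    """
    gap_chars = " \t-_."  # spaces and gap symbols dropped from sequences
    records = []
    name = description = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines are ignored
        if line.startswith(">"):
            if name is not None:
                records.append((name, description, "".join(chunks)))
            parts = line[1:].split(None, 1)
            name = parts[0] if parts else ""
            description = parts[1] if len(parts) > 1 else ""
            # Assumption: an invalid name is reported as an error.
            if not name[:1].isalnum():
                raise ValueError("name must start with a letter or digit: %r" % line)
            chunks = []
        elif name is not None:
            chunks.append("".join(c for c in line if c not in gap_chars))
    if name is not None:
        records.append((name, description, "".join(chunks)))
    return records
```

Calling `parse_fasta(">seq1 human kinase\nACD-EF G\n")` yields one record whose name is `seq1`, whose description is `human kinase`, and whose sequence has the dash and the space removed.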
Clustal format
Clustal format files contain the word clustal at the beginning of the file.
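A quick way to check for this marker is to look at the first non-blank line. The sketch below is only an illustration; the function name `looks_like_clustal` is hypothetical, and the case-insensitive comparison is an assumption.

```python
def looks_like_clustal(text):
    """Return True if the first non-blank line starts with 'clustal'.

    Hypothetical helper; the comparison ignores case by assumption.
    """
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            return stripped.lower().startswith("clustal")
    return False  # empty input has no header line
```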
Msf format
msf-formatted multiple sequence files are most often
created by programs of the GCG suite. An msf file
includes the name of each sequence and the sequence itself, which is
usually aligned with the other sequences in the file. An msf
file can contain a single sequence or many sequences.
An example of part of an msf file, created using the GCG multiple sequence alignment program:
Some of the hallmarks of an msf-formatted sequence are the
same as those of a single-sequence GCG format file:
Visualization of PFAM motifs on T-COFFEE alignments
METHOD:
Every hit to a PFAM motif (E-value < 0.1) will be mapped onto the multiple
alignment from T-COFFEE using a unique color (green, red, yellow, blue, ...).
Such a hit corresponds to an alignment between the sequence and the PFAM motif
(see HMMOUT files that are also available on the T-COFFEE site), where
exactly and weakly conserved residues, as well as dominant residues of the
motif are indicated.
We map the information from this alignment (HMMOUT) onto the T-COFFEE
alignment in the following manner:
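One generic way to carry per-residue hit information over to a gapped alignment is to translate ungapped sequence positions into alignment columns. The sketch below illustrates only that idea and is not the server's own procedure. Everything in it is an assumption made for the example: the function names, the `(motif, start, end, evalue)` hit tuples with 1-based inclusive coordinates, one color per motif name, and a palette that cycles when motifs outnumber colors.

```python
def residue_to_column(aligned_seq, gap_chars="-."):
    """List whose k-th entry is the alignment column of the k-th residue."""
    return [col for col, ch in enumerate(aligned_seq) if ch not in gap_chars]


def paint_hits(aligned_seq, hits, max_evalue=0.1,
               palette=("green", "red", "yellow", "blue")):
    """Return one color (or None) per alignment column.

    hits: iterable of (motif, start, end, evalue); start and end are
    assumed to be 1-based, inclusive positions on the ungapped sequence.
    """
    columns = residue_to_column(aligned_seq)
    colors = [None] * len(aligned_seq)
    motif_color = {}
    for motif, start, end, evalue in hits:
        if evalue >= max_evalue:
            continue  # keep only hits with E-value below the threshold
        if motif not in motif_color:
            # Assumption: colors are reused once the palette is exhausted.
            motif_color[motif] = palette[len(motif_color) % len(palette)]
        for pos in range(start - 1, end):
            colors[columns[pos]] = motif_color[motif]
    return colors
```

Gap columns are left uncolored here; whether gaps inside a hit should inherit the hit's color is a display choice this sketch does not settle.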
INTERPRETATION:
The COFFEE score reflects the level of consistency between a multiple sequence alignment and a library containing pairwise alignments of the same sequences.
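To make the notion of consistency concrete, the sketch below computes the fraction of residue pairs aligned in a multiple alignment that also appear in a library of pairwise alignments. It is an unweighted simplification for illustration and is not claimed to reproduce the score the server reports. The function names and the data layout (a dict of aligned rows, and a library keyed by sorted name pairs holding sets of 0-based residue index pairs) are assumptions.

```python
from itertools import combinations


def aligned_pairs(row_a, row_b, gap="-"):
    """Set of (i, j) residue index pairs placed in the same column."""
    pairs = set()
    i = j = 0
    for a, b in zip(row_a, row_b):
        if a != gap and b != gap:
            pairs.add((i, j))
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return pairs


def consistency(msa, library):
    """Fraction of MSA residue pairs that are also found in the library.

    msa: dict mapping sequence name to its aligned row.
    library: dict mapping (name_a, name_b), names in sorted order,
             to a set of (i, j) residue index pairs.
    """
    shared = total = 0
    for a, b in combinations(sorted(msa), 2):
        in_msa = aligned_pairs(msa[a], msa[b])
        in_lib = library.get((a, b), set())
        shared += len(in_msa & in_lib)
        total += len(in_msa)
    return shared / total if total else 0.0
```

A value of 1.0 means every residue pair aligned in the multiple alignment is supported by the library; lower values mean the alignment departs from the pairwise evidence.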