This is a method for searching and aligning against databases of consensus sequences. It was developed by Dariusz Przybylski in the Rost Group at Columbia University, New York.

Publication abstract: Sequence alignments may be the most fundamental computational resource for molecular biology. The best methods that identify relatedness through profile-profile comparisons are much slower and more complex than sequence-sequence and sequence-profile comparisons such as, respectively, BLAST and PSI-BLAST. Families of related genes and gene products (proteins) can be represented by consensus sequences that list the nucleic/amino acid most frequent at each sequence position in that family. Here we proposed a novel approach for consensus sequence-based comparisons. This approach improved searches and alignments as a standard add-on to PSI-BLAST without any changes of code. Improvements were particularly significant for more difficult tasks such as the identification and alignment of distant structural relations between proteins. Despite the fact that the improvements were higher for more divergent relations, they were consistent even at high accuracy/low error rates for non-trivially related proteins. The improvements were very easy to achieve: No parameter used by PSI-BLAST was altered and no single line of code changed. On top the consensus sequence add-on required relatively little additional CPU time. Thus, advanced users of PSI-BLAST can immediately benefit from using consensus sequences on their local computers. In addition we made the method available through the Internet.
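The abstract describes a consensus sequence as listing the most frequent residue at each position of a family. A minimal Python sketch of that plain majority rule follows. It assumes the family is already given as equal-length aligned strings with `-` marking gaps; the function name `majority_consensus` and the toy family are hypothetical, and this is not the construction used for the sequences served from this site.

```python
from collections import Counter


def majority_consensus(aligned, gap="-"):
    """Return the most frequent non-gap residue of each alignment column."""
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*aligned):
        # Count residues in this column, ignoring gap characters.
        counts = Counter(res for res in column if res != gap)
        if not counts:
            continue  # column holds only gaps: emit nothing for it
        # Ties are resolved in favour of the residue seen first in the column.
        consensus.append(counts.most_common(1)[0][0])
    return "".join(consensus)


# Toy aligned family (made-up sequences, for illustration only).
family = ["MKTAYIA", "MKSAYLA", "MRTAY-A", "MKTGYIA"]
print(majority_consensus(family))
```

The resulting string can be handled like any other protein sequence, which is consistent with the abstract's point that no PSI-BLAST code needs to change.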

PLEASE NOTE: in general, the amino acid composition of consensus sequences differs from that of the original sequences. Consequently, the statistical significance of alignment scores reported by PSI-BLAST may well be incorrect, even though the relative ordering of scores seems to be very good. Therefore, the consensus sequences provided here have, on average, a composition SIMILAR to that of the original sequences; their construction differs from the one described in our first paper.
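The composition shift mentioned in the note can be illustrated with a small Python sketch. It compares residue frequencies of a toy family with those of a majority-rule consensus, using the sum of absolute frequency differences as an ad hoc measure. The helper names `composition` and `composition_distance`, the measure itself, and the sequences are assumptions made for illustration; they are not part of the published method.

```python
from collections import Counter


def composition(seqs, gap="-"):
    """Relative frequency of each residue over all sequences, gaps ignored."""
    counts = Counter("".join(seqs).replace(gap, ""))
    total = sum(counts.values())
    return {res: n / total for res, n in counts.items()}


def composition_distance(a, b):
    """Sum of absolute frequency differences between two compositions."""
    residues = set(a) | set(b)
    return sum(abs(a.get(r, 0.0) - b.get(r, 0.0)) for r in residues)


# Toy family and its majority-rule consensus (made-up data).
originals = ["AAL", "AVL", "AIL"]
consensus = "AAL"

distance = composition_distance(composition(originals), composition([consensus]))
print(round(distance, 3))
```

In this toy case the consensus over-represents the common residue A and drops V and I entirely, which is the kind of bias the note warns can distort reported significance values.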

From this site you can submit a query, view sample output, access the downloads, and consult the help page.

 

If you find this method useful for your research, please cite:

  • Przybylski D & Rost B (2007) Consensus sequences improve PSI-BLAST through mimicking profile-profile alignments. Nucleic Acids Res. 35(7):2238-46.
  • Przybylski D & Rost B (2008) Powerful fusion: PSI-BLAST and consensus sequences. Bioinformatics. 2008 Aug 4 [Epub ahead of print].
©2008 rostlab.org
1130 St. Nicholas Ave, 8th. floor - (212) 851-4669