Improving fold recognition without folds.

Title: Improving fold recognition without folds.
Publication Type: Journal Article
Year of Publication: 2004
Authors: Przybylski, D, Rost, B
Journal: J Mol Biol
Volume: 341
Issue: 1
Pagination: 255-69
Date Published: 2004 Jul 30
ISSN: 0022-2836
Keywords: Databases, Protein; Models, Molecular; Protein Folding; Proteins; Sequence Alignment; Sequence Analysis, Protein
Abstract

The most reliable way to align two proteins of unknown structure is through sequence-profile and profile-profile alignment methods. If the structure for one of the two is known, fold recognition methods outperform purely sequence-based alignments. Here, we introduced a novel method that aligns generalised sequence and predicted structure profiles. Using predicted 1D structure (secondary structure and solvent accessibility) significantly improved over sequence-only methods, both in terms of correctly recognising pairs of proteins with different sequences and similar structures and in terms of correctly aligning the pairs. The scores obtained by our generalised scoring matrix followed an extreme value distribution; this yielded accurate estimates of the statistical significance of our alignments. We found that mistakes in 1D structure predictions correlated between proteins from different sequence-structure families. The impact of this surprising result was that our method succeeded in significantly out-performing sequence-only methods even without explicitly using structural information from any of the two. Since AGAPE also outperformed established methods that rely on 3D information, we made it available through. If we solved the problem of CPU-time required to apply AGAPE on millions of proteins, our results could also impact everyday database searches.
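The abstract states that alignment scores followed an extreme value distribution, which is what makes significance estimates possible. The sketch below is only an illustration of that general idea, not the authors' implementation: it assumes a Gumbel (type I extreme value) form, and the names `fit_gumbel_moments`, `gumbel_p_value`, `mu`, and `lam` are hypothetical. The method-of-moments fit is a simple stand-in chosen here; the record does not say how the parameters were actually estimated.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(background_scores):
    """Estimate Gumbel location (mu) and decay rate (lam) by method of moments.

    Assumption: background_scores are alignment scores of unrelated pairs.
    For a Gumbel distribution, mean = mu + gamma / lam and
    variance = pi^2 / (6 * lam^2).
    """
    mean = statistics.mean(background_scores)
    std = statistics.stdev(background_scores)
    lam = math.pi / (std * math.sqrt(6.0))
    mu = mean - EULER_GAMMA / lam
    return mu, lam


def gumbel_p_value(score, mu, lam):
    """Probability that a background score is at least `score`.

    P(S >= x) = 1 - exp(-exp(-lam * (x - mu)))
    """
    z = lam * (score - mu)
    if z < -700.0:
        # exp(-z) would overflow; the probability is 1 to machine precision.
        return 1.0
    # -expm1(-t) equals 1 - exp(-t) and stays accurate when t is tiny.
    return -math.expm1(-math.exp(-z))
```

A score well above the fitted location gives a small p-value, so hits can be ranked by significance rather than by raw score.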

DOI: 10.1016/j.jmb.2004.05.041
Alternate Journal: J. Mol. Biol.
PubMed ID: 15312777
Grant List: 1-P50-GM62413-01 / GM / NIGMS NIH HHS / United States
R01-GM63029-01 / GM / NIGMS NIH HHS / United States