Redefining the goals of protein secondary structure prediction

Title: Redefining the goals of protein secondary structure prediction
Publication Type: Journal Article
Year of Publication: 1994
Authors: Rost, B, Sander, C, Schneider, R
Journal: J Mol Biol
Keywords: Amino Acid Sequence; Escherichia coli/enzymology; Lactobacillus/enzymology; Mathematics; Molecular Sequence Data; *Protein Folding; *Protein Structure, Secondary; Proteins/*chemistry; Tetrahydrofolate Dehydrogenase/chemistry

Secondary structure prediction recently has surpassed the 70% level of average accuracy, evaluated on the single residue states helix, strand and loop (Q3). But the ultimate goal is reliable prediction of tertiary (three-dimensional, 3D) structure, not 100% single residue accuracy for secondary structure. A comparison of pairs of structurally homologous proteins with divergent sequences reveals that considerable variation in the position and length of secondary structure segments can be accommodated within the same 3D fold. It is therefore sufficient to predict the approximate location of helix, strand, turn and loop segments, provided they are compatible with the formation of 3D structure. Accordingly, we define here a measure of segment overlap (Sov) that is somewhat insensitive to small variations in secondary structure assignments. The new segment overlap measure ranges from an ignorance level of 37% (random protein pairs) via a current level of 72% for a prediction method based on sequence profile input to neural networks (PHD) to an average 90% level for homologous protein pairs. We conclude that the highest scores one can reasonably expect for secondary structure prediction are a single residue accuracy of Q3 > 85% and a fractional segment overlap of Sov > 90%.
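
The two kinds of score contrasted in the abstract can be illustrated with a minimal Python sketch. It assumes that observed and predicted secondary structure are given as equal-length strings over the states H (helix), E (strand) and L (loop). The function names (q3, segments, simple_segment_overlap) are hypothetical. The segment score shown here is a simplified overlap ratio chosen for illustration, not the Sov definition from the paper: it includes no tolerance for small boundary shifts, which the published measure is designed to absorb.

```python
from itertools import groupby


def q3(observed: str, predicted: str) -> float:
    """Per-residue accuracy: percentage of positions where the states agree."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("state strings must be non-empty and of equal length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * matches / len(observed)


def segments(states: str):
    """Split a state string into (state, start, end) runs; end is exclusive."""
    out, pos = [], 0
    for state, run in groupby(states):
        n = len(list(run))
        out.append((state, pos, pos + n))
        pos += n
    return out


def simple_segment_overlap(observed: str, predicted: str) -> float:
    """Simplified segment overlap score (illustrative assumption, not Sov).

    Each pair of overlapping observed/predicted segments in the same state
    contributes (overlap length / total extent of the pair), weighted by the
    observed segment length. Observed segments with no same-state overlap
    contribute zero but still count in the normalization.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("state strings must be non-empty and of equal length")
    pred_segs = segments(predicted)
    total, norm = 0.0, 0
    for state, o_start, o_end in segments(observed):
        o_len = o_end - o_start
        found = False
        for p_state, p_start, p_end in pred_segs:
            if p_state != state:
                continue
            minov = min(o_end, p_end) - max(o_start, p_start)  # shared residues
            if minov <= 0:
                continue
            maxov = max(o_end, p_end) - min(o_start, p_start)  # extent of the pair
            total += minov / maxov * o_len
            norm += o_len
            found = True
        if not found:
            norm += o_len
    return 100.0 * total / norm


if __name__ == "__main__":
    obs = "HHHHHHLLLEEEE"
    pred = "LHHHHLLLLEEEE"
    print(f"Q3 = {q3(obs, pred):.1f}%")
    print(f"simplified segment overlap = {simple_segment_overlap(obs, pred):.1f}%")
```

In this toy pair the predicted helix is the right segment in the right place but one residue short at each end. Q3 counts only the mismatched residues, whereas a segment-based score judges each predicted segment against the observed segment it overlaps.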