Cell cycle kinases predicted from conserved biophysical properties.

Title: Cell cycle kinases predicted from conserved biophysical properties.
Publication Type: Journal Article
Year of Publication: 2009
Authors: Wrzeszczynski, KO; Rost, B
Date Published: 2009 Feb 15
Keywords: Amino Acid Motifs; Artificial Intelligence; Binding Sites; Cell Cycle; Cell Cycle Proteins; Conserved Sequence; Databases, Protein; Evolution, Molecular; Humans; Models, Molecular; Protein Conformation; Protein-Serine-Threonine Kinases

Machine-learning techniques can classify functionally related proteins where homology-transfer as well as sequence and structure motifs fail. Here, we present a method that aims to complement homology-transfer in identifying cell cycle control kinases from sequence alone. First, we identified functionally significant residues in cell cycle proteins through their high sequence conservation and biophysical properties. We then incorporated these residues and their features into support vector machines (SVMs) to identify new kinases and, more specifically, to differentiate cell cycle kinases from other kinases and from other proteins. As expected, the most informative residues tend to be highly conserved and to localize in the ATP binding regions of the kinases. We also confirmed that ATP binding regions are typically found not on the surface but in partially buried sites, and that accessibility predictions capture this correctly. Using these highly conserved, semi-buried residues and their biophysical properties, we could distinguish cell cycle S/T kinases from other kinase families at around 70-80% accuracy and 62-81% coverage. Applied to the entire human proteome, the method predicted at least 97 human proteins with limited previous annotation as candidate cell cycle kinases.
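The first step described above, selecting residues by high sequence conservation, can be illustrated with a minimal sketch. It assumes a Shannon-entropy conservation score over the columns of a multiple sequence alignment; the abstract does not state which conservation measure the authors used, and the function names, the 0.9 threshold, and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_conservation(alignment):
    """Return one conservation score in [0, 1] per alignment column.

    Assumption: conservation = 1 - (Shannon entropy / log2(20)).
    Gap characters ('-') are ignored; an all-gap column scores 0.0.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_entropy = math.log2(20)  # entropy of 20 equally likely amino acids
    scores = []
    for i in range(length):
        residues = [seq[i] for seq in alignment if seq[i] != "-"]
        if not residues:
            scores.append(0.0)
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


def conserved_positions(alignment, threshold=0.9):
    """Return 0-based column indices whose score meets the threshold."""
    scores = column_conservation(alignment)
    return [i for i, score in enumerate(scores) if score >= threshold]


# Made-up toy alignment, not data from the paper.
toy_alignment = ["GKGSFG", "GEGAFG", "GSGTYG", "GQG-FG"]
print(conserved_positions(toy_alignment))  # prints [0, 2, 5]
```

In a pipeline like the one the abstract outlines, the selected columns and their per-residue features (for example, predicted accessibility) would then be encoded as input vectors for an SVM; that encoding is not specified here and is left out of the sketch.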

Alternate Journal: Proteins
PubMed ID: 18704950
PubMed Central ID: PMC2629806
Grant List: R01 GM079767-01A1 / GM / NIGMS NIH HHS / United States
R01 LM007329-06 / LM / NLM NIH HHS / United States
R01-LM07329-01 / LM / NLM NIH HHS / United States
U54-GM074958-01 / GM / NIGMS NIH HHS / United States