Title: PROFbval: predict flexible and rigid residues in proteins
Author: Avner Schlessinger, Guy Yachdav, Burkhard Rost
Quote: Bioinformatics, 2006, 22(7), 891-3, Epub 2006

PROFbval: predict flexible and rigid residues in proteins

Avner Schlessinger 1,2,*, Guy Yachdav 1,2 and Burkhard Rost 1,2,3

1 Dept. of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168th Street, New York, NY 10032, USA
2 Columbia University Center for Computational Biology and Bioinformatics (C2B2), 1130 St. Nicholas Avenue Rm 802, New York, NY 10032, USA
3 North East Structural Genomics Consortium (NESG), Department of Biochemistry and Molecular Biophysics, Columbia University, 630 West 168th Street, New York, NY 10032, USA
* Corresponding author: schlessinger@rostlab.org URL http://www.rostlab.org/  Tel: +1-212-851-4669

This article is published in ( Bioinformatics, 2006, 22, 891-3 ) © copyright Oxford University Press (2006). OUP is the only authorized source. All copying of this article including placing on another website requires the written permission of the copyright owner.


Abstract

Summary: The mobility of a residue on the protein surface is closely linked to its function. The identification of extremely rigid or flexible surface residues can therefore contribute information crucial for solving the complex problem of identifying functionally important residues in proteins. Mobility is commonly measured by B-value data from high-resolution three-dimensional X-ray structures. Few methods predict B-values from sequence. Here, we present PROFbval, the first web server to predict normalized B-values from amino acid sequence. The server handles amino acid sequences (or alignments) as input and outputs normalized B-value and two-state (flexible/rigid) predictions. The server also assigns a reliability index for each prediction. For example, PROFbval correctly identifies residues in active sites on the surface of enzymes as particularly rigid.

Availability:   http://www.rostlab.org/services/profbval

Contact: profbval@rostlab.org

Key words: flexibility prediction, protein dynamics, protein motion, protein structure prediction, solvent accessibility, multiple alignments, secondary structure prediction, protein function prediction, enzyme active sites, conformational switch.

PAPER

Protein flexibility and rigidity linked to function. The function of a protein is often determined by the particular details of its native three-dimensional (3D) structure. These details may be relevant to rigid as well as to flexible regions. The importance of structural rigidity is illustrated by the 'tunnel' in beta-propeller folds that appears critical for ligand coordination and catalytic activity [1]. The importance of flexibility is illustrated by many biological processes, including molecular recognition and catalytic activity [2, 3, 4, 5, 6, 7, 8]; e.g., the flexibility of the switch II region in Ras is crucial for its GTPase activity [9]. Additionally, several groups have shown that motions occur in enzymes during catalysis and may contribute to it by lowering the transition state barrier [10]. Such motions even happen in the enzyme cyclophilin A in its free form [11]. Interestingly, these motions are not restricted to the active sites but involve many core residues, creating a wide dynamic network [10, 11].

Mobility and disorder. Over the last decade, evidence has accumulated that many proteins have regions that appear unstructured in isolation [6, 7, 8, 12, 13]. One hypothesis is that not adopting a particular shape in isolation enables adaptation to many different binding interfaces, i.e. it increases the complexity realizable by a single molecule. Many such 'natively unstructured' or 'disordered' residues are likely to be rather flexible [14]. Overall, however, the conceptual connection between flexible and natively unstructured remains obscure. Clearly, methods that predict disorder cannot predict rigid active site residues.

Experimental B-values. B-values reflect the local mobility of protein backbones and are available for structures determined by X-ray crystallography [15, 16]. Experimental B-values depend on the experimental resolution, on crystal contacts, and on the refinement procedures [17, 18]. These influences are reduced by the following normalization [19]:

$$ B_{\mathrm{norm}} = \frac{B - \langle B \rangle}{s} \qquad \text{(Eq. 1)} $$

where s is the standard deviation and <B> the average over all C-alpha B-values in a given protein. About 99.3% of the normalized B-values lie between -3 (rigid) and +3 (flexible). While high values (flexible) have been correlated with biological activities such as antigenic recognition [2] and catalytic activity [7], low values (rigid) have been correlated with active sites in enzymes [20, 21]. B-values also correlate with NMR relaxation data, a widely used source of information on protein dynamics [22, 23].
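As an illustration of the per-protein normalization described above (subtract the mean C-alpha B-value, divide by the standard deviation), the following is a minimal sketch. The function name `normalize_b_values`, the use of the population standard deviation, and the example numbers are assumptions made for illustration; they are not taken from the PROFbval implementation.

```python
from statistics import mean, pstdev


def normalize_b_values(b_values):
    """Return z-score-normalized B-values for the C-alpha atoms of one protein."""
    if len(b_values) < 2:
        raise ValueError("need at least two B-values to normalize")
    average = mean(b_values)
    # Assumption: population standard deviation; the text does not say
    # whether the population or the sample estimate is used.
    spread = pstdev(b_values)
    if spread == 0:
        raise ValueError("all B-values are identical; cannot normalize")
    return [(b - average) / spread for b in b_values]


# Made-up C-alpha B-values, for illustration only.
example_b_values = [12.0, 15.5, 30.2, 18.4, 9.9, 45.1]
normalized = normalize_b_values(example_b_values)
```

After normalization the values have mean 0 and standard deviation 1 within the protein, which is what makes B-values from different crystal structures comparable.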

Predicted B-values and the like. Methods for predicting some aspects of mobility have long been available [15, 16]; more recently, three groups developed prediction methods explicitly optimized to predict B-values from X-ray structures [23, 24, 25]. Here, we introduce the first web-based interface for prediction of protein flexibility/rigidity based on B-values. The method can assist in the prediction of both protein structure and function. For instance, a biologist can locate potentially antigenic determinants by identifying the most flexible residues on the protein surface. Additionally, a crystallographer can locate residues that potentially have high experimental B-values.

Prediction method. PROFbval is a neural-network based prediction method [23]. The network was trained and tested on a large non-redundant set of high-resolution (≤2.5 Angstrom) protein structures taken from the EVA server [23]. The experimental B-values were normalized according to Eq. 1. The network was trained on properties that can be derived from a protein's amino-acid sequence: secondary structure predicted by PROFsec [26] and solvent accessibility predicted by PROFacc [26]. The use of evolutionary profiles instead of raw sequences increased performance considerably, and the use of global information such as the content in predicted regular secondary structure, the ratio of residues predicted on the surface, and the protein length improved performance marginally.

Estimating performance. We have evaluated PROFbval based on many measures [23]. To simplify: PROFbval predictions are not as accurate as homology-based predictions would be for very similar proteins, and they are significantly more accurate than a method that considers all surface residues as flexible. The Pearson correlation coefficient between observed and predicted normalized B-values was around 0.49. More importantly, we validated our prediction method by trying to solve simple biological tasks [23] (new findings are reported in the Supplementary Online Material).
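For readers who want to compare their own experimental data with predictions, the Pearson correlation coefficient mentioned above can be computed as in the sketch below. The function name `pearson_correlation` and the sample values are hypothetical; this is not the evaluation code used for the published numbers.

```python
from math import sqrt


def pearson_correlation(observed, predicted):
    """Pearson correlation between two equally long lists of numbers."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_obs = sum(observed) / len(observed)
    mean_pred = sum(predicted) / len(predicted)
    covariance = sum((o - mean_obs) * (p - mean_pred)
                     for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / sqrt(var_obs * var_pred)


# Made-up normalized B-values for five residues, for illustration only.
observed_example = [-1.2, -0.4, 0.1, 0.9, 1.6]
predicted_example = [-0.8, -0.6, 0.3, 0.4, 1.1]
r = pearson_correlation(observed_example, predicted_example)
```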

Input to server. Users can either submit a raw protein sequence or a sequence alignment. Additionally, the server allows the specification of some optional parameters such as a job name (for control/ease-of-retrieval), strict or non-strict output modes, and window sizes (used for smoothing the graphical output; default is a window of 1; only odd values accepted).
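The window-size parameter can be pictured with a simple smoothing routine. The sketch below uses a centered moving average that is truncated at the chain ends; this scheme and the name `smooth_profile` are assumptions, since the text only states that the window is used for smoothing the graphical output, that the default is 1, and that only odd values are accepted.

```python
def smooth_profile(values, window=1):
    """Smooth a per-residue profile with a centered moving average.

    A window of 1 returns the values unchanged. Near the chain ends the
    window is truncated rather than padded (an assumption of this sketch).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - half)
        stop = min(len(values), i + half + 1)
        neighbourhood = values[start:stop]
        smoothed.append(sum(neighbourhood) / len(neighbourhood))
    return smoothed


# Made-up predicted normalized B-values, for illustration only.
profile = [0.0, 3.0, 6.0]
smoothed_profile = smooth_profile(profile, window=3)
```

Requiring an odd window keeps the averaged stretch symmetric around the residue being plotted.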

Output from server. PROFbval returns results in ASCII (raw text) and/or HTML (default). Both formats are returned either directly through the web browser/protocol used for submission or through email. Three types of information are displayed. The first is a two-state prediction (flexible/rigid). There are two modes for this prediction type, non-strict and strict, which differ in the threshold chosen for 'flexible'. In the non-strict mode most residues are flexible; hence, a residue on the surface that is predicted as rigid is likely to have a functional role. Conversely, in the strict mode only about one third of the residues are flexible; therefore, a stretch of residues that is predicted as flexible might be important for function. The second output gives the prediction reliability: the higher the reliability index, the stronger and better the prediction. The third output gives normalized B-values predicted for each residue. The predicted values are on the same scale as the experimental normalized B-values. This type of output can also be viewed in a graphical format (Fig. 1A). Additionally, the results page is available for download from our website; only URLs are sent to users by email unless the user requests that the full results be sent directly.
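The relation between the two output modes can be sketched as a single thresholding step applied to the predicted normalized B-values. The cutoff values in `MODE_THRESHOLDS` below are placeholders chosen only so that the strict mode labels fewer residues as flexible; they are not the cutoffs used by the server, and the one-letter labels `F` and `R` are likewise hypothetical.

```python
# Placeholder cutoffs, NOT the values used by PROFbval.
MODE_THRESHOLDS = {"non-strict": -0.5, "strict": 0.5}


def two_state_prediction(predicted_values, mode="non-strict"):
    """Label each residue 'F' (flexible) or 'R' (rigid) for the given mode."""
    if mode not in MODE_THRESHOLDS:
        raise ValueError("mode must be 'strict' or 'non-strict'")
    cutoff = MODE_THRESHOLDS[mode]
    return ["F" if value >= cutoff else "R" for value in predicted_values]


# Made-up predicted normalized B-values, for illustration only.
predicted = [-1.0, 0.0, 1.0]
non_strict_labels = two_state_prediction(predicted, "non-strict")
strict_labels = two_state_prediction(predicted, "strict")
```

Raising the cutoff in the strict mode shrinks the set of residues called flexible, which is why a flexible call in that mode, or a rigid call in the non-strict mode, is the more informative one.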

Fig. 1: PROFbval identifies active sites to be rigid. (A) PROFbval predicted normalized B-values plotted with window size 3 for RNase HI. The active site residues (D10, E48 and D70) are marked by arrows. (B) The predicted values were mapped onto the RNase HI X-ray structure (PDB identifier 2RN2 [27]). Residues colored in blue mark lower predicted B-values (more rigid). Despite being relatively exposed to the surface, the active site residues were predicted to be rigid. Specifically, E48 was predicted to be the most rigid of all exposed residues of the protein (>5%) [23].

Example of PROFbval application. Residue mobility and solvent accessibility are highly correlated. Interestingly, surface residues in enzymes that are rigid often have a functional role [21, 23]. For RNase HI, PROFbval predicted the active site residues to be rigid despite their being relatively exposed (Fig. 1).

 

Acknowledgements

Thanks to Jinfeng Liu (Columbia) for computer assistance, to Andrew Kernytsky (Columbia) for valuable suggestions, and to Henry Bigelow (Columbia) for preliminary information and programs and for helpful comments on the manuscript. Last but not least, thanks to all those who deposit their experimental data in public databases, and to those who maintain these databases. The work was supported by the grants R01-GM64633-01 from the National Institutes of Health (NIH) and R01-LM07329-01 from the National Library of Medicine (NLM).

 

References

1. Fulop, V., Jones, D.T. (1999). Beta propellers: structural rigidity and functional diversity. Curr. Opin. Str. Biol., 9, 715-721.
2. Tainer, J. A., Getzoff, E. D., Alexander, H., Houghten, R. A., Olson, A. J. et al. (1984). The reactivity of anti-peptide antibodies is a function of the atomic mobility of sites in a protein. Nature, 312, 127-134.
3. Demchenko, A. P. (2001). Recognition between flexible protein molecules: induced and assisted folding. J Mol Recognit, 14, 42-61.
4. Palmer, A. G., 3rd (2001). NMR probes of molecular dynamics: overview and comparison with other techniques. Annu. Rev. Biophys. Biomol. Struct., 30, 129-55.
5. Dunker, A. K., Brown, C. J., Lawson, J. D., Iakoucheva, L. M. & Obradovic, Z. (2002). Intrinsic disorder and protein function. Biochem., 41, 6573-82.
6. Liu, J., Tan, H. & Rost, B. (2002). Loopy proteins appear conserved in evolution. J. Mol. Biol., 322, 53-64.
7. Dyson, H. J. & Wright, P. E. (2005). Intrinsically unstructured proteins and their functions. Nature Reviews Molecular Cell Biology, 6, 197-208.
8. Tompa, P. (2005). The interplay between structure and function in intrinsically unstructured proteins. FEBS Lett., 579, 3346-54.
9. Sprang, S. R. (1997). G proteins, effectors and GAPs: structure and mechanism. Curr. Opin. Str. Biol., 7, 849-856.
10. Huang, Y. J. & Montelione, G. T. (2005). Structural biology: proteins flex to function. Nature, 438, 36-7.
11. Eisenmesser, E. Z., Millet, O., Labeikovsky, W., Korzhnev, D. M., Wolf-Watz, M. et al. (2005). Intrinsic dynamics of an enzyme underlies catalysis. Nature, 438, 117-21.
12. Wright, P. E. & Dyson, H. J. (1999). Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J. Mol. Biol., 293, 321-331.
13. Dunker, A. K. & Obradovic, Z. (2001). The protein trinity - linking function and disorder. Nat. Biotechnol., 19, 805-806.
14. Fuxreiter, M., Simon, I., Friedrich, P. & Tompa, P. (2004). Preformed structural elements feature in partner recognition by intrinsically unstructured proteins. J. Mol. Biol., 338, 1015-26.
15. Karplus, P. A. & Schulz, G. E. (1985). Prediction of chain flexibility of peptide antigens. Naturwissenschaften, 72, 212-213.
16. Vihinen, M., Torkkila, E., Riikonen, P. (1994). Accuracy of protein flexibility predictions. Proteins: Structure, Function, and Genetics, 19, 141-149.
17. Sheriff, S., Hendrickson, W. A., Stenkamp, R. E., Sieker, L. C. & Jensen, L. H. (1985). Influence of solvent accessibility and intermolecular contacts on atomic mobilities in hemerythrins. Proc. Natl. Acad. Sci. U.S.A., 82, 1104-1107.
18. Tronrud, D. E. (1996). Knowledge-based B-factor restraints for the refinement of proteins. J. Appl. Cryst., 29, 100-104.
19. Carugo, O. & Argos, P. (1997). Correlation between side chain mobility and conformation in protein structures. Prot. Engin., 10, 777-787.
20. Bartlett, G. J., Porter, C. T., Borkakoti, N. & Thornton, J. M. (2002). Analysis of catalytic residues in enzyme active sites. J. Mol. Biol., 324, 105-21.
21. Yuan, Z., Zhao, J., Wang, Z.X. (2003). Flexibility analysis of enzyme active sites by crystallographic temperature factors. Protein Eng., 16, 109-114.
22. Wang, C., Karpowich, N., Hunt, J. F., Rance, M. & Palmer, A. G. (2004). Dynamics of ATP-binding cassette contribute to allosteric control, nucleotide binding and energy transduction in ABC transporters. J. Mol. Biol., 342, 525-37.
23. Schlessinger, A. & Rost, B. (2005). Protein flexibility and rigidity predicted from sequence. Proteins, 61, 115-126.
24. Radivojac, P., Obradovic, Z., Smith, D. K., Zhu, G., Vucetic, S. et al. (2004). Protein flexibility and intrinsic disorder. Prot. Sci., 13, 71-80.
25. Yuan, Z., Bailey, T. L. & Teasdale, R. D. (2005). Prediction of protein B-factor profiles. Proteins, 58, 905-12.
26. Rost, B. (2005). How to use protein 1D structure predicted by PROFphd. In The Proteomics Protocols Handbook (Walker, J. E., ed.), pp. 875-901, Humana, Totowa NJ.
27. Katayanagi, K., Miyagawa, M., Matsushima, M., Ishikawa, M., Kanaya, S. et al. (1992). Structural details of ribonuclease H from Escherichia coli as refined to an atomic resolution. J. Mol. Biol., 223, 1029-52.

Contact:    admin@rostlab.org Version:    Mar 29, 2007