Evolution teaches to predict protein structure and function

9/12/01



Table of Contents

Evolution teaches to predict protein structure and function

Evolution teaches prediction

http://cubic.bioc.columbia.edu/

CUBIC http://cubic.bioc.columbia.edu

The Data Deluge

Data Deluge: what do we want?

Data Deluge: numbers

Data Deluge: what CAN we do?

Data Deluge: what CAN we do?

Evolution teaches prediction

Dynamic programming: optimal alignment

BLAST: fast matching of single words

Profile-based comparison

Zones

Sequence -> Structure

Sequence -> Structure

Sequence -> Structure

Twilight zone = false positives explode

Significant sequence identity

Evolution did it!

Similar sequence -> similar structure?

Detecting true hits in Twilight zone

Finding similar structures in Twilight zone

Secure thresholds for BLAST

Accuracy vs. coverage

BLAST is not enough ...

Sequence Space Hopping

Success through sequence space hopping

Zones

Profile-based database search

Profile-based database search

Profile-based database search

Profile-based database search

Profile-based database search

Profile-based database search

Zones

Hypothetical distribution of similar structures

Midnight zone: real - random

Evolution into the Midnight zone

Protein structures evolved at random - almost

Structure space

Gold-mine out of reach!

Conservation of function

Conservation of EC number

Conservation of EC number 2

Conservation of EC number: BLAST

Conservation in detail

Accuracy vs. coverage: EC number

Conservation of EC numbers

Evolution teaches prediction

Notation: protein structure 1D, 2D, 3D

Goal of structure prediction

Protein structure prediction in reality

Homology modelling/comparative modelling

Protein structure prediction in reality

Protein structure prediction in reality

Structure prediction for protein universe

Improving prediction by waiting it out

Evolution teaches prediction

Evolution did it!

Evolution teaches prediction

Membrane prediction

HTM prediction waiting for database growth ...

Topology for membrane helical proteins

PHDsec success on Poly-Valine

Refine by dynamic programming on NN energy

PHDhtm refine topology prediction

PHDhtm on Poly-Valine

Example IS representative

To be or not to be (HTM)

False positives: globular proteins

Details PHDsec: Wrong alignment

Details PHDhtm: wrong for safe alignment

Details PHDhtm: correct for accurate alignment

Evolution teaches prediction

Defining residue solvent accessibility

Evolution for accessibility prediction

PHDacc: the un-g(l)ory details

Evolution teaches prediction

Evolution teaches prediction

Shuttle into the nucleus

How many NLS motifs in databases?

Experimental NLS: positive charges

Experimental NLS: more complicated

In silico mutagenesis

Increasing accuracy and coverage

Increasing accuracy and coverage

Increasing accuracy and coverage

Increasing accuracy and coverage

Increasing accuracy and coverage

Nuclear proteins in proteomes

Un-annotated nuclear proteins with NLS

Using NLS to bind DNA

DNA-binding predictions in proteomes

Rotation @ CUBIC.bioc.columbia.edu

Significant motifs

Rotation @ CUBIC.bioc.columbia.edu

Finding unique subsets of proteins

Similar sequence -> similar structure?

Rotation @ CUBIC.bioc.columbia.edu

Retention signals in ER and Golgi

Evolution teaches prediction

Family size

Structure prediction for protein universe

Do we aim at getting one structure per fold?

Similar amino acid composition

Inventory of life: membrane proteins

Number of membrane helices -> complexity?

Membrane proteins: kingdoms invented different tricks

The membrane LEGO

Length of globular regions in membrane proteins

Inventory of life: coiled-coil proteins

Coiled-coil proteins: details

Inventory of life: compartments

Protein structure universe

Distribution of protein length

Bottleneck 5: money ...

What will we get?

Recipe to determine targets

Alternative recipe to determine targets

Reality check: the invaluable contribution of bioinformatics to target selection

Target selection

Priority classes

Target selection machinery

Conclusions: Structural Genomics

Evolution teaches prediction

Midnight zone STRONGLY populated

What we are threading for

Goals of fold recognition, threading, remote homology modelling

Two paths to fold recognition

TOPITS

Prediction-based threading

Example of remote sequence identity

30% correct first, better if stronger

Other threading methods

Evolution teaches prediction

Long floppy regions

Floppy loops between domains

Floppy ends

Floppy-wrap

Weirdoes

Weirdoes are not alone!

10% of biomass weird!

Length distribution of floppy regions

Weirdoes functional!

Yeast-2-hybrid interactions

Evolution teaches prediction

Conclusions

Thanksgiving

Availability of methods

Author: Burkhard Rost

Email: rost@columbia.edu

Home Page: http://cubic.bioc.columbia.edu
