Hauptseminar WS 2010/2011

Type: Seminar (2 SWS)

ECTS: 4.0

Lecturer: Burkhard Rost

Schedule: Monday, 12:30 - 14:00

Place: Seminar room 'John von Neumann', MI 00.11.038

Language: English

Pre-meeting: Thursday, July 29th, 11 am; Room 01.09.034

Topic List

Date | Student | Topic | Advisor
18.10. | Sebastian Kopetzky | Predicting Protein Disorders with MD (Meta-Disorder predictor) | Dipl.-Biol. Esmeralda Vicedo
25.10. | Vadim Jördens | Comparing complete genomes: Census of protein structures and intrinsic disorders (Topics I and II) | Dr. Arthur Dong
8.11. | Michael Bernhofer | Comparing complete genomes: Census of protein structures and intrinsic disorders (Topics I and II) | Dr. Arthur Dong
15.11. | Ulrich Neumaier | Structural Systems Biology | Dr. Shaila C. Rössle
22.11. | Jan Brusis | Protein secretion in Bacteria and prediction of signal peptides | Dr. Marco Punta
29.11. | Jonas Reeb | Understanding and Predicting Bacterial lipoproteins | Dr. Marco Punta
6.12. | Michael Kluge | Improving pairwise alignment scores: Constructing custom and well-defined sequence similarity measures | Dipl.-Bioinf. Tobias Hamp
13.12. | Maximilian Hastreiter | Combining structure and sequence information for multiple sequence alignments | Dr. Andrea Schafferhans
20.12. | Manuel Kanditt | Sequence profiles and their implementation in PSSM and PSIC | Dipl.-Bioinf. Christian Schaefer
10.1. | Anja Mösch | Homology modelling for protein structure prediction | Dr. Shaila C. Rössle
17.1. | Tobias Sander | Using GPUs for sequence alignments | Dr. Markus Schmidberger
24.1. | Melanie Schneider | Predicting Protein Ligand Binding Sites by Combining Evolutionary Sequence Conservation and 3D Structure | Dr. Andrea Schafferhans
31.1. | Oliver Hilsenbeck | Software Engineering in Bioinformatics | Dr. Markus Schmidberger

Topic Details

Predicting Protein Disorders with MD (Meta-Disorder predictor)

Dipl.-Biol. Esmeralda Vicedo

Proteins or protein regions that do not adopt well-defined, stable three-dimensional (3-D) structures under physiological conditions in isolation are labeled as intrinsically disordered, unfolded or natively unstructured. Different methods have been developed to predict them. MD is a neural-network-based meta-predictor that uses different sources of information, predominantly obtained from orthogonal approaches. MD is capable of predicting different types of disordered regions and of identifying new ones that are not captured by other predictors.

Literature:

  • Schlessinger A, Punta M, Yachdav G, Kajan L, Rost B, 2009 Improved Disorder Prediction by Combination of Orthogonal Approaches. PLoS ONE 4(2)
  • Dyson HJ, Wright PE. Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Biol. 2005 Mar;6(3):197-208.

 

Comparing complete genomes: Census of protein structures and intrinsic disorders (Topics I and II)

Dr. Arthur Dong

Overview:

Traditional sequence analysis focuses on string matching and motif detection, and much of the research centers around developing and refining algorithms and methods. This portion of the seminar shifts from developing methods to asking biological questions, and shows how sequence analysis can play a role in systems biology. In a few representative papers, the authors employed sequence analysis to detect protein structures and intrinsic disorders in complete genomes, and then compared those structural elements across genomes to gain system-level biological insight.

Topic I: Protein secondary structures and folds in complete genomes

Literature:

  • Gerstein M. A structural census of genomes: comparing bacterial, eukaryotic, and archaeal genomes in terms of protein structure. J Mol Biol. 1997 Dec 12;274(4):562-76. PubMed PMID: 9417935
  • Lin J, Gerstein M. Whole-genome trees based on the occurrence of folds and orthologs: implications for comparing genomes on different levels. Genome Res. 2000 Jun;10(6):808-18. PubMed PMID: 10854412

Topic II: Protein intrinsic disorders in complete genomes

Literature:

  • Dunker AK, Obradovic Z, Romero P, Garner EC, Brown CJ. Intrinsic protein disorder in complete genomes. Genome Inform Ser Workshop Genome Inform. 2000;11:161-71. PubMed PMID: 11700597
  • Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT. Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol. 2004 Mar 26;337(3):635-45. PubMed PMID: 15019783

Structural Systems Biology

Dr. Shaila C. Rössle

Structural systems biology aims to understand the complexity of biology by combining genome sequences and other sources of high-throughput data, including global experimental strategies, with a detailed understanding of the structure and behavior of proteins, both individually and in complexes.

Literature:

 

 

Improving pairwise alignment scores: Constructing custom and well-defined sequence similarity measures

Dipl.-Bioinf. Tobias Hamp

 

Plain pairwise alignment scores, obtained for example with the well-known dynamic programming algorithms, usually lack certain properties that would make them suitable for clustering or classification tasks. This seminar should give an introduction to how they can be improved and extended, together with a few applications such as the prediction of subcellular localization and remote homology detection.
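As a rough illustration of the problem, the sketch below computes a Smith-Waterman local alignment score by dynamic programming and then rescales it by the two self-alignment scores. The scoring parameters (match, mismatch, linear gap penalty) and the function names are assumptions chosen for this example and are not taken from the listed literature. The rescaling only removes the dependence on sequence length; on its own it does not guarantee the properties of a valid kernel or metric, which is what the kernel-based approaches in the literature address.

```python
import math


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Best of: start fresh, extend diagonally, or open a gap.
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def normalized_score(a, b):
    """Rescale the raw score by the self-alignment scores of both sequences."""
    denom = math.sqrt(local_alignment_score(a, a) * local_alignment_score(b, b))
    return local_alignment_score(a, b) / denom if denom else 0.0
```

With these parameters a sequence compared with itself always receives a normalized score of 1, regardless of its length, whereas the raw score grows with the sequence length.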

Literature:

  • Christina S. Leslie, Eleazar Eskin, Adiel Cohen, Jason Weston, and William Stafford Noble: Mismatch string kernels for discriminative protein classification. Bioinformatics 2004 20: 467-476.
  • Sulimova, Valentina and Mottl, Vadim and Mirkin, Boris and Muchnik, Ilya and Kulikowski, Casimir: A Class of Evolution-Based Kernels for Protein Homology Analysis: A Generalization of the PAM Model. ISBRA '09: Proceedings of the 5th International Symposium on Bioinformatics Research and Applications
  • Mak, Man-Wai and Guo, Jian and Kung, Sun-Yuan: PairProSVM: Protein Subcellular Localization Based on Local Pairwise Profile Alignment and SVM. IEEE/ACM Trans. Comput. Biol. Bioinformatics (2008)

 

 

Combining structure and sequence information for multiple sequence alignments

Dr. Andrea Schafferhans

Structural information can help to guide multiple sequence alignments and to identify important residues within the alignment. The program T-Coffee and its extension 3DCoffee can be used to achieve such a combination. One application of such an alignment is the prediction of substrate specificities within enzyme families.

Literature:

  • Notredame C, Higgins DG, Heringa J: T-Coffee: A novel method for fast and accurate multiple sequence alignment. Journal of molecular biology 2000, 302:205-17. http://www.ncbi.nlm.nih.gov/pubmed/10964570
  • O'Sullivan O, Suhre K, Abergel C, Higgins DG, Notredame C: 3DCoffee: combining protein sequences and structures within multiple sequence alignments. Journal of molecular biology 2004, 340:385-95. http://www.ncbi.nlm.nih.gov/pubmed/15201059
  • Röttig M, Rausch C, Kohlbacher O: Combining structure and sequence information allows automated prediction of substrate specificities within enzyme families. PLoS computational biology 2010, 6:e1000636. http://www.ncbi.nlm.nih.gov/pubmed/20072606

 

 

Sequence profiles and their implementation in PSSM and PSIC

Dipl.-Bioinf. Christian Schaefer

Sequence profiles, or position-specific scoring matrices, provide position-specific representations of sequence families. They are used, for example, in sequence database searches where even distantly related sequences are of interest. In this talk, two concepts should be introduced: PSSM (position-specific scoring matrix) and PSIC (position-specific independent counts).
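To make the idea of a position-specific scoring matrix concrete, here is a minimal sketch that builds log-odds scores from a small gapped alignment. It assumes a uniform background distribution, a constant pseudocount and no sequence weighting; these are simplifications for illustration, and the methods in the literature below use more refined schemes. The names `build_pssm` and `score` are hypothetical.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: log-odds score} dictionary per alignment column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(ALPHABET)  # assumption: uniform background
    pssm = []
    for col in range(length):
        counts = {aa: pseudocount for aa in ALPHABET}
        for seq in alignment:
            if seq[col] in counts:  # gaps and unknown symbols are skipped
                counts[seq[col]] += 1
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2(counts[aa] / total / background) for aa in ALPHABET}
        )
    return pssm


def score(pssm, seq):
    """Sum the column scores for an ungapped sequence aligned at position 0."""
    return sum(column[aa] for column, aa in zip(pssm, seq))
```

Residues that are frequent in a column receive positive scores and rare ones negative scores, so a sequence resembling the family scores higher than an unrelated one.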

Literature:

  • Henikoff JG, Henikoff S., Using substitution probabilities to improve position-specific scoring matrices., Comput Appl Biosci. 1996 Apr;12(2):135-43
  • Sunyaev SR et al., PSIC: profile extraction from sequence alignments with position-specific counts of independent observations., Protein Eng. 1999 May;12(5):387-94.

 

 

Hidden Markov Models and the protein family database Pfam

Dipl.-Bioinf. Christian Schaefer

The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). In this seminar the theoretical background behind HMMs and their role in fast database searches should be presented as well as their application in Pfam.
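As a small illustration of the HMM background, the following sketch implements the forward algorithm for a toy two-state HMM. The states, the two-letter alphabet ("h" for hydrophobic, "p" for polar) and all probabilities are invented for this example; a Pfam profile HMM has match, insert and delete states per alignment column and is considerably larger.

```python
# Toy model (all values are assumptions made up for this illustration).
STATES = ("H", "P")
START = {"H": 0.5, "P": 0.5}
TRANS = {"H": {"H": 0.8, "P": 0.2}, "P": {"H": 0.3, "P": 0.7}}
EMIT = {"H": {"h": 0.9, "p": 0.1}, "P": {"h": 0.2, "p": 0.8}}


def forward_probability(obs, states=STATES, start=START, trans=TRANS, emit=EMIT):
    """Probability of a non-empty observation sequence, summed over all state paths."""
    # alpha[s] = probability of the observations so far, ending in state s.
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        alpha = {
            s: emit[s][symbol] * sum(alpha[r] * trans[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())
```

The recursion visits each position once and each pair of states once per position, which is what keeps scoring a sequence against a model cheap enough for database searches.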

Literature:

  • Sonnhammer EL, Eddy SR, Durbin R. Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins. 1997 Jul;28(3):405-20.
  • Anders Krogh, Michael Brown, I. Saira Mian, Kimmen Sjolander, David Haussler. Hidden Markov Models in Computational Biology: Applications to Protein Modeling. J Mol Biol 1994, 235, 1501-1531

 

 

Homology modelling for protein structure prediction

Dr. Shaila C. Rössle

The genome sequencing revolution has resulted in a dramatic increase in the demand for structural information. Structural information often greatly enhances our understanding of how proteins function and how they interact with each other. In the absence of an experimentally determined structure, comparative or homology modelling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Homology modelling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates).

Literature:

  • Goldsmith-Fischman S, Honig B.: Structural genomics: computational methods for structure analysis; Protein Sci. 2003 Sep;12(9):1813-21. PMID: 12930981
  • Petrey D, Honig B.: Protein structure prediction: inroads to biology. Mol Cell. 2005 Dec 22;20(6):811-9. PMID: 16364908

 

 

Predicting Protein Ligand Binding Sites by Combining Evolutionary Sequence Conservation and 3D Structure

Dr. Andrea Schafferhans

Identifying a protein’s functional sites is an important step towards characterizing its molecular function. Numerous structure- and sequence-based methods have been developed for this problem. This seminar shall introduce ConCavity, a small-molecule binding site prediction algorithm that integrates evolutionary sequence conservation estimates with structure-based methods for identifying protein surface cavities.

Literature:

  • Capra JA, Laskowski RA, Thornton JM, Singh M, Funkhouser TA: Predicting protein ligand binding sites by combining evolutionary sequence conservation and 3D structure. PLoS computational biology 2009, 5:e1000585. http://www.ncbi.nlm.nih.gov/pubmed/19997483

 

 

Using GPUs for sequence alignments

Dr. Markus Schmidberger

GPUs (Graphics Processing Units) are a promising hardware option for accelerating bioinformatics applications at a very good performance-to-cost ratio. Most sequence analysis tasks start with some kind of sequence alignment, and in many cases this is the most time-consuming step. GPUs and CUDA provide a very good solution here, but they have limitations for several tasks.
This talk should present ideas for implementing state-of-the-art sequence alignment methods on GPUs.
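One commonly described way to expose parallelism in dynamic-programming alignment is to fill the score matrix along anti-diagonals, because all cells on one anti-diagonal depend only on the two previous ones. The sketch below shows this traversal order in plain Python for a Smith-Waterman score with a linear gap penalty. It runs sequentially and is not CUDA code; the scoring parameters and the function name are assumptions for this example, and it is not claimed to be the strategy used in the papers listed below.

```python
def sw_score_wavefront(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman score computed anti-diagonal by anti-diagonal."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for d in range(2, n + m + 1):  # cells (i, j) with i + j == d
        lo, hi = max(1, d - m), min(n, d - 1)
        # Cells in this inner loop read only from anti-diagonals d-1 and d-2,
        # so on a GPU each of them could be assigned to its own thread.
        for i in range(lo, hi + 1):
            j = d - i
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0, H[i - 1][j - 1] + sub, H[i - 1][j] + gap, H[i][j - 1] + gap
            )
            best = max(best, H[i][j])
    return best
```

The number of independent cells grows and shrinks along the wavefront, which is one reason why the achievable speed-up depends on sequence length and on how the work is mapped to threads.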

Literature:

  • High-throughput sequence alignment using Graphics Processing Units; M. C. Schatz, C. Trapnell, A. L. Delcher, A. Varshney; BMC Bioinformatics, 2007
  • CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment; S. A. Manavski, G. Valle; BMC Bioinformatics, 2008

 

 

Software engineering in bioinformatics

Dr. Markus Schmidberger

Real bioinformatics requires not just an appreciation of the underlying science, but also the ability to write efficient computer programs. Software engineering helps you to develop the required software. This presentation should demonstrate how to save time and trouble by doing the right thing, at the right time, in the right way. One possible focus is the whole project life cycle for bioinformatics applications, illustrated with a realistic example.

Literature:

  • Software Engineering for Bioinformatics: Delivering Effective Applications, Paul Weston, Halsted Press New York, NY, USA, 2004

 

 

Here you can find the checklist for a successful preparation in German and English versions.