-
T. Hamp and B. Rost (2015): Combining classifiers to predict interacting residues in protein-protein interfaces. [in preparation].
-
T. Hamp and B. Rost (2014): Better predicting protein-protein interactions from sequence. Bioinformatics, accepted.
-
T. Hamp and B. Rost (2014): More challenges for machine learning protein-protein interactions. Bioinformatics, in press.
-
T. Goldberg, M. Hecht, T. Hamp, B. Rost and others (2014): LocTree3 prediction of localization. Nucleic Acids Research, 42 (W1), W350-W355.
-
G. Yachdav, E. Kloppmann, T. Hamp, B. Rost and others (2014): PredictProtein—an open resource for online prediction of protein structural and functional features. Nucleic Acids Research, 42 (W1), W337-W343.
-
T. Hamp (2014): Sequence Based Prediction of Protein-Protein Interactions. [PhD thesis]
-
T. Hamp and B. Rost (2013): Protein-protein interactions. International Innovations (12/2013).
-
T. Hamp, T. Goldberg and B. Rost (2013): Accelerating the Original Profile Kernel. PLoS One, 8(4), e68459.
-
P. Radivojac, W. T. Clark, T. Hamp, B. Rost and others (2013): A Large-scale Evaluation of Computational Protein Function Prediction. Nature Methods, 10(3), 221-227.
-
T. Hamp, B. Rost and others (2013): Homology-based Inference Sets the Bar High for Protein Function Prediction. BMC Bioinformatics (CAFA 2011), 14 Suppl 3, S7.
-
L. Zhao, S. C. H. Hoi, L. Wong, T. Hamp and J. Li (2012): Structural and Functional Analysis of Multi-Interface Domains. PLoS One, 7(12), e50821.
-
T. Goldberg¹, T. Hamp¹ and B. Rost (2012): LocTree2 Predicts Localization for All Domains of Life. Bioinformatics (ECCB 2012), 28(18), i458-i465.
-
T. Hamp and B. Rost (2012): Alternative Protein-Protein Interfaces are Frequent Exceptions. PLoS Computational Biology, 8(8), e1002623.
-
T. Hamp, F. Birzele, F. Buchwald and S. Kramer (2011): Model-Based Learning for All SCOP Families. Highlights Track Paper ISMB/ECCB 2011.
-
T. Hamp, F. Birzele, F. Buchwald and S. Kramer (2011): Improving Structure Alignment-based Prediction of SCOP Families Using Vorolign Kernels. Bioinformatics, 27(2), 204-210.
-
T. Hamp (2009): Structure Alignment Based Classification Models of Proteins. Diploma thesis.
-
Metastudent
-
LocTree2/3
In my diploma thesis, I evaluated state-of-the-art sequence-based classification methods for proteins, including the profile kernel and a multi-class scheme called 'Nested Dichotomies' (NDs). T. Goldberg's master's thesis, which I supervised, combined this methodological knowledge with the biological expertise established in the Rostlab (e.g. through LocTree). I provided implementations of generic multi-class classifiers based on Support Vector Machines and designed a multi-layered cross-validation and parameter optimization procedure. T. Goldberg adapted the NDs and ran and evaluated the experiments. As a result, LocTree2 is now arguably the best method for predicting the subcellular localization of proteins without known annotated homologs. LocTree3 also covers close homologs: it uses BLAST for classification when such a homolog is available and falls back to LocTree2 for the more difficult cases.
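The homolog-first strategy described for LocTree3 can be illustrated with a minimal sketch. This is not the LocTree3 code: the function names, the `(localization, e_value)` hit format and the E-value cutoff are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical hit format: (annotated localization, E-value of the match).
Hit = Tuple[str, float]


def predict_localization(
    sequence: str,
    find_hits: Callable[[str], Sequence[Hit]],
    ml_predict: Callable[[str], str],
    max_evalue: float = 1e-3,  # illustrative cutoff, not LocTree3's value
) -> Tuple[str, str]:
    """Return (localization, source) for one protein sequence.

    If a sufficiently close annotated homolog exists, its annotation is
    transferred; otherwise the machine-learning predictor decides.
    """
    close = [hit for hit in find_hits(sequence) if hit[1] <= max_evalue]
    if close:
        best = min(close, key=lambda hit: hit[1])  # lowest E-value wins
        return best[0], "homology"
    return ml_predict(sequence), "ml"
```

Here `find_hits` stands in for a homology search such as BLAST against annotated proteins, and `ml_predict` for a LocTree2-like classifier; both are passed in as plain callables so the sketch stays self-contained.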
-
Fastprofkernel
The profile kernel developed by the Leslie group and used, e.g., in LocTree2/3 achieved state-of-the-art accuracy but was very slow in practice. I accelerated it by orders of magnitude through various algorithmic and technical modifications. For example, I reduced updates of the kernel matrix to matrix multiplication, implemented a cache-efficient sparse-sparse matrix multiplication algorithm, replaced key operations implemented in standard C with SSE2 instructions and bit-parallelized the prediction of many sequences with many SVMs at the same time. This accelerated version of the profile kernel is called 'fastprofkernel' and is available as a Debian package. It also includes an easy-to-use workflow script that reduces training a profile-kernel-based multi-class classifier to a single program call.
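To illustrate why a kernel matrix can be computed as a sparse matrix product, here is a small sketch. It deliberately uses a simpler kernel than the profile kernel: plain k-mer counts (a spectrum-style kernel) instead of profile-derived k-mer neighborhoods. With a sparse feature matrix F (one row per sequence, one column per k-mer), the kernel matrix is K = F Fᵀ, and an inverted index over columns lets the product touch only non-zero entries. None of this is taken from fastprofkernel itself; all names are illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def kmer_counts(seq: str, k: int) -> Dict[str, int]:
    """Count every overlapping k-mer in a sequence (one sparse row of F)."""
    counts: Dict[str, int] = defaultdict(int)
    for i in range(len(seq) - k + 1):
        counts[seq[i:i + k]] += 1
    return dict(counts)


def kernel_matrix(seqs: List[str], k: int = 3) -> List[List[int]]:
    """Compute K = F * F^T using only the non-zero entries of F."""
    # Inverted index: k-mer (column of F) -> list of (row, count).
    columns: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for row, seq in enumerate(seqs):
        for kmer, count in kmer_counts(seq, k).items():
            columns[kmer].append((row, count))

    n = len(seqs)
    kernel = [[0] * n for _ in range(n)]
    # Each column contributes an outer product of its non-zero entries.
    for entries in columns.values():
        for i, value_i in entries:
            for j, value_j in entries:
                kernel[i][j] += value_i * value_j
    return kernel
```

Processing one column at a time keeps the working set small, which is the basic idea behind cache-friendly sparse products; the real implementation additionally relies on SSE2 and bit-level parallelism, which this sketch does not attempt to reproduce.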
-
Uniqueprot2
Uniqueprot used the HVAL to reduce sequence redundancy in a set of protein sequences. Uniqueprot2 is my re-implementation in Python; it fixes bugs and introduces a number of new features. For example, it can now handle many more sequences (before: ~20,000; now: ~500,000), run on multiple CPUs in parallel and incorporate sequence weights through a hill-climbing method.
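As a rough illustration of HVAL-based redundancy reduction, the sketch below assumes that HVAL means the distance of an alignment's percentage sequence identity from the HSSP curve, using the commonly cited form of that curve. The greedy filter is a simplification chosen for brevity; it is not the hill-climbing procedure of Uniqueprot2, and the `align` callable, the threshold and all function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def hssp_curve(length: int) -> float:
    """Percentage identity on the HSSP curve for a given alignment length
    (assumed form of the curve; treat the constants as an assumption)."""
    if length <= 11:
        return 100.0
    if length <= 450:
        return 480.0 * length ** (-0.32 * (1.0 + math.exp(-length / 1000.0)))
    return 19.5


def hval(percent_identity: float, length: int) -> float:
    """Distance of an alignment from the HSSP curve (positive = above it)."""
    return percent_identity - hssp_curve(length)


def greedy_unique(
    ids: Sequence[str],
    align: Callable[[str, str], Tuple[float, int]],
    max_hval: float = 0.0,  # illustrative threshold
) -> List[str]:
    """Keep a sequence only if it is not too similar to any kept sequence.

    `align(a, b)` is a hypothetical callable returning
    (percent identity, alignment length) for two sequence ids.
    """
    kept: List[str] = []
    for candidate in ids:
        if all(hval(*align(candidate, other)) <= max_hval for other in kept):
            kept.append(candidate)
    return kept
```

The result of a greedy filter depends on the input order, which is one reason a weighted, search-based selection can be preferable when some sequences matter more than others.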
-
ISIS2
ISIS predicts from sequence which amino acids in a protein can bind other proteins. It encodes each amino acid as a vector of features such as amino acid conservation frequencies found by PSI-BLAST and predicted secondary structure. Neural networks then determine whether a query residue can bind other proteins or not. Under my supervision, the computer science students A. Jha and V. Ravindra implemented ISIS2. It is trained in the same way as ISIS, but on a larger, higher-quality data set that we recently extracted from the PDB. More computational resources, a faster neural network implementation and new methods that predict other aspects of amino acids (e.g. disorder) further improved its accuracy.
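The per-residue encoding can be sketched as follows. The sliding window around the query residue and the zero padding at the sequence ends are assumptions made for this illustration; the text above only states that each amino acid becomes a feature vector. The function name and the toy features are hypothetical.

```python
from typing import List, Sequence


def window_features(
    per_residue: Sequence[Sequence[float]],
    center: int,
    half_width: int,
) -> List[float]:
    """Concatenate the feature vectors of a residue and its neighbors.

    `per_residue[i]` holds the features of residue i (e.g. conservation
    values and predicted secondary structure). Positions outside the
    sequence are padded with zeros so every residue gets a vector of the
    same length, which is what a fixed-size network input requires.
    """
    n_features = len(per_residue[0])
    vector: List[float] = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(per_residue):
            vector.extend(per_residue[pos])
        else:
            vector.extend([0.0] * n_features)
    return vector
```

For a protein with three residues and two features each, `window_features(features, 0, 1)` yields six values: two zeros of padding, then the features of residues 0 and 1.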
-
RostDB (maintenance)
RostDB is a collection of local copies of molecular biology databases and comprises several terabytes of data. It features automatic updates, a database versioning system, minimal data redundancy and data distribution to node-local disks. I was involved in its development and kept it running smoothly through the exponential data growth of recent years.
-
Uga Agga web tools
Everything you need for your team to dominate in Uga Agga, the once-quite-popular browser game.