FreeContact: fast and free software for protein contact prediction from residue co-evolution.

Title: FreeContact: fast and free software for protein contact prediction from residue co-evolution.
Publication Type: Journal Article
Year of Publication: 2014
Authors: Kaján L, Hopf TA, Kalaš M, Marks DS, Rost B
Journal: BMC Bioinformatics
Volume: 15
Pagination: 85
Date Published: 2014
ISSN: 1471-2105
Keywords: Algorithms; Computational Biology; Protein Conformation; Proteins; Sequence Analysis, Protein; Software
Abstract

BACKGROUND: Twenty years of improved technology and a growing number of sequences now render residue-residue contact constraints, inferred from correlated mutations in large protein families, accurate enough to drive de novo predictions of protein three-dimensional structure. The method EVfold broke new ground using mean-field Direct Coupling Analysis (EVfold-mfDCA); the method PSICOV applied a related concept by estimating a sparse inverse covariance matrix. Both methods (EVfold-mfDCA and PSICOV) are publicly available, but both require too much CPU time for interactive applications. In addition, EVfold-mfDCA depends on proprietary software.

RESULTS: Here, we present FreeContact, a fast, open source implementation of EVfold-mfDCA and PSICOV. On a test set of 140 proteins, FreeContact was almost eight times faster than PSICOV without decreasing prediction performance. The EVfold-mfDCA implementation of FreeContact was over 220 times faster than PSICOV with a negligible decrease in performance. The original EVfold-mfDCA was unavailable for testing due to its dependency on proprietary software. FreeContact is implemented as the free C++ library "libfreecontact", complete with the command-line tool "freecontact", as well as Perl and Python modules. All components are available as Debian packages. FreeContact supports the BioXSD format for interoperability.

CONCLUSIONS: FreeContact provides the opportunity to compute reliable contact predictions in any environment (desktop or cloud).

DOI: 10.1186/1471-2105-15-85
Alternate Journal: BMC Bioinformatics
PubMed ID: 24669753
PubMed Central ID: PMC3987048
Grant List: R01 GM106303 / GM / NIGMS NIH HHS / United States