Prediction of protein-protein interaction sites from weakly homologous template structures using meta-threading and machine learning

Title: Prediction of protein-protein interaction sites from weakly homologous template structures using meta-threading and machine learning
Publication Type: Journal Article
Year of Publication: 2015
Authors: Maheshwari S, Brylinski M
Journal: J Mol Recognit
Date Published: 2015 Jan

The identification of protein-protein interactions is vital for understanding protein function, elucidating interaction mechanisms, and for practical applications in drug discovery. With protein sequence data growing exponentially, fully automated computational methods that predict interactions between proteins are becoming essential components of system-level function inference. A thorough analysis of protein complex structures demonstrated that binding site locations as well as the interfacial geometry are highly conserved across evolutionarily related proteins. Because the conformational space of protein-protein interactions is highly covered by experimental structures, sensitive protein threading techniques can be used to identify suitable templates for the accurate prediction of interfacial residues. Toward this goal, we developed eFindSite(PPI), an algorithm that uses the three-dimensional structure of a target protein, evolutionarily remotely related templates, and machine learning techniques to predict binding residues. Using crystal structures, the average sensitivity (specificity) of eFindSite(PPI) in interfacial residue prediction is 0.46 (0.92). For weakly homologous protein models, these values decrease only slightly to 0.40-0.43 (0.91-0.92), demonstrating that eFindSite(PPI) not only performs well using experimental data but also tolerates structural imperfections in computer-generated structures. In addition, eFindSite(PPI) detects specific molecular interactions at the interface; for instance, it correctly predicts approximately one half of hydrogen bonds and aromatic interactions, as well as one third of salt bridges and hydrophobic contacts. Comparative benchmarks against several dimer datasets show that eFindSite(PPI) outperforms other methods for protein-binding residue prediction. It also features a carefully tuned confidence estimation system, which is particularly useful in large-scale applications using raw genomic data.
eFindSite(PPI) is freely available to the academic community. Copyright © 2014 John Wiley & Sons, Ltd.
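
The abstract reports performance as sensitivity (specificity) over interfacial residues. As a reading aid, the following is a minimal sketch of how those two quantities are conventionally computed for a per-residue binary prediction. It is not code from eFindSite(PPI): the function name, the residue numbering, and the toy sets are hypothetical, and the evaluation protocol actually used in the paper is not described on this page.

```python
from typing import Iterable, Set, Tuple


def sensitivity_specificity(
    predicted: Set[int], actual: Set[int], all_residues: Iterable[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a per-residue binary prediction.

    predicted    -- residue numbers predicted to be interfacial
    actual       -- residue numbers that are interfacial in the reference complex
    all_residues -- every residue number in the target chain
    """
    residues = set(all_residues)
    tp = len(predicted & actual & residues)    # interfacial, predicted interfacial
    fn = len((actual - predicted) & residues)  # interfacial, missed
    fp = len((predicted - actual) & residues)  # non-interfacial, predicted interfacial
    tn = len(residues - predicted - actual)    # non-interfacial, correctly left out
    # Standard definitions: TP / (TP + FN) and TN / (TN + FP).
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical 20-residue chain, residues numbered 1..20 (illustrative data only).
chain = range(1, 21)
true_interface = {3, 4, 5, 10, 11}
predicted_interface = {4, 5, 6, 11}

sens, spec = sensitivity_specificity(predicted_interface, true_interface, chain)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy case three of the five true interface residues are recovered and one of the fifteen non-interface residues is falsely flagged. Because interface residues are usually a small minority of a chain, high specificity alongside moderate sensitivity, as in the figures quoted above, is a common pattern for this kind of predictor.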

Alternate Journal: Journal of Molecular Recognition
Full Text: 2015_jmr.pdf (3.23 MB)
