Example

In this tutorial, we focus on three proteins: ATP-dependent DNA ligase (PDB-ID: 1a0iA), histamine N-methyltransferase (PDB-ID: 2aotA) and glutathione S-transferase (PDB-ID: 13gsA). The first two bind ligands that contain an adenine moiety: ATP and S-adenosyl-L-homocysteine, respectively. However, they share very little global similarity at either the sequence or the structure level; their pairwise sequence identity and TM-score are 0.23 and 0.28, respectively. To make things even more difficult, we will work with weakly homologous models of these two proteins, whose TM-scores to the crystal structures are 0.46 (1a0iA) and 0.57 (2aotA). Moreover, we will match binding sites predicted by eFindSite, which overlap to some extent with the experimental pockets but contain inaccuracies in both the binding site center (off by a couple of Å) and the binding residue annotation (Matthews correlation coefficient, MCC ~0.7).
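To make the MCC figure quoted above more concrete, here is a minimal Python sketch of how a Matthews correlation coefficient can be computed for a binding residue annotation. It is not part of eFindSite or eMatchSite; the function name `binding_residue_mcc` and the residue numbers in the usage example are hypothetical and chosen only for illustration.

```python
import math


def binding_residue_mcc(predicted, experimental, n_residues):
    """Matthews correlation coefficient for a binding residue prediction.

    predicted, experimental: iterables of residue numbers (hypothetical inputs).
    n_residues: total number of residues in the target protein.
    """
    predicted = set(predicted)
    experimental = set(experimental)

    tp = len(predicted & experimental)   # predicted and truly binding
    fp = len(predicted - experimental)   # predicted but not binding
    fn = len(experimental - predicted)   # binding but missed
    tn = n_residues - tp - fp - fn       # correctly left out

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # MCC is undefined here; returning 0.0 is a common convention.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Made-up example: 3 of 4 predicted residues are correct in a
    # 100-residue protein, and 1 true binding residue is missed.
    mcc = binding_residue_mcc({10, 11, 12, 13}, {11, 12, 13, 14}, 100)
    print(f"MCC = {mcc:.2f}")  # prints "MCC = 0.74"
```

In this made-up case, one false positive and one false negative among four binding residues already lower the MCC to about 0.74, which gives a feel for the level of annotation error the tutorial targets carry.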

The last protein, 13gsA, is borrowed from the eFindSite tutorial. It binds glutathione, so its pocket is chemically different from those in the first two structures. We will use its crystal structure, but with binding residues predicted by eFindSite. Here, we assume that eFindSite has been applied to all target structures and that its output files are available. Note that eMatchSite requires no input data other than that generated by or used in eFindSite. Below are the target proteins and the corresponding input files:

     
Binding pockets predicted by eFindSite and input files for eMatchSite, by target:
1a0iA
2aotA
13gsA

 

Files *.pdb, *.prf and *.ss are also input files for eFindSite; they contain the target structure in PDB format, sequence profiles and secondary structure profiles, respectively. Refer to the eFindSite manual for instructions on how to create these files. Files *-efindsite.* are generated by eFindSite and contain various information on the predicted binding sites.
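Before running eMatchSite, it may help to confirm that all of these files are in place. The Python sketch below checks a directory for each target; it assumes, as a guess not confirmed by this page, that every file name starts with the target identifier (for example 1a0iA.pdb or 1a0iA-efindsite.*). The helper `missing_inputs` is hypothetical and is not part of eMatchSite or eFindSite.

```python
from pathlib import Path

# Targets used in this tutorial.
TARGETS = ["1a0iA", "2aotA", "13gsA"]

# Assumed naming scheme: <target>.pdb, <target>.prf, <target>.ss and
# <target>-efindsite.<anything>. Adjust if your files are named differently.
PATTERNS = ["{t}.pdb", "{t}.prf", "{t}.ss", "{t}-efindsite.*"]


def missing_inputs(directory, targets=TARGETS):
    """Return {target: [patterns with no matching file]} for a directory."""
    directory = Path(directory)
    missing = {}
    for target in targets:
        absent = []
        for pattern in PATTERNS:
            name = pattern.format(t=target)
            # glob() yields nothing when no file matches the pattern.
            if not any(directory.glob(name)):
                absent.append(name)
        if absent:
            missing[target] = absent
    return missing


if __name__ == "__main__":
    report = missing_inputs(".")
    if not report:
        print("All expected input files were found.")
    for target, names in report.items():
        print(f"{target}: missing {', '.join(names)}")
```

An empty report only means that at least one file matches each pattern; it does not validate the contents of the files.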

 

© Michal Brylinski
This website is hosted at the CCT