Programs

The eVolver software distribution includes two programs: eprofile and evolver. To list the available options for either program, run it with no arguments. The programs, their options and the supported formats are described below.

eprofile

eprofile constructs sequence profiles from the provided structure alignments using the following 7-state classification:

Class Amino acids  
1 A, C, I, L, M, V  
2 G, S, T  
3 D, E  
4 N, Q  
5 K, R  
6 F, P, W, Y  
7 H  
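
The classification above can be written down as a lookup table. The following is a minimal Python sketch, not part of the eVolver distribution; the names CLASSES, AA_TO_CLASS and to_classes are hypothetical, and mapping non-standard residues to None is an assumption made for illustration.

```python
# The 7-state classification from the table above, keyed by class number.
CLASSES = {
    1: "ACILMV",
    2: "GST",
    3: "DE",
    4: "NQ",
    5: "KR",
    6: "FPWY",
    7: "H",
}

# Invert the table into a one-letter-code -> class number lookup.
AA_TO_CLASS = {aa: cls for cls, members in CLASSES.items() for aa in members}


def to_classes(sequence):
    """Translate a one-letter amino acid sequence into class numbers.

    Residues outside the 20 standard amino acids map to None
    (an assumption of this sketch, not documented eprofile behavior).
    """
    return [AA_TO_CLASS.get(aa) for aa in sequence.upper()]


example = to_classes("AGDNKFH")  # one residue from each class, in order
```

All 20 standard amino acids appear in exactly one class, so the inverted table has 20 entries.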

Mandatory arguments:

-i input_file, where input_file is a file that contains structure alignments of your target against a non-redundant library, e.g. CATH or SCOP.

-o output_file, where output_file will contain constructed structure-based profiles for your target.

Optional arguments:

-f format, where format should be either fasta or frtmalign. The default format is frtmalign.

-c value, where value is the significance threshold applied to structure alignments. The default is 0.4, the TM-score value taken as the threshold for statistical significance.
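
The options above can be assembled into a command line programmatically. This is a minimal Python sketch that only builds the argument list and does not run anything; it assumes the executable is called eprofile and is on the search path, and the file names target.aln and target.prf are hypothetical placeholders.

```python
def eprofile_command(input_file, output_file, fmt=None, cutoff=None):
    """Assemble an eprofile argument list from the documented options.

    Optional arguments left as None are omitted, so eprofile applies
    its own defaults (frtmalign format, threshold 0.4).
    """
    if fmt is not None and fmt not in ("fasta", "frtmalign"):
        raise ValueError("format must be 'fasta' or 'frtmalign'")
    cmd = ["eprofile", "-i", input_file, "-o", output_file]
    if fmt is not None:
        cmd += ["-f", fmt]
    if cutoff is not None:
        cmd += ["-c", str(cutoff)]
    return cmd


# Hypothetical file names, used only for illustration.
eprofile_cmd = eprofile_command("target.aln", "target.prf", fmt="fasta", cutoff=0.5)
print(" ".join(eprofile_cmd))
```

The resulting list can be passed to subprocess.run once the input file exists.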

evolver

evolver is the main program that carries out Simulated Annealing Monte Carlo optimization to generate a sequence that best fits a given structure.
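
For readers unfamiliar with the technique, the following is a generic Python sketch of Simulated Annealing with Metropolis acceptance applied to a sequence. It is not evolver's implementation: the scoring function (toy_score), the single-residue mutation move, the geometric cooling schedule and all parameter values are assumptions chosen for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(seq, target):
    """Placeholder score: number of positions matching a hypothetical target.

    evolver's actual scoring function is not described in this document.
    """
    return sum(a == b for a, b in zip(seq, target))


def anneal(start, score, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Maximize score(seq) by simulated annealing over single-residue mutations."""
    rng = random.Random(seed)
    current = list(start)
    current_score = score(current)
    best, best_score = current[:], current_score
    for step in range(steps):
        # Geometric cooling from t_start to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        pos = rng.randrange(len(current))
        old = current[pos]
        current[pos] = rng.choice(AMINO_ACIDS)
        new_score = score(current)
        delta = new_score - current_score
        # Metropolis criterion: always accept improvements, accept worse
        # moves with probability exp(delta / t).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current_score = new_score
            if current_score > best_score:
                best, best_score = current[:], current_score
        else:
            current[pos] = old  # reject: undo the mutation
    return "".join(best), best_score


# Hypothetical 10-residue target, used only to exercise the sketch.
target = "MKTAYIAKQR"
best_seq, best_val = anneal("A" * len(target), lambda s: toy_score(s, target))
```

At high temperature most moves are accepted and the search explores broadly; as the temperature drops, only improving moves survive and the sequence settles.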

Mandatory arguments:

-p input_pdb, where input_pdb is a file that contains the target structure in PDB format. The structure should consist of a single polypeptide chain.

-s input_stride, where input_stride is the output from the STRIDE program.

-r input_profiles, where input_profiles is the output from eprofile.

-o output_file, where output_file will contain the final sequence evolved for your target.

Optional arguments:

-f format, where format is the output format for the evolved sequence, either fasta or pdb. PDB output includes only Cα coordinates. The default format is fasta.

-q ini_seq, where ini_seq selects the starting sequence for optimization: 0 starts from the native sequence read from the input PDB file, 1 from the native sequence shuffled (so the native composition is preserved), and 2 from a random protein-like sequence. Options 0 and 1 are useful for benchmarks; 2 is a good choice for real applications.

-v verbosity, where verbosity controls how much information is written to standard output: 0 suppresses the Simulated Annealing progress report, 1 reports basic information such as the step number, temperature and score, and 2 produces full output, including the evolving sequences in FASTA format.
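
The evolver options can be checked and assembled in the same way before a run. This is a minimal Python sketch that builds the argument list without executing it; it assumes the executable is called evolver and is on the search path, and every file name shown is a hypothetical placeholder.

```python
def evolver_command(pdb, stride, profiles, output,
                    fmt=None, ini_seq=None, verbosity=None):
    """Assemble an evolver argument list from the documented options.

    Optional arguments left as None are omitted from the command line.
    """
    if fmt is not None and fmt not in ("fasta", "pdb"):
        raise ValueError("format must be 'fasta' or 'pdb'")
    if ini_seq is not None and ini_seq not in (0, 1, 2):
        raise ValueError("ini_seq must be 0, 1 or 2")
    if verbosity is not None and verbosity not in (0, 1, 2):
        raise ValueError("verbosity must be 0, 1 or 2")
    cmd = ["evolver", "-p", pdb, "-s", stride, "-r", profiles, "-o", output]
    if fmt is not None:
        cmd += ["-f", fmt]
    if ini_seq is not None:
        cmd += ["-q", str(ini_seq)]
    if verbosity is not None:
        cmd += ["-v", str(verbosity)]
    return cmd


# Hypothetical file names; target.prf stands for the output of eprofile.
evolver_cmd = evolver_command("target.pdb", "target.stride", "target.prf",
                              "target.fasta", ini_seq=2, verbosity=1)
print(" ".join(evolver_cmd))
```

Here -q 2 starts from a random protein-like sequence, the choice suggested for real applications, and -v 1 reports the step number, temperature and score.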

 

© Michal Brylinski