AutoGrow


Introduction

Due in part to the increasing availability of crystallographic protein structures as well as rapid improvements in computer power, the past few decades have seen an explosion in the field of computer-based rational drug design. Several algorithms have been developed to identify or generate potential ligands in silico by optimizing the ligand-receptor hydrogen bond, electrostatic, and hydrophobic interactions. The novel drug-design algorithm described on this page, called AutoGrow (Java DOCK), uses fragment-based growing, docking, and evolutionary techniques.

If you use AutoGrow in your work, please cite Durrant JD, Amaro RE, McCammon JA: AutoGrow: A Novel Algorithm for Protein Inhibitor Design. Chemical Biology & Drug Design (2009) 73(2):168-178.

Evolutionary algorithms are ideally suited to complex problems like those associated with de novo drug discovery. These algorithms typically include three operators, modeled on the three natural operators of biological evolution: selection, mutation, and crossover. The evolutionary procedure is divided into generations, where each generation consists of a population of individuals derived from selection of the most fit members of the previous generation. The internal variation of each generation is exploited via crossover, wherein the characteristics of two "parent" individuals are combined to create a new "child" individual. External variation is introduced into each generation via mutation, wherein new individuals are created by making small, usually random changes to individuals already present in the population. As generation after generation is created, each based on the most fit individuals of the previous generation as well as additional individuals derived by exploiting internal and external variation, a solution eventually evolves. In our implementation, each generation consists of multiple potential ligands. The algorithm "evolves" ligands that are predicted to bind to a given target protein with high affinity.
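The generational loop described above can be sketched in a few lines of Python. This is a toy illustration only, not AutoGrow's code: individuals are plain lists of numbers, toy_score is a hypothetical stand-in for a docking score (lower is better), and every function and parameter name is an assumption made for the example.

```python
import random

def evolve(population, fitness, n_carryovers, n_children, n_mutants, n_generations, rng):
    """Toy evolutionary loop: selection, crossover, and mutation in each generation."""
    for _ in range(n_generations):
        # Selection: keep the most fit individuals (lowest score is best here).
        founders = sorted(population, key=fitness)[:n_carryovers]

        # Crossover: combine characteristics of two parents into one child.
        children = []
        for _ in range(n_children):
            mother, father = rng.sample(founders, 2)
            children.append([rng.choice(pair) for pair in zip(mother, father)])

        # Mutation: make a small random change to a copy of an existing individual.
        mutants = []
        for _ in range(n_mutants):
            mutant = list(rng.choice(founders))
            position = rng.randrange(len(mutant))
            mutant[position] += rng.uniform(-1.0, 1.0)
            mutants.append(mutant)

        population = founders + children + mutants
    return min(population, key=fitness)


def toy_score(individual):
    # Hypothetical stand-in for a predicted binding energy; 0 is the optimum.
    return sum(x * x for x in individual)


rng = random.Random(0)
start = [[rng.uniform(-5, 5) for _ in range(4)] for _ in range(10)]
best = evolve(start, toy_score, n_carryovers=4, n_children=8,
              n_mutants=8, n_generations=30, rng=rng)
```

Because the founders are carried into the next generation unchanged, the best score in this sketch can never get worse from one generation to the next.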

AutoGrow uses AutoDock as the selection operator. For each generation, all ligand files are docked to the target protein, and for each docking run, AutoDock returns a predicted binding affinity. Those ligands that bind within the active site and have the most favorable predicted binding affinities are selected for inclusion in the next generation. AutoDock is an excellent selection operator; it takes into account full ligand flexibility, has a well-tested scoring function, and returns an actual binding-energy prediction, as opposed to a program-specific score.
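As a rough sketch of this selection step, the snippet below ranks docked ligands by predicted binding energy and keeps only those that docked inside the active site. The DockedLigand record, the select_carryovers function, and the example scores are all hypothetical; they are not AutoGrow data structures.

```python
from typing import NamedTuple

class DockedLigand(NamedTuple):
    name: str
    affinity: float        # predicted binding energy; more negative is better
    in_active_site: bool   # whether the docked pose lies in the active site

def select_carryovers(docked, n_carryovers):
    """Return the n best-scoring ligands among those docked inside the active site."""
    eligible = [ligand for ligand in docked if ligand.in_active_site]
    return sorted(eligible, key=lambda ligand: ligand.affinity)[:n_carryovers]

docked = [
    DockedLigand("ligand1", -7.2, True),
    DockedLigand("ligand2", -9.1, False),  # best score, but docked outside the site
    DockedLigand("ligand3", -8.4, True),
    DockedLigand("ligand4", -6.0, True),
]
best = select_carryovers(docked, 2)
print([ligand.name for ligand in best])  # ['ligand3', 'ligand1']
```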

In order to introduce external variation into each generation, a new mutation operator is used. First, a number of individuals are selected from among those that were the most fit of the previous generation. For each of these individuals, an appropriate hydrogen atom is randomly selected and replaced with a fragment from a fragment library. Thus, "mutants" are similar to, but distinct from, other population members.

In order to exploit the internal variation present in each generation after the first, a new crossover operator is used. First, a number of individual pairs ("couples") are selected from among those ligands that were the most fit of the previous generation. A new hybrid "child" ligand is then formed by randomly mixing and matching the attached moieties of the two "parents."
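The two operators can be illustrated with a deliberately simplified ligand model. In this sketch a ligand is a dictionary mapping numbered attachment sites on a shared scaffold to whatever is attached there (a hydrogen or a named fragment). Real AutoGrow works on PDB structures; the representation, function names, and fragment names below are assumptions made for illustration.

```python
import random

HYDROGEN = "H"

def mutate(ligand, fragment_library, rng, non_linker_sites=()):
    """Replace one randomly chosen, allowed hydrogen with a library fragment."""
    open_sites = [site for site, group in ligand.items()
                  if group == HYDROGEN and site not in non_linker_sites]
    if not open_sites:
        return None  # no hydrogen left that may be substituted
    mutant = dict(ligand)  # copy, so the parent is left unchanged
    mutant[rng.choice(open_sites)] = rng.choice(fragment_library)
    return mutant

def crossover(parent_a, parent_b, rng):
    """Build a child by taking each site's moiety from one parent or the other."""
    return {site: rng.choice((parent_a[site], parent_b[site])) for site in parent_a}

rng = random.Random(1)
scaffold = {1: HYDROGEN, 2: HYDROGEN, 3: HYDROGEN}
library = ["methyl", "hydroxyl", "phenyl"]   # made-up fragment names
parent_a = mutate(scaffold, library, rng)
parent_b = mutate(scaffold, library, rng)
child = crossover(parent_a, parent_b, rng)
```

The crossover here assumes both parents share the same scaffold sites, which holds in this toy model because both descend from the same core.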

AutoGrow Tutorial

Click to download a copy of AutoGrow 2.0.4.




System Requirements

AutoGrow 2.0.4 is designed to work on UNIX-based systems (Linux, Mac OS X, etc.). As AutoGrow is written in Java, it also requires that the Java Virtual Machine be installed. Because AutoGrow docks multiple ligands using AutoDock, we recommend running AutoGrow on a multi-processor computer cluster. AutoGrow 2.0.4 uses AutoDock Vina and can be run on a single machine, preferably one with multiple processors.




AutoGrow 2.0.4 Run Modes

AutoGrow Help File

When AutoGrow is run without any parameters, the following help file is displayed:

AutoGrow 2.0.4

No parameters passed to AutoGrow. Parameters are:
	-run_mode 
	-parm_file 
	[-output_file ]
	[-generation ]

Note: If no output_file parameter is passed, output is written to the console.
Note: 'parm_file' is required for all 'run_mode' except 'Install,' which takes its
      parameters from the keyboard.

Examples:
	java Main -run_mode Execute -parm_file default.prm
	java Main -run_mode Install
	java Main -run_mode ExtractBest -parm_file default.prm -generation 2

Note the various "run modes." The remainder of this page is dedicated to explaining each of these modes.

Install Run Mode

Once the AutoGrow ZIP file has been downloaded and uncompressed, several directories are initially created. These include the bin, source, and fragment directories. The bin directory contains the AutoGrow executable files, the source directory contains the AutoGrow source code, and the fragment directory contains two default fragment libraries. Without further installation, however, AutoGrow will not run.

To install AutoGrow, change to the bin directory and run "java Main -run_mode Install" from the command line, where "java" is the path to the Java executable. In Install mode, AutoGrow requests information about your system, including the path to the AutoDock Vina executable, the path to the AutoDock preparation Python scripts, etc., and then creates all the files and directories it needs to run properly. Additionally, a sample parameter file ("default.prm") is created in the bin directory to assist the user in preparing future AutoGrow jobs. Note that Install mode is the only AutoGrow mode that does not require a parameter file.

Example: java Main -run_mode Install

Execute Run Mode

When running in "Execute mode," AutoGrow creates new ligands (mutants and crossovers), automatically generates AutoDock input files, and manages and organizes the computer files associated with each generation. When all AutoDock input files have been created and organized, AutoGrow automatically executes AutoDock Vina to dock each of the novel ligands of each generation.

Example: java Main -run_mode Execute -parm_file default.prm -output_file output_Main.txt

ExtractBest Run Mode

During the course of its execution, AutoGrow creates many files. If the user is only interested in the final results (as is usually the case), AutoGrow can be run in the ExtractBest mode, wherein the best results from a given generation are placed in a separate directory for easy viewing with a visualization program like VMD.

Example: java Main -run_mode ExtractBest -parm_file default.prm -generation 2
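For scripted use, the three invocations above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of AutoGrow; it uses only the flags listed in the help text and enforces the rule that every run mode except Install needs a parameter file.

```python
def autogrow_command(run_mode, parm_file=None, output_file=None,
                     generation=None, java="java"):
    """Build the argument list for an AutoGrow invocation (hypothetical helper)."""
    if run_mode != "Install" and parm_file is None:
        raise ValueError("parm_file is required for every run mode except Install")
    command = [java, "Main", "-run_mode", run_mode]
    if parm_file is not None:
        command += ["-parm_file", parm_file]
    if output_file is not None:
        command += ["-output_file", output_file]
    if generation is not None:
        command += ["-generation", str(generation)]
    return command

print(" ".join(autogrow_command("ExtractBest", "default.prm", generation=2)))
# java Main -run_mode ExtractBest -parm_file default.prm -generation 2
```

The resulting list can be handed to Python's subprocess.run with cwd set to the bin directory, since the examples above are run from there.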




AutoGrow Configuration

The specific behavior of AutoGrow is governed by the user-adjustable program variables defined in the user-input file (bin/default.prm by default). First, the user must specify the location of several important directories. The “working root directory” variable indicates the directory where AutoGrow will place the AutoDock input and output files for each generation of the evolutionary algorithm. An output file called BestLigands.log is also created in the working root directory; this file contains the AutoDock scores of the top ligands of each generation. The default location of the working root directory is run_dir.

The “fragments directory” variable indicates the directory location of the fragment library used to create “mutant” ligands. The PDB fragments in this directory are added to the evolving ligand in order to improve hydrogen bond, electrostatic, and hydrophobic interactions. AutoGrow comes with two default fragment libraries. The default location of the fragments directory is fragment/large_fragment.

The “scripts directory” variable indicates the directory containing scripts that allow AutoGrow to interface with AutoDock Vina and the ADT Python scripts. By default, this variable points to the scripts directory.

The user must also identify the location of both initial-ligand and protein-receptor PDB files. The “initial ligand” variable contains the absolute path to the PDB file of the initial “core” scaffold, and the “receptor” variable contains the absolute path to the PDB file of the protein receptor.

AutoGrow makes calls to AutoDock Vina to dock the newly created ligands. The variables "autodock grid center" and "autodock box size" specify the location and the size, respectively, of a box encompassing the receptor active site.
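To make the geometry concrete, the sketch below converts a center and a box size into the two opposite corners of the docking box. It assumes the box size gives full edge lengths along x, y, and z, as AutoDock Vina's own size settings do; the function name is made up, and the numbers are the ones used in the sample user-input file on this page.

```python
def box_bounds(center, size):
    """Return the (lower, upper) corners of an axis-aligned box."""
    lower = tuple(c - s / 2 for c, s in zip(center, size))
    upper = tuple(c + s / 2 for c, s in zip(center, size))
    return lower, upper

lower, upper = box_bounds((39.0, 21.5, 13.5), (15, 15, 15))
print(lower)  # (31.5, 14.0, 6.0)
print(upper)  # (46.5, 29.0, 21.0)

# The sample file's "receptor location" should fall inside the docking box.
site = (38.754, 26.297, 9.154)
inside = all(low <= s <= high for low, s, high in zip(lower, site, upper))
print(inside)  # True
```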

The user must also specify the parameters of the AutoGrow evolutionary algorithm itself. The variable “number of carryovers” specifies the number of best-fit ligands from each generation that become the founding members of the next generation. The variables “number of children” and “number of mutants” specify the number of “children” and “mutant” ligands derived from those founders via the “crossover” and “mutation” operators, respectively.

The user can place a number of constraints on the growing ligands. The variable “number of generations” specifies the maximum number of generations for which AutoGrow will run; because each mutation adds a fragment, stopping after a fixed number of generations keeps ligands from growing too large. The “max number atoms” variable is likewise used to prevent AutoGrow ligands from growing too large; AutoGrow will not create ligands that exceed the number of atoms specified. The user can also require that the evolving ligands dock within a specific active site. The “receptor location” and “receptor radius” variables are used to specify the location and size of the active site. Additionally, the “indices of hydrogens that are not linkers” variable contains the indices of scaffold hydrogens to which AutoGrow will not add fragments.
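The size and location constraints amount to two simple checks, sketched below. The page does not say exactly how AutoGrow decides that a ligand has docked within the active site, so this example assumes a pose counts as inside when its geometric center lies within the sphere defined by the location and radius. The function names and atom coordinates are made up for illustration.

```python
import math

def geometric_center(coords):
    """Average the x, y, and z coordinates of a list of atoms."""
    return tuple(sum(axis) / len(coords) for axis in zip(*coords))

def within_size_limit(coords, max_atoms):
    """True if the ligand does not exceed the "max number atoms" setting."""
    return len(coords) <= max_atoms

def docks_in_site(coords, site_center, site_radius):
    """True if the pose's geometric center lies inside the active-site sphere
    (an assumed criterion, see the note above)."""
    return math.dist(geometric_center(coords), site_center) <= site_radius

site_center = (38.754, 26.297, 9.154)   # "receptor location" from the sample file
site_radius = 10.0                      # "receptor radius" from the sample file
pose = [(38.0, 25.0, 9.0), (39.5, 26.0, 10.0), (40.0, 27.5, 8.5)]  # made-up atoms
far_pose = [(x + 30.0, y, z) for x, y, z in pose]                  # shifted away

print(docks_in_site(pose, site_center, site_radius))      # True
print(docks_in_site(far_pose, site_center, site_radius))  # False
```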


Sample AutoGrow User-Input File

//DIRECTORIES
working root directory: /u1/AutoGrow/run_dir
fragments directory: /u1/AutoGrow/fragment/large_fragment
scripts directory: /u1/AutoGrow/scripts

//INPUT FILES
initial ligand: /u1/AutoGrow/models/nsc16209.pdb
receptor: /u1/AutoGrow/models/1XDN.pdb

//AUTODOCK PARAMETERS
autodock grid center: 39.000 21.500 13.500
autodock box size: 15 15 15

//EVOLUTION PARAMETERS
number of carryovers: 10
number of children: 20
number of mutants: 20
max number atoms: 500
receptor location: 38.754 26.297 9.154
receptor radius: 10
indices of hydrogens that are not linkers: -1
number of generations: 8
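
The file layout above is simple enough to read with a short script, for instance to check settings before launching a job. The parser below is a minimal sketch that assumes every setting is a single "name: value" line and that lines beginning with // are comments, as in the sample; it is not AutoGrow's own parser, and the function name is hypothetical.

```python
def parse_parameters(text):
    """Parse 'name: value' lines, skipping blank lines and // comment lines."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("//"):
            continue
        # Split at the first colon only, so the value is kept whole.
        name, _, value = line.partition(":")
        params[name.strip()] = value.strip()
    return params

sample = """\
//AUTODOCK PARAMETERS
autodock grid center: 39.000 21.500 13.500
autodock box size: 15 15 15

//EVOLUTION PARAMETERS
number of carryovers: 10
"""
params = parse_parameters(sample)
center = [float(x) for x in params["autodock grid center"].split()]
print(center)                               # [39.0, 21.5, 13.5]
print(int(params["number of carryovers"]))  # 10
```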




FAQ

Q: What if AutoGrow terminates early? Do I have to start over?

A: No. Just restart the AutoGrow program using the Execute run mode. AutoGrow will look in the run_dir directory, where the files of each generation are stored, and try to pick up where it left off before being terminated.

Q: I've completed an AutoGrow run and now want to start a new run. How can I prevent AutoGrow from trying to resume an old job?

A: You must move all the run_dir/generation* directories to another location, perhaps a backup directory. If you wish to begin with a new target protein, the run_dir/receptor directory must also be moved. If AutoGrow sees that the run_dir directory contains no subdirectories that start with "generation," it will assume you are starting a new job.
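
The move described in this answer can be scripted. The sketch below is one possible way to do it in Python; the function name and the backup location are hypothetical, and the demonstration runs in a temporary directory so that no real data is touched.

```python
import shutil
import tempfile
from pathlib import Path

def archive_previous_run(run_dir, backup_dir, include_receptor=False):
    """Move generation* directories (and optionally receptor) out of run_dir."""
    run_dir, backup_dir = Path(run_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    to_move = [path for path in run_dir.glob("generation*") if path.is_dir()]
    if include_receptor and (run_dir / "receptor").is_dir():
        to_move.append(run_dir / "receptor")
    for path in to_move:
        shutil.move(str(path), str(backup_dir / path.name))
    return sorted(path.name for path in to_move)

# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "run_dir"
    for name in ("generation1", "generation2", "receptor"):
        (run_dir / name).mkdir(parents=True)
    moved = archive_previous_run(run_dir, Path(tmp) / "backup", include_receptor=True)
    remaining = sorted(path.name for path in run_dir.iterdir())

print(moved)      # ['generation1', 'generation2', 'receptor']
print(remaining)  # []
```

Use a fresh backup directory for each archived run; if a directory with the same name already exists there, shutil.move places the moved directory inside it instead of alongside it.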

Q: What's the format and organization of the AutoGrow output?

A: There are many output files. Here's a summary:

  1. bin/output_Main.txt: By default, AutoGrow writes its execution output to this file when AutoGrow jobs are launched using the Execute mode. AutoGrow never deletes this file; it should be deleted or moved before beginning each AutoGrow run, or the results of multiple runs will be included in the same file.
  2. run_dir/generation*: The AutoGrow and AutoDock files of each generation are stored in these directories.
  3. run_dir/generation*/ligand*.vina: These are the AutoDock Vina executable scripts, one for each of the ligands of each generation.
  4. run_dir/generation*/ligand*.vina.out: These are the AutoDock Vina output files, one for each of the ligands of each generation.
  5. run_dir/generation*/ligand*.vina.out.best.pdb: These are PDB files, each containing the best result extracted from the corresponding AutoDock Vina output file.
  6. run_dir/BestLigands.log: After each generation, the names of the best ligands are written to this log file, together with their corresponding AutoDock-predicted binding energies. This file can be a useful guide in navigating the results of the run_dir/generation* output files.
  7. run_dir/BestResults.#########: If you run AutoGrow in ExtractBest mode, the results will be stored in a directory that looks like this. A PDB of the receptor and PDBs of the best ligands are copied to this directory for easy visualization with a program like VMD.
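
Given the layout summarized above, the best-pose PDB files of every generation can be collected with a glob pattern. This is a small sketch, not an AutoGrow utility: the function name is hypothetical, and the file names in the demonstration are invented to match the pattern from item 5.

```python
import tempfile
from pathlib import Path

def best_pose_files(run_dir):
    """Map each generation directory name to its sorted *.vina.out.best.pdb files."""
    found = {}
    pattern = "generation*/ligand*.vina.out.best.pdb"
    for path in sorted(Path(run_dir).glob(pattern)):
        found.setdefault(path.parent.name, []).append(path.name)
    return found

# Demonstration on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("generation1/ligand1.vina.out.best.pdb",
                 "generation1/ligand2.vina.out.best.pdb",
                 "generation2/ligand1.vina.out.best.pdb",
                 "generation2/ligand1.vina.out"):
        path = Path(tmp) / "run_dir" / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    poses = best_pose_files(Path(tmp) / "run_dir")

for generation, files in poses.items():
    print(generation, files)
```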