Getting started

Main Table of Contents

VERSION 3.1
Wed 27 Feb 2002


Contents

More info can be found in the flowchart (for a quick overview) and the GMX FAQ.




Introduction

In this chapter we assume the reader is familiar with Molecular Dynamics and familiar with Unix, including the use of a text editor such as emacs or vi. We furthermore assume the software is installed properly on your system. When you see a line like

ls -l

you are supposed to type the contents of that line on your computer.

We will also assume that you know where your version of GROMACS is installed. The default location is '/usr/local/gromacs', but for local users in the Groningen MD group it is simply '~gmx'. In the default case, the binaries are located in '/usr/local/gromacs//bin', where is the architecture of your computer. For other users that is probably not the case, contact your local system administrator for more information. For instance, the GROMACS RPM packages for Linux install binaries in '/usr/local/bin', and shared data in '/usr/local/share/gromacs' (according to the Linux standard).

Setting up your environment

The first step is to make sure the GROMACS binaries are in your path. On some systems they will simply be linked to /usr/local/bin and found automatically. If this is not the case on your machine you will have to edit the login file for your shell. If you use the C shell, this file is called .cshrc or .tcshrc, and located in your home directory. Add a line like:

setenv PATH "/usr/local/gromacs/bin/:${PATH}".

Issue this command at the prompt too, or log off and on again to automatically get the environment.

Examples

Before starting the examples, you have to copy all the neccesary files, to your own directory. Chdir to the directory you want to put the examples directory. This directory (named tutor) will need about 20 MB of disk space, when it is completely filled.

cd ``your own directory''

then copy the examples:

cp -r /usr/local/gromacs/share/tutor .

(NOTE: include the ``.'')
If that directory doesn't exist you could look for the files in /usr/local/share/gromacs/tutor, or ask your local GROMACS expert.
You now have a subdirectory tutor. Move there

cd tutor

and view the contents of this directory

ls -l

If all is well you will have five subdirectories with examples with names like gmxdemo, nmr1, nmr2, speptide and water.

You are encouraged to look up the different programs and file formats in the online manual while you are browsing through the examples.




The GROMACS demo

The demo is designed to demonstrate the user-friendlyness of the GROMACS software package. The only non-friendly part is that it requires the C shell to run the script. If your shell is e.g. bash (common on Linux), first start the C shell with the command 'tcsh'. To run the demo, first move to your tutor/gmxdemo directory:

cd tutor/gmxdemo

And then start the demo script:

demo

This demo handles a complete Molecular Dynamics simulation of a peptide in water, starting from a pdb structure. When you run a Molecular Dynamics simulation with GROMACS you will encounter the following file formats:

Molecular Topology file (.top)

The molecular topology file is generated by the program pdb2gmx. pdb2gmx translates a pdb structure file of any peptide or protein to a molecular topology file. This topology file contains a complete description of all the interactions in your peptide or protein.

Molecular Structure file (.gro, .pdb)

When the pdb2gmx program is executed to generate a molecular topology, it also translates the structure file (.pdb file) to a gromos structure file (.gro file). The main difference between a pdb file and a gromos file is their format and that a .gro file can also hold velocities. However, if you do not need the velocities, you can also use a pdb file in all programs. To generate a box of solvent molecules around the peptide, the program genbox is used. First the program editconf should be used to define a box of appropriate size around the molecule. genbox dissolves a solute molecule (the peptide) into any solvent (in this case water). The output of genbox is a gromos structure file of the peptide dissolved in water. The genbox program also changes the molecular topology file (generated by pdb2gmx) to add solvent to the topology.

Molecular Dynamics parameter file (.mdp)

The Molecular Dynamics Parameter (.mdp) file contains all information about the Molecular Dynamics simulation itself e.g. time-step, number of steps, temperature, pressure etc. The easiest way of handling such a file is by adapting a sample .mdp file. A sample mdp file can be found online.

Index file (.ndx)

Sometimes you may need an index file to specify actions on groups of atoms (e.g. Temperature coupling, accelerations, freezing). Usually the default ibdex groups will be sufficient, so for this demo we will not consider the use of index files.

Run input file (.tpr)

The next step is to combine the molecular structure (.gro file), topology (.top file) MD-parameters (.mdp file) and (optionally) the index file (ndx) to generate a run input file (.tpr extension or .tpb if you don't have XDR). This file contains all information needed to start a simulation with GROMACS. The grompp program processes all input files and generates the run input .tpr file.

Trajectory file (.trr)

Once the run input file is available, we can start the simulation. The program which starts the simulation is called mdrun. The only input file of mdrun you usually need to start a run is the run input file (.tpr file). The output files of mdrun are the trajectory file (.trr file or .trj if you don't have XDR) and a logfile ( .log file).




Water

Now you are going to simulate 216 molecules of SPC water (Berendsen et al., 1981) in a rectangular box. In this example the GROMACS software team already generated most of the neccesary input files. The files needed in this example are:

Change your directory to tutor/water :

cd tutor/water

Let's first have a look at the coordinate file:

more spc216.gro

Or to view the water box graphically:

rasmol spc216.pdb

Have a look at the topology file:

more water.top

Have a look at the MD-parameters file:

more water.mdp

Since all the neccesary files are available, we are going to, preprocess all the input files to create a run input (.tpr) file. This run input file is the only input file for the MD-program mdrun.

grompp -f water.mdp -p water.top -c spc216.gro -o water.tpr

The run input file is only viewable with the program gmxdump. In this way it is possible to check if the preprocessor grompp worked well.

gmxdump -s water.tpr | more

Now it's time to start to the simulation

mdrun -s water.tpr -o water.trr -c water_out.gro -v -g water.log

After the MD simulation is finished, it is possible to view the trajectory with the ngmx program:

ngmx -f water.trr -s water.tpr

When the program starts, you must select a group of atoms to view. In our case that will be "SOL" (for solvent) or "System", which is the same for a box of water as we have. Select one and click OK. Then select Display->Animate from the menu. Use the buttons to see your water moving (note: "Play" steps one frame forward; "Fast Forward" plays; "Rewind" skips back to the beginning of the trajectory).

Calculate a radial distribution function of the Oxygen atoms. The index file oxygen.ndx contains one group with all the oxygen atoms.

g_rdf -f water.trr -n oxygen.ndx -o rdf.xvg -s water.tpr

view the output graph of g_rdf. The file is already prepared to produce a nice graph in Grace (formerly Xmgr). Just type (xmgr works fine too)

xmgrace rdf.xvg

Which shows you the radial distribution function for Oxygen-Oxygen in SPC water.




Ribonuclease S-peptide

Ribonuclease A is a digestive enzyme, secreted by the pancreas. The enzyme can be cleaved by subtilisin at a single peptide bond to yield Ribonuclease-S, a catalytically active complex of an S-peptide moiety (residues 1-20) and an S-protein moiety (residues 21-124), bound together by multiple non-covalent links (Stryer, 1988).

The S-Peptide has been studied in many ways, experimentally as well as theoretically (simulation) because of the high a-helix content in solution, which is remarkable in such a small peptide.

All the files of speptide are stored in the directory tutor/speptide. First go to this directory:cd speptide

To be able to simulate the S-Peptide we need a starting structure. This can be taken from the protein data bank. There are a number of different structure for Ribonuclease S, from one of which we have cut out the first 20 residues, and stored it in speptide.pdb. Have a look at the file

more speptide.pdb

If you have access to a molecular graphics program such as rasmol, xmol, or a commercial package, you can look at the molecule on screen, eg:

rasmol speptide.pdb

The following steps have to be taken to perform a simulation of the peptide.

  1. Convert the pdb-file speptide.pdb to a GROMACS structure file and a GROMACS topology file.
  2. Solvate the peptide in water
  3. Perform an energy minimization of the peptide in solvent
  4. Add ions if necessary (we will omit this step here)
  5. Perform a short MD run with position restraints on the peptide
  6. Perform full MD without restraints
  7. Analysis

We will describe in detail how such a simulation can be done, starting from a pdb-file.

Generate a topology file (.top) from the pdb-file (.pdb)

Generate a molecular topology and a structure file in format. This can be done with the pdb2gmx program:

pdb2gmx -f speptide.pdb -p speptide.top -o speptide.gro

Note that the correct file extension are added automatically to the filenames on the command line. You will only be asked to choose a forcefield, choose 0, but you can also have pdb2gmx ask you about protonation of residues, and about protonation of N- and C-terminus. You can type

pdb2gmx -h

to see the available options.

The pdb2gmx program has generated a topology file speptide.top and a GROMACS structure file speptide.gro and it will generate hydrogen positions. The -p and -o options with he filenames are optional; without them the files topol.top and conf.gro will be generated. Now have a look at the output from pdb2gmx,

more speptide.gro

You will see a close resemblance to the pdb file, only the layout of the file is a bit different. Also do have a look at the topology

more speptide.top

You will see a large file containing the atom types, the physical bonds between atoms, etcetera.

Solvate the peptide in a periodic box filled with water

This is done using the programs editconf and genbox. editconf will make a rectangular box with empty space of user specified size around the molecule. genbox will read the structure file and fill the box with water.

editconf -f speptide -o -dc 0.5
genbox -cp out -cs -p speptide -o b4em

The program prints some lines of user information, like the volume of the box and the number of water molecules added to your peptide. genbox also changes the topology file speptide.top to include these water molecules in the topology. This can been seen by looking at the bottom of the speptide.top file

tail speptide.top

You will see some lines like

[ system ]
; Name		Number
Protein		1
SOL		N

where N is the number of water molecules added to your system by genbox.

It is also possible to solvate a peptide in another solvent such as dimethylsulfoxide (DMSO), as has been done by Mierke & Kessler, 1991.

Generate index file (.ndx extension)

By default, most GROMACS programs generate a set of index groups to select the most common subsets of atoms from your system (e.g. Protein, Backbone, C-alpha's, Solute, etc.). For the special cases when you need to select other groups than the default ones, an index file can be generated using make_ndx. This is an interactive program that lets you manipulate molecules, residues and atom. It's use should be self-explanatory. To invoke the program you would

make_ndx -f b4em

but don't bother for now.

Perform an energy minimization of the peptide in solvent

Now we have to perform an energy minimization of the structure to remove the local strain in the peptide (due to generation of hydrogen positions) and to remove bad Van der Waals contacts (particles that are too close). This can be done with the mdrun program which is the MD and EM program. Before we can use the mdrun program however, we have to preprocess the topology file ( speptide.top), the structure file ( speptide.gro) and a special parameter file (em.mdp). Check the contents of this file

more em.mdp

Preprocessing is done with the preprocessor called grompp. This reads up the files just mentioned:

grompp -v -f em -c b4em -o em -p speptide

In this command the -v option turns on verbose mode, which gives a little bit of clarifying info on what the program is doing. We now have made a run input file (em.tpr) which serves as input for the mdrun program. Now we can do the energy minimization:

mdrun -v -s em -o em -c after_em -g emlog

In this command the -v option turns on verbose mode again. The -o option sets the filename for the trajectory file, which is not very important in energy minimizations. The -c option sets the filename of the structure file after energy minimization. This file we will subsequently use as input for the MD run. The energy minimization takes some time, the amount depending on the CPU in your computer, the load of your computer, etc. The mdrun program is automatically niced; it runs at low priority. All programs that do extensive computations are automatically run at low priority. For most modern workstations this computation should be a matter of minutes. The minimization is finished when either the minimization has converged or a fixed number of steps has been performed. Since the system consists merely of water, a quick check on the potential energy should reveal whether the minimization was successful: the potential energy of 1 SPC water molecule at 300 K is -42 kJ mole-1. Since we have about 2.55e+03 SPC molecules the potential energy should be about -1.1e+5 kJ mol-1. If the potential energy after minimization is lower than -1.1e+05 kJ mol-1 it is acceptable and the structure can be used for MD calculations. After our EM calculation the program prints something like:

STEEPEST DESCENTS converged to 2000
  Potential Energy  = -1.19482e+05

which means our criterium is met, and we can proceed to the next step.

Perform a short MD run with position restraints on the peptide

Position restrained MD means Molecular Dynamics in which a part of the system is not allowed to move far off their starting positions. To be able to run with position restraints we must add a section to the speptide.top file, describing which atoms are to be restrained. Such a section is actually generated by the pdb2gmx program. In the topology file it looks like

#ifdef POSRES
#include "posres.itp"
#endif

In the topology file we use conditional inclusion, i.e. only if a variable POSRES is set in the preprocessor do we include the file, this allows us to use the same topology file for runs with and without position restraints. In the pr.mdp parameter file for the position restraints this variable is set indeed:

define              =  -DPOSRES

At last we can generate the input for the position restrained mdrun:

grompp -f pr -o pr -c after_em -r after_em -p speptide

Now it's MDrun time:

mdrun -v -s pr -e pr -o pr -c after_pr -g prlog >& pr.job &

This run is started in the background (it will take a while), you can watch how long it will take by typing:

tail -f pr.job

With the Ctrl-C key you can kill the tail command. A good check of your simulation is to see whether density and potential energies have converged:

g_energy -f pr -o out -w

The g_energy program will prompt you to select a number of energy terms from a list. For potential energy type:

9 0

If you have the xmgr program installed it will automatically pop up on your screen with the energy plot. You can do the same for the density and other energy terms, such as Solvent-Protein interactions.

Perform full MD without restraints

Full MD is very similar to the restrained MD as far as GROMACS is concerned. Check out the full.mdp for details.

grompp -v -f full -o full -c after_pr -p speptide

Then we can start mdrunning

mdrun -v -s full -e full -o full -c after_full -g flog >& full.job &

You should do similar convergence checks (and more!) as for the position restrained simulation.

Analysis

We will not describe analysis in detail, because most analysis tools are described in the Analysis chapter of the printed manual. We just list a few of the possibilities within GROMACS. By now you should be able to start programs yourself.

You have been witness of a full MD simulation starting from a pdb-file. It's that easy, but then again, maybe it was not that easy. The example presented here is a real example, this is how a production run should be performed, the complexity is in the process itself and not in the software (at least, that's our opinion).




Your own System

For proteins in water (or other solvent) the route is described above. For other systemd (eg. pure liquids or mixtures) one needs:




More Info

More info can be found in the flowchart (for a quick overview) and the GMX FAQ.




References

Berendsen, H.J.C., Postma, J.P.M., van Gunsteren, W.F., Hermans, J. (1981)
Intermolecular Forces, chapter Interaction models for water in relation to protein hydration, pp 331-342. Dordrecht: D. Reidel Publishing Company Dordrecht

Kabsch, W., Sander, C. (1983).
Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577--2637.

Mierke, D.F., Kessler, H. (1991).
Molecular dynamics with dimethyl sulfoxide as a solvent. Conformation of a cyclic hexapeptide. J. Am. Chem. Soc. 113, 9446.

Stryer, L. (1988).
Biochemistry vol. 1, p. 211. New York: Freeman, 3 edition.



http://www.gromacs.org