GROMACS FAQ.

GROMACS Frequently Answered Questions.

This FAQ is written for version 2.0, so some of the solutions may not apply to earlier GROMACS versions.

Download & Installing:
  I want GROMACS.
  I might want GROMACS, but I don't know if my system is supported.
  I want to compile and install GROMACS.
  I want to run GROMACS multi-processor.
  I want to run GROMACS faster.
  I want to simulate a very large system using GROMACS.

Preparation:
  I have a PDB file.
  I have a protein with multiple subunits.
  I want to convert my structure from a .gro, .tpr or trajectory file to a .pdb file.
  pdb2gmx is complaining about long bonds and/or missing atoms.

Simulation:
  mdrun says my run will end somewhere in (or after) the year 2000.
  mdrun says: "1-4 (#,#) interaction not within cut-off".
  My run seems to be running, but no output is produced.
  I get very strange temperatures.
  I get very strange temperatures during my simulated annealing run.
  My run crashed and I want to continue it.

Analysis:
  I have a PDB file with multiple entries which I want to analyze.
  I want to fit two structures which do not have the same number/sequence of atoms.
  I get tired of having to select the same index group over and over again.
  I want to do an analysis GROMACS doesn't have a program for.

Other:
  My problem is none of the above or none of the solutions below seem to work.

I want GROMACS. Now what?

Go to the software page and read it. Then fill in the license agreement and send it to us. For academic institutions the license is free. We will give you a username and password so you can download GROMACS.

(More info can be found in the download page.)

I might want GROMACS, but I don't know if my system is supported. Now what?

GROMACS is short for GROMACS Runs On Most Of All Computer Systems. It should work without problems on all Unix dialects; SUN (SunOS), SGI (IRIX) and Linux in particular are known to be virtually problem-free. It is even possible to use GROMACS on an MS Windows system (preferably 95/98 or NT); you will need Cygwin for that. The basic requirements are a version of GNU make (usually called gmake) and a C/C++ compiler. For better performance a good FORTRAN compiler should be available too.

Systems for which a default Makefile is present are listed below. For all these systems GROMACS normally compiles without any problems, and can also run in parallel (providing a good version of MPI or PVM is present).

alx: Alpha/Linux
cm5: CM5
dec: Alpha/Dec (Tru64 unix)
hp: HP/PaRISC
lnx: Linux/x86
pgi: Linux/x86 w Portland Compiler
s10: SGI/R10000/R12000
s5k: SGI/R5000
s8k: SGI/R8000
sgi: SGI/R4000
smp: IBM SMP nodes
sol: Sun/SOLARIS
sp2: IBM/SP2
spp: Convex SPP
sun: Sun/SunOS
ult: Sun Ultrasparc
win: Windows9x/NT
ymp: Cray/YMP

(More info can be found in the download page.)

I want to compile and install GROMACS. Now what?

Check out the download page for instructions on installing and compiling GROMACS and some hints for when things go wrong. If all else fails, please don't hesitate to contact us. Also if you have experienced a problem and found a solution, please let us know so other people won't have to do the work you already did all over again.

I want to run GROMACS multi-processor. Now what?

You will need an up to date version of MPI or PVM. Set USE_MPI or USE_PVM to yes in your Makefile.$GMXCPU and type:

cd $GMXHOME/src
gmake clean
gmake

On most architectures you will now have new binaries in the $GMXHOME/bin/$GMXCPU directory. The mdrun in that directory will run both in parallel and non-parallel. On some architectures (e.g. IBM SP) you will have a parallel mdrun binary in $GMXHOME/bin/$GMXCPU_mpi (or _pvm).

To start a run on multiple processors you need two extra things: specify the -np <nr> option with grompp, and start mdrun with the appropriate options. For instance, on an SGI with the newest version of MPI:

grompp -np <nr> -o topol.tpr <other options>
mpirun -np <nr> <dir>/mdrun -s topol.tpr <other options>

NOTE: you have to specify the full path of the MPI mdrun binary. The newest version of MPI does not require the -np option after mdrun.

Newer MPI versions on SGIs officially cannot run in the background. To prevent mdrun from hanging when you log out, use the following csh script:

#!/bin/csh -f
mpirun -np <nr> <dir>/mdrun -s topol.tpr >& log &
exit

I want to run GROMACS faster. Now what?

First rest assured that GROMACS is amongst the fastest MD programs currently available. We are often a factor of 2 faster than some of the other leading packages.

Ultimately, one would want to simulate seconds of whole ensembles of proteins. Since that is not nearly possible (we only come 9 orders of magnitude short in time and 18 orders of magnitude short in number of particles), we suggest the following things to at least get the maximum performance currently available:

Turn off double precision by selecting a non-double precision CPU type when running gmx_conf. This does not have a negative effect on the outcome of your simulations: the accuracy is normally not restricted by the precision of the floating point notation, but by the numerical integration scheme itself. This is a free performance gain of up to 50%. (On our SGI Power Challenge, the double precision version runs some 30% slower than single precision.)

Use Fortran inner loops (providing you have a good Fortran compiler available for your system). If you have a DEC alpha running Linux, get the Compaq compilers (see our benchmark page).
If your system consists partly of water, use water specific non-bonded optimizations (set solvent_optimization = SOL in your .mdp file). This will speed up your simulation by about a factor of 2, depending on your system and your machine type.
Run GROMACS in parallel on multiple processors (see ``I want to run GROMACS multi-processor''). Although this increases overall performance, efficiency may be lost due to sub-linear scaling of the parallel system, e.g. 2 processors perform at 85% of twice the performance of one processor. This gets progressively worse for more processors; for more than 8 processors efficiency drops to about 60%. Because of this, it is often better to run multiple runs on one or a few processors. Check out our benchmark page for some real-life examples on both single- and multi-processor systems.
Preliminary tests show that PME and PPPM are somewhat faster than simulating with a twin range cut-off of 1.0/1.7 nm on some systems. Unfortunately PME and PPPM do not scale as well as cut-off electrostatics when running in parallel.
We have a new option -dummies for pdb2gmx in GROMACS 2.0 which controls the constraining of hydrogen atoms. This eliminates the highest frequency motions in your system, enabling you to increase the timestep to about 4 fs without loss of accuracy, or even up to 7 fs with negligible loss of accuracy!
If your system has relatively slow disk I/O, and/or you write frames and energies out very often, and/or you have a very large system, the performance might be limited by disk access. In that case, you might consider writing fewer frames to your trajectories (.xtc and especially .trr or .trj) and energy file (.ene or .edr).
And last, if you do not yet have the fastest computer available, get it! Or if you do have a very fast computer already, get another one or two.

See our benchmark page for more details on GROMACS performance. It is also worth checking if you are still deciding what type of machine to buy for your simulations.
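
The parallel scaling figures quoted above (85% for 2 processors, about 60% beyond 8) can be reproduced for your own machine from two timings of the same job. The following is a minimal sketch in C, not part of GROMACS; the function names are made up for illustration, and the comparison with independent runs assumes those runs do not compete for memory or disk.

```c
/* Speedup of a parallel run over a single-processor run of the same job,
   computed from two measured wall-clock times (same unit for both). */
double speedup(double t_single, double t_parallel)
{
    return t_single / t_parallel;
}

/* Parallel efficiency: the fraction of ideal (linear) speedup that is reached.
   1.0 means perfect scaling. */
double parallel_efficiency(double t_single, double t_parallel, int nproc)
{
    return speedup(t_single, t_parallel) / (double)nproc;
}

/* How much more total simulated time nproc independent single-processor runs
   deliver compared to one nproc-processor run at the given efficiency.
   Assumption: the independent runs do not slow each other down. */
double independent_runs_gain(double efficiency)
{
    return 1.0 / efficiency;
}
```

For example, a job that takes 170 minutes on one processor and 100 minutes on two has an efficiency of 0.85. At 60% efficiency, separate single-processor runs would deliver about 1.7 times as much total simulated time as one parallel run on the same processors.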

I want to simulate a very large system using GROMACS. Now what?

Just do it. The limit on the number of atoms in the calculation is determined only by the amount of memory available in your computer system. As an indication: a system of 12000 atoms takes about 10 Mb of memory, and 6000 atoms about 5.5 Mb (on an SGI O2 machine with GROMACS 1.6), which comes down to roughly 900 bytes of memory per atom in your system (your mileage may vary). Because we initially developed GROMACS to run on our home-built parallel machine, with only 8 Mb of memory per processor, the code is quite well optimized for memory use. To get an indication of how GROMACS performance scales with system size, have a look at the benchmark page.
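
To get a feeling for these numbers, the per-atom figure can be turned into a rough estimate for your own system. The sketch below is only illustrative arithmetic, not GROMACS code: the function names are hypothetical, 1 Mb is assumed to be 1024*1024 bytes, and the result ignores everything that does not scale with the number of atoms.

```c
/* Bytes per atom implied by a measured memory footprint.
   Assumption: 1 Mb = 1024 * 1024 bytes. */
double bytes_per_atom(double megabytes, long natoms)
{
    return megabytes * 1024.0 * 1024.0 / (double)natoms;
}

/* Rough memory estimate in Mb for a system of natoms atoms,
   given a per-atom cost in bytes (e.g. about 900, see above). */
double estimate_memory_mb(long natoms, double per_atom_bytes)
{
    return (double)natoms * per_atom_bytes / (1024.0 * 1024.0);
}
```

Under these assumptions the two measurements above work out to about 870 and 960 bytes per atom, and a system of 100000 atoms at 900 bytes per atom would need roughly 86 Mb.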

I have a PDB file. Now what?

Look at the flowchart for a quick overview. Start where it says "eiwit.pdb" (this is somewhere at the top). More detailed info can be found in the Getting Started section; you can probably start where it says "Ribonuclease S-Peptide".

I have a protein with multiple subunits. Now what?

pdb2gmx can automatically process multimeric proteins, but won't be able to make inter-subunit cystine bridges. pdb2gmx will only recognize the units as different chains if they have different chain identifiers.

I want to convert my structure from a .gro, .tpr or trajectory file to a .pdb file. Now what?

Any generic structure file, for instance .gro, .pdb or .tpr, can be converted to .pdb with editconf. You can view a .pdb file with, for instance, rasmol. Two generic structure files can be fitted with g_confrms; the two superimposed structures can be written to a .pdb file. Any generic trajectory format can be converted with trjconv. You can dump one frame with trjconv -dump, or write a .pdb with multiple frames using trjconv -op -app. If multiple structures in a .pdb are separated by ENDMDL keywords, you should use rasmol -nmrpdb to view them.

pdb2gmx is complaining about long bonds and/or missing atoms. Now what?

There are probably atoms missing earlier in the .pdb file, which confuses pdb2gmx. Check the screen output of pdb2gmx, as it will tell you which ones are missing. Then either add the atoms to your .pdb file (energy minimization will put them in the right place), or fix the side chain with e.g. the WhatIF program.

mdrun says my run will end somewhere in (or after) the year 2000. Now what?

This could mean that you have started a very long simulation, or that you are using a very slow computer. You might want to check whether your simulation really should be as long as it is, or get a faster computer. Either way, there is nothing to worry about because GROMACS is fully Year 2000 compliant. It might also mean that your simulation somehow caused Not-a-Numbers to be generated (check the energies in your logfile). In that case look at ``My run seems to be running, but no output is produced''.

mdrun says: "1-4 (#,#) interaction not within cut-off". Now what?

Most importantly: do not increase your cut-off! This error actually indicates that some atoms have gotten a very large velocity, which usually means that (part of) your molecule(s) is (are) exploding. If you are using LINCS for constraints, you probably also already got a number of LINCS warnings. When using SHAKE this will give rise to a SHAKE error, which halts your simulation before the "1-4 not within cutoff" error can appear.

There can be a number of reasons for the large velocities in your system. If it happens at the beginning of the simulation, your system might not be equilibrated well enough (e.g. it contains some bad contacts). Try a(nother) round of energy minimization to fix this. Otherwise you might have a very high temperature (which might be caused by incorrect use of simulated annealing), and/or too large a timestep. Experiment with these parameters until the error stops occurring.

My run seems to be running, but no output is produced. Now what?

Your simulation might simply be (very) slow, and since output is buffered, it can take quite some time for output to appear in the respective files. If you are trying to fix some problems and you want to get output as fast as possible, you can disable output buffering by setting the environment variable LOG_BUFS to 0 (setenv LOG_BUFS 0). Use unsetenv LOG_BUFS to turn buffering on again.
Something might be going wrong in your simulation, causing e.g. not-a-numbers (NaN) to be generated (these are the result of e.g. division by zero). Subsequent calculations with NaNs will generate floating point exceptions which slow everything down by orders of magnitude. On an SGI system this will usually result in a large percentage of CPU time being devoted to 'system' (check it with osview, or for a multi-processor machine with top and osview).
You might have all nst* parameters (see your .mdp file) set to 0; this will suppress most output.
Your disk might be full. Eventually this will lead to mdrun crashing, but since output is buffered, it might take a while for mdrun to realize it can't write.
If you are running with MPI on an SGI you need a script, see ``I want to run GROMACS multi-processor''.

I get very strange temperatures. Now what?

If you are using simulated annealing (see .mdp options), check out ``I get very strange temperatures during my simulated annealing run''. If you are not using simulated annealing, you might have very close contacts or too large a time step. This causes inaccurate integration, which will usually result in a large positive temperature drift. Try some more energy minimization to get rid of the close contacts, or if that still doesn't help, try a short equilibration run with a small(er) time step.

I get very strange temperatures during my simulated annealing run. Now what?

This probably means that the initial time in your simulation is not zero ps. The temperature during simulated annealing is controlled by two points on a linear curve: temperature will be zero K at zero_temp_time ps and it will be ref_t K at zero ps. This means that if you do not start your simulation at zero ps, the simulation will not start at ref_t K. So if your zero_temp_time is positive and the starting time for your simulation is larger than that, the temperature in your simulation will be zero. If your zero_temp_time is negative, temperature will rise indefinitely with increasing time. The two schematic graphs below clarify this.

[Figure: system temperature vs. time in simulated annealing. Left panel: zero_temp_time < 0, heating to infinity. Right panel: zero_temp_time > 0, cooling to zero.]
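
The linear scheme described above can also be written down as a small function, which makes it easy to check what temperature to expect at a given starting time. This is a sketch based only on the description in this answer, not the actual mdrun code; the function name is hypothetical, and zero_temp_time is assumed to be nonzero.

```c
/* Reference temperature (K) at time t (ps) for linear simulated annealing:
   ref_t at t = 0, zero at t = zero_temp_time, and never below zero.
   Assumption: zero_temp_time != 0. */
double annealing_temperature(double ref_t, double zero_temp_time, double t)
{
    double temp = ref_t * (1.0 - t / zero_temp_time);

    /* a negative result means we are past zero_temp_time: temperature stays zero */
    return temp > 0.0 ? temp : 0.0;
}
```

With ref_t = 300 K and zero_temp_time = 100 ps this gives 300 K at 0 ps, 150 K at 50 ps and 0 K for any start time beyond 100 ps. With zero_temp_time = -100 ps the temperature at 100 ps is already 600 K and keeps rising.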

My run crashed and I want to continue it. Now what?

The following answer is only valid for crashes which are not due to mdrun, like a system crash, a full disk, or a kill by the queuing system. Otherwise use grompp.

To really continue a simulation as if nothing had happened, you will need coordinates and velocities in full precision (i.e. .trj format). .xtc trajectories are in reduced precision (only 3 decimal places) and do not contain velocity information at all. Feed the .trj trajectory and your original .tpr file to tpbconv to obtain a new .tpr file. Be sure to specify the next-to-last frame from your .trj file, since the very last frame is likely to be corrupted due to the crash. With the .tpr file tpbconv produces you can restart your simulation.
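
To see what "3 decimal places" means for a coordinate, the rounding can be mimicked as follows. This sketch only illustrates the loss of precision; it is not the actual .xtc compression code, and the function name and the precision parameter (1000 for 3 decimal places) are chosen here for illustration.

```c
/* Round a coordinate (in nm) to 1/precision nm, e.g. precision = 1000
   keeps 3 decimal places. Rounds half away from zero. */
double quantize_coordinate(double x, double precision)
{
    double scaled = x * precision;
    long   q = (long)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);

    return (double)q / precision;
}
```

A coordinate of 1.23456 nm comes back as 1.235 nm, so anything smaller than about 0.0005 nm is lost.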

After the continuation run is finished, you will have your simulation split up in separate files, which you will probably want to combine. This can be done as follows (the same command works for xtc-files):

trjcat -o whole.trr part1.trr part2.trr part3.trr

The energy files can be concatenated in a similar manner:

eneconv -o whole.edr part1.edr part2.edr part3.edr

Since tpbconv sets the time in the continuation runs, the files are automatically sorted and overlapping frames removed. If you have a mix of runs continued with tpbconv and grompp you might have to set the times yourself (see the manual pages for details).

It is of course possible to start a simulation from the coordinates in your .xtc file, but in that case new velocities will have to be generated, resulting in a 'kink' in the simulation. To prevent this you should write coordinates and velocities to a .trj file during your simulations. Do this by setting nstxout and nstvout in your .mdp file. You don't need these frames very often (every 10 ps or so), but remember that when mdrun crashes, everything calculated after the last frame in the .trj file will have to be recalculated for a proper continuation.
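
The values for nstxout and nstvout are given in steps, so the 10 ps mentioned above has to be converted using your time step. A minimal sketch of that conversion, with a hypothetical function name; the time steps in the example are assumptions, use the one from your own .mdp file:

```c
/* Number of MD steps between saved frames for a desired output interval.
   Both arguments are in ps; the result is rounded to the nearest step. */
long steps_per_frame(double interval_ps, double dt_ps)
{
    return (long)(interval_ps / dt_ps + 0.5);
}
```

With a time step of 0.002 ps, one frame every 10 ps corresponds to 5000 steps; with 0.004 ps it is 2500 steps.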

I have a PDB file with multiple entries which I want to analyze. Now what?

NOTE: this is not about multi-subunit proteins.

I assume your .pdb file is called "eiwit.pdb". Then this is what you would do:

pdb2gmx -f eiwit.pdb -reth -ter -n

-reth lets pdb2gmx keep all hydrogens which are present in your input file. It will also not add any missing hydrogens, so your molecules should be complete. -ter will cause pdb2gmx to ask for termini types for which you must select 'none' for both C- and N-terminus. -n tells pdb2gmx to generate a .ndx file with the atoms reordered to the GROMACS standard. pdb2gmx now generates a topology file (topol.top) which exactly corresponds with the molecule(s) in your input file. It also writes a coordinate file (conf.gro).

The next step is:

trjconv -f eiwit.pdb -o eiwit.xtc -n clean -timestep 1 -box 10 -center

Yes, -f eiwit.pdb works because a .pdb is also a trajectory format in GROMACS. -o eiwit.xtc writes the output in .xtc format. -n clean tells trjconv to use the clean.ndx generated by pdb2gmx, so the atom ordering in the output (.xtc) file will be according to GROMACS standards. -timestep 1 sets the timestep between output frames to one, so the structures from the .pdb file get numbered sequentially. By default, ENDMDL markers in the .pdb file are seen as end-of-frame; the option -ter, which is not used in the command above, causes TER markers to be seen as end-of-frame instead. If you are not sure what is in your eiwit.pdb, TER is a good guess, but you should check. If you have TER instead of ENDMDL, add -ter. -box 10 sets a default box size in the output .xtc trajectory (since no box is stored in a .pdb file). The size is in nm and should be larger than your molecule size. -center resets the geometrical center of each of your structures to the center of the box (the one you specify with -box). trjconv will generate an .xtc trajectory file with all the coordinates from your eiwit.pdb.

A not very exciting but obligatory step is:

grompp -f grompp.mdp -c conf.gro -p topol.top

This will generate a run input file (topol.tpr) from the topol.top and conf.gro you generated with pdb2gmx. A default grompp.mdp is available. You can probably use it 'as is', but you might want or need to modify some things. In any case you are encouraged to review the description of the numerous options in the .mdp file.

Now, suppose you want to calculate all cross-rmsd values for all structures. You will do:

g_rms -f eiwit.xtc -s topol.tpr -m

-f eiwit.xtc and -s topol.tpr are self-explanatory. -m tells g_rms to output an RMSD matrix in .xpm format, which can be directly viewed with for example xv.

Of course there are many more analysis tools available, for example ngmx, a trajectory viewer. A list of all tools is available in the online manual.

I want to fit two structures which do not have the same number/sequence of atoms. Now what?

Just type:

g_confrms -f1 file1.xxx -f2 file2.xxx

g_confrms accepts generic structure formats, for instance .pdb, .gro or .tpr. The program will ask you to select subgroups of both structures for the (non mass weighted) LSQ fit. These subgroups must have the same number of atoms, but the two structures themselves do not need to have the same number of atoms. For example, two proteins with the same number of residues but not the same type of residues can be fitted on their C-alpha atoms. You will be warned when the atom names in the fit groups do not match, but the program will go on. Option -o gives a .gro file of the second structure fitted on the first. Option -op gives a .gro file of the two structures fitted on top of each other.

I get tired of having to select the same index group over and over again. Now what?

Use make_ndx to create an .ndx file with only one group in it. This is done by typing 'keep #' in make_ndx, where '#' stands for the one group you want to have. Name the file index.ndx (which is the default filename for index files) and specify the option -n with your favorite GROMACS analysis tool. Now this one group will get selected automatically every time an index group is needed.

I want to do an analysis GROMACS doesn't have a program for. Now what?

Try the following minimal C program:

#include "sysstuff.h"
#include "macros.h"
#include "statutil.h"
#include "smalloc.h"
#include "copyrite.h"
#include "gstat.h"

int main(int argc,char *argv[])
{
  t_topology *top;
  int      status;
  int      natoms;
  real     t;
  rvec     *x;
  matrix   box;

  static char *desc[] = {
    "In the future, this program might perform your analysis."
  };
  
  static char *bugs[] = {
    "This program does not do anything."
  };
  
  static bool bPBC=TRUE;
  static int  nr=1;
  static real frac=0.5;
  t_pargs pa[] = {
    { "-pbc", FALSE, etBOOL, &bPBC,  "Make molecules whole again" },
    { "-nr",  FALSE, etINT,  &nr,    "Number" },
    { "-fr",  FALSE, etREAL, &frac,  "Fraction" }
  };

  t_filenm fnm[] = {
    { efTPX, NULL,  NULL,  ffREAD },
    { efTRX, NULL,  NULL,  ffREAD }
  };
#define NFILE asize(fnm)

  CopyRight(stderr,argv[0]);
  parse_common_args(&argc,argv,PCA_CAN_TIME,TRUE,
                    NFILE,fnm,asize(pa),pa,asize(desc),desc,asize(bugs),bugs);
  
  /* read topology (atomnames, bonds etc.):                 */
  /* (if you don't need those you can omit this step)       */
  top=read_top(ftp2fn(efTPX,NFILE,fnm));

  /* initialize reading trajectory:                         */
  natoms=read_first_x(&status,ftp2fn(efTRX,NFILE,fnm),&t,&x,box);

  /* check topology against trajectory:                     */
  /* your trajectory can be smaller than your topology,     */
  /* if you omitted molecules at the end of your topology,  */
  /* if you did not read a topology, omit this also         */
  if (natoms > top->atoms.nr)
    fatal_error(0,"Topology (%d atoms) does not match trajectory (%d atoms)",
                top->atoms.nr,natoms);

  /* start analysis of trajectory: one pass of the loop per frame */
  do {
    if (bPBC) {
      /* make molecules whole again */
      rm_pbc(top->idef,natoms,box,x,x);
    }

    /**************************/
    /*                        */
    /* PUT YOUR ANALYSIS HERE */
    /*                        */
    /**************************/

  } while (read_next_x(status,&t,natoms,x,box));

  /* clean up */
  sfree(x);
  close_trj(status);

  return 0;
}

Note that this has to be edited before it will do anything useful. You have to be sure all the GROMACS include files and libraries are available when compiling this. The easiest way to do this is to make a new file (e.g. g_myanal.c) in the src/tools directory and also add this to the Makefile in src/tools (look at how other programs are listed there and copy that). After that typing

% gmake g_myanal

should do the trick. If you don't want to clutter up the src/tools directory, you can also make a new directory in src (e.g. local or special), copy the src/tools/Makefile there and modify it. If you want more fancy options in your program, try modifying one of the smaller GROMACS analysis tools. g_com.c is a good one to start from although you might want to take one which already does something similar to what you want to do.

Finally, if you wrote an analysis tool which, in your opinion, adds something that is really missing in GROMACS, please tell us so we, and all other GROMACS users, can also benefit from it.

My problem is none of the above or none of the solutions seem to work. Now what?

Look at the download page to see if your problem (and/or a possible solution) is mentioned there. Be sure to check the problems section on the download page. Also try "Getting Started" where a guided tour of GROMACS is provided. A quick glance at the flowchart will tell you if you missed any essential steps in setting up a run. Also checking your .mdp file against our sample .mdp file and the mdp options list will solve a number of potential problems. In general it never hurts to read the manual pages of all the GROMACS programs you (tried to) use. If all this still leaves you with any unanswered questions, please don't hesitate to contact us.

In summary, the pages to check:

The download page (esp. the problems section).
The "Getting Started" guided tour
The flowchart
Our sample .mdp file
The mdp options list
And finally the manual pages
Us
