5 Frequent errors during execution


5.1 What do errors like “forrtl: severe (59): list-directed I/O syntax error, unit 5, file stdin” mean?

“forrtl” = Fortran run-time library (so it is a Fortran error)
“list-directed I/O syntax error” = there was an error while reading or writing (see below)
“unit 5” = Fortran unit (5 is typically used to read input data)
“stdin” = standard input (i.e. terminal input, or redirection)
Typical case: you are reading data from the terminal or from a file like this: “code < data-file”, and there is an error in what you typed or in “data-file”. Sometimes the error is not easy to spot (see also answer 5.9 below), but input parsing errors are almost invariably due to an error or an unexpected character in the input. An exception might be the case of parallel execution, see Item 4.4 in this FAQ.

5.2 Why is my job crashing with “segmentation fault”?

Possible reasons: too much memory requested; executable or mathematical libraries compiled for different hardware; some incompatibility between compiler and mathematical libraries; flaky hardware; bug in compiler or in mathematical libraries. The latter two are typically not reproducible on different architectures or compilers; code bugs may sometimes be elusive, but typically yield a more reproducible pattern of problems. Segmentation faults in tests and examples almost invariably point to a problem in the compiler, in the mathematical libraries, or in their interactions.

Mysterious, unpredictable, erratic errors in parallel execution almost always come from bugs in the compiler and/or in the MPI libraries, and sometimes even from flaky hardware. Sorry, not our fault.

5.3 The code stops with a mysterious error in IOTK

IOTK is a toolkit that reads/writes XML files. There are frequent reports (especially when compiling with gfortran and MKL libraries) of mysterious errors with IOTK not finding some variable in the XML data file. If this error has no obvious explanation (e.g. the file is properly written and read, the searched variable is present, etc.) and if it appears to be erratic or irreproducible (e.g. it occurs only with version X of compiler Y), it is almost certainly due to a compiler bug. Try to reduce the optimization level, or use a different compiler. If you paid real money for your compiler, complain to the vendor.

5.4 The code stops with an “error in davcio”

davcio is a routine that reads from/writes to disk. The error number is what the I/O operation returns, so it means little more than “there was an error”. Possible reasons: the disk is full; outdir is not writable for any reason; you ran post-processing codes on a number of processors/pools different from the one used to produce the pw.x data (and did not set the variable wf_collect); you made a mess with your data files and directories; your data files are corrupted; you were running more than one instance of pw.x in the same temporary directory with the same file names.
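
The first two causes (full disk, unwritable outdir) can be ruled out before launching a job. What follows is a minimal sketch, not part of the distribution: the function name check_outdir and the 1 GiB default threshold are assumptions made for illustration, and the check says nothing about the other causes listed above.

```python
import os
import shutil


def check_outdir(outdir, min_free_bytes=1024**3):
    """Return a list of human-readable problems found with `outdir`.

    `min_free_bytes` is an arbitrary illustrative threshold (1 GiB by default).
    """
    problems = []
    if not os.path.isdir(outdir):
        problems.append("not a directory: " + outdir)
        return problems
    # Creating files in a directory needs both write and search permission.
    if not os.access(outdir, os.W_OK | os.X_OK):
        problems.append("directory is not writable: " + outdir)
    # Free space on the file system that holds outdir.
    free = shutil.disk_usage(outdir).free
    if free < min_free_bytes:
        problems.append("only %d bytes free on the disk holding %s" % (free, outdir))
    return problems
```

Calling check_outdir with the same path given as outdir in the input returns an empty list when neither problem is detected.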

5.5 Why is the code saying “Wrong atomic coordinates”?

Because they are: two or more atoms in the list of atoms have overlapping, or at any rate too close, positions. Can’t you see why? Look more carefully (or use a molecular viewer like XCrySDen) and remember that the code checks periodic images as well.
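
A pair that looks well separated inside the cell may be very close through a periodic image. The sketch below is an illustration only, not the check the code itself performs: the function name closest_pair is hypothetical, positions are assumed to be in crystal (fractional) coordinates, and only the 27 neighbouring cells are searched, which is adequate for reasonably shaped cells but not for very skewed ones.

```python
import itertools
import math


def closest_pair(cell, frac_positions):
    """Return (distance, i, j) for the closest pair of atoms, images included.

    cell: three lattice vectors (one per row), in any length unit.
    frac_positions: atomic positions in crystal (fractional) coordinates.
    """
    shifts = list(itertools.product((-1, 0, 1), repeat=3))
    best = (math.inf, -1, -1)
    n = len(frac_positions)
    for i in range(n):
        for j in range(i, n):
            for shift in shifts:
                if i == j and shift == (0, 0, 0):
                    continue  # an atom is not compared with itself
                # Fractional separation, including the lattice translation.
                d = [frac_positions[j][k] + shift[k] - frac_positions[i][k]
                     for k in range(3)]
                # Convert to Cartesian: r = d0*a1 + d1*a2 + d2*a3.
                r = [sum(d[k] * cell[k][c] for k in range(3)) for c in range(3)]
                dist = math.sqrt(sum(x * x for x in r))
                if dist < best[0]:
                    best = (dist, i, j)
    return best
```

For a cubic cell of side 4 with atoms at fractional positions (0.02, 0, 0) and (0.98, 0, 0), the in-cell separation is 3.84 but the closest image is only 0.16 away, which is the distance the function reports.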

5.6 The code stops with a “wrong charge” error

Typically, you are treating a metallic system as if it were insulating. Use Gaussian smearing.

5.7 The code stops with an “error in cdiaghg” or “in rdiaghg”

This is a tough case. It signals that the Hamiltonian, or the overlap matrix, calculated in the subspace of occupied + correction states (used in iterative diagonalization), is singular. This should however never happen, unless: 1) the atomic positions are seriously wrong (e.g. too close), or 2) the pseudopotentials are bad, or not so good. The latter case typically happens with Ultrasoft PP. When the error is erratic and irreproducible on other machines, it may be related to mathematical libraries of questionable accuracy. If you are out of ideas, try option “diagonalization='cg'”.

5.8 The code stops with an error in routine “scale_h”

During a variable-cell structural optimization (“vc-relax”) you may get the following error:
Error in routine scale_h (1):
Not enough space allocated for radial FFT: try restarting with a larger cell_factor.

This is a consequence of a starting unit cell that is too small. If the cell expands too much, the number of plane waves and of G-vectors increases and may eventually exceed the length of the arrays allocated at the beginning. Increase the value of the optional variable “cell_factor”, or restart from a larger cell.

5.9 The code stops with “error reading namelist XXX”

Misspelled variable in namelist XXX, or properly-spelled variable set to an illegal value (e.g. an integer variable to a real value). Also: unexpected characters, such as DOS CR-LF characters, ‘curly’ quotes instead of ‘straight’ quotes, in the file, or comments introduced by “#” (only “!” is allowed inside namelists). Also: if the input file is empty, you get an error while reading the first namelist, typically “&control”. For the parallel case, see items 4.3 and 4.4 of these FAQs.
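
Several of these problems are invisible in an editor but easy to detect mechanically. The following is a rough sketch under stated assumptions: scan_input is a hypothetical helper, not a tool shipped with the code; it assumes a namelist starts on a line beginning with “&” and ends on a line containing only “/”; and it flags any “#” inside a namelist, even one that legitimately sits inside a quoted string. It does not detect misspelled variables or illegal values.

```python
def scan_input(text):
    """Return a list of (line_number, message) for suspicious content in `text`."""
    if not text.strip():
        return [(0, "input is empty")]
    findings = []
    in_namelist = False
    for number, line in enumerate(text.split("\n"), start=1):
        stripped = line.strip()
        if line.endswith("\r"):
            findings.append((number, "DOS CR-LF line ending"))
        # Typographic single and double quotes (U+2018, U+2019, U+201C, U+201D).
        if any(ch in line for ch in "\u2018\u2019\u201c\u201d"):
            findings.append((number, "curly quote"))
        if stripped.startswith("&"):
            in_namelist = True
        elif stripped == "/":
            in_namelist = False
        elif in_namelist and "#" in line:
            findings.append((number, "'#' inside a namelist"))
    return findings
```

When reading the file from disk, open it with newline="" (for example open(path, newline="", encoding="utf-8")), otherwise Python translates CR-LF into LF and the line-ending check never fires.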