2. SHELXL - Structure Refinement
SHELXL is a program for the refinement of crystal
structures from diffraction data, and is primarily intended for single
crystal X-ray data of small moiety structures, though it can also be used
for refinement of macromolecules against data to about 2.5 Å or better.
It uses a conventional structure factor summation, so it is much slower
(but a little more accurate) than standard FFT-based macromolecular programs.
SHELXL is intended to be easy to install and use. It is very general, and
is valid for all space groups and types of structure. Polar axis restraints
and special position constraints are generated automatically. The program
can handle twinning, complex disorder, absolute structure determination,
CIF and PDB output, and provides a large variety of restraints and constraints
for the control of difficult refinements. An interface program SHELXPRO
enables macromolecular refinement results to be displayed in the form of
Postscript plots, and generates map and other files for communication with
widely used macromolecular programs. An auxiliary program CIFTAB is useful
for tabulating the refinement results via the CIF output file for small
molecules.
The reflection data file (name.hkl) contains h, k, l, F² and σ(F²)
in standard SHELX format (section 2.3); the program merges equivalents
and eliminates systematic absences; the order of the reflections in this
file is unimportant. Crystal data, refinement instructions and atom coordinates
are all input as the file name.ins; further files may be specified
as 'include files' in the .ins file, e.g. for standard restraints,
but this is not essential. Instructions appear in the .ins file
as four-letter keywords followed by atom names, numbers, etc. in free format;
examples are given in the following chapters. There are sensible default
values for almost all numerical parameters. SHELXL is normally run on any
computer system by means of the command:
shelxl name
where name defines the first component of the filename for all files which correspond to a particular crystal structure. On some systems, name may not be longer than 8 characters. Batch operation will normally require the use of a short batch file containing the above command etc. The executable program must be accessible via the 'PATH' (or equivalent mechanism). No environment variables or extra files are required.
A brief summary of the progress of the structure
refinement appears on the console, and a full listing is written to a file
name.lst,
which can be printed or examined with a text editor. After each refinement
cycle a file name.res is (re)written; it is similar to name.ins,
but has updated values for all refined parameters. It may be copied or
edited to name.ins for the next refinement run. The MORE instruction
controls the amount of information sent to the .lst file; normally
the default MORE 1 is suitable, but MORE 3 should be used if extensive
diagnostic information is required. The ACTA instruction produces CIF format
files for archiving or electronic publication, and the LIST 4 instruction
(generated automatically by ACTA) produces a CIF format reflection data
file (name.fcf). For PDB deposition of macromolecular results, WPDB
and LIST 6 should be used. The program SHELXPRO should then be used to
complete the PDB file.
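The hand-over between runs described above (name.res becomes the next name.ins) is easy to script. The following is a minimal Python sketch and not part of SHELXL; the function name prepare_next_run and the name.ins.bak backup copy are assumptions made purely for illustration.

```python
import shutil
from pathlib import Path


def prepare_next_run(name, workdir="."):
    """Copy name.res over name.ins, keeping the old .ins as name.ins.bak."""
    work = Path(workdir)
    res = work / f"{name}.res"
    ins = work / f"{name}.ins"
    if not res.is_file():
        raise FileNotFoundError(f"{res} not found; has a refinement cycle completed?")
    if ins.is_file():
        # Hypothetical backup convention; SHELXL itself does not create this file.
        shutil.copy2(ins, work / f"{name}.ins.bak")
    shutil.copy2(res, ins)
    return ins
```

In practice the .res file is often edited by hand (for example to add atoms or restraints) before it is used as the next .ins file, so a plain copy like this only suits runs that continue unchanged.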
Two mechanisms are provided for interaction with a SHELXL job which is already running. The first is used by the MSDOS and some other 'on-line' versions: if the <ctrl-I> key combination is hit, the job terminates almost immediately, but without the loss of output buffers etc. which can happen with <ctrl-C> etc. Usually the <Tab> key may be used as an alternative to <ctrl-I>. If the <Esc> key is hit during least-squares refinement, the program completes the current cycle and then, instead of further refinement cycles, continues with the final structure-factor calculation, tables and Fourier etc. Otherwise <Esc> has no effect. On computer consoles with no <Esc> key, <F11> or <Ctrl-[> usually have the same effect.
The second mechanism requires the user to create
the file name.fin (the contents of this file are irrelevant); the
program tries at regular intervals to delete it, and if it succeeds it
takes the same action as after <Esc>. The name.fin file is also
deleted (if found) at the start of a job in case it has been accidentally
left over from a previous job. This approach may be used with batch jobs
under most operating systems.
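As an illustration of the second mechanism, the short Python sketch below creates an empty name.fin file next to a running job. The helper name request_stop is hypothetical; only the file name convention comes from the text above.

```python
from pathlib import Path


def request_stop(name, workdir="."):
    """Create an empty name.fin so that a running SHELXL job finishes early."""
    fin = Path(workdir) / f"{name}.fin"
    fin.touch()  # the contents are irrelevant, so an empty file is enough
    return fin
```

Because the program deletes the file once it has noticed it, the disappearance of name.fin can serve as a sign that the request was seen.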
A number of instructions allow atom names to be referenced;
use of such instructions without any atom names means 'all non-hydrogen
atoms' (in the current residue, if one has been defined). A list of atom
names may also be abbreviated to the first atom, the symbol '>' (separated
by spaces), and then the last atom; this means 'all atoms between and including
the two named atoms but excluding hydrogens'.
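The abbreviation rules above can be sketched in Python as follows. This is an illustrative model only, not SHELXL code: the function name expand_atom_list, the representation of the atom list as (name, element) pairs in file order, and the test for hydrogen by element symbol are all assumptions, and residues are ignored.

```python
def expand_atom_list(tokens, atoms):
    """Expand an atom-name list that may contain 'first > last' ranges.

    tokens: atom names and '>' symbols, already split on spaces.
    atoms:  list of (name, element) pairs in the order of the atom list.
    """
    def is_hydrogen(element):
        return element.upper() == "H"

    # No atom names at all means 'all non-hydrogen atoms'.
    if not tokens:
        return [name for name, element in atoms if not is_hydrogen(element)]

    names = [name for name, _ in atoms]
    expanded = []
    i = 0
    while i < len(tokens):
        if i + 2 < len(tokens) and tokens[i + 1] == ">":
            first = names.index(tokens[i])
            last = names.index(tokens[i + 2])
            # Inclusive range between the two named atoms, hydrogens skipped.
            expanded.extend(
                name for name, element in atoms[first:last + 1]
                if not is_hydrogen(element)
            )
            i += 3
        else:
            expanded.append(tokens[i])
            i += 1
    return expanded
```

For example, with the atoms C1, H1, C2, O1 in that order, the tokens C1 > O1 expand to C1, C2 and O1.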
Each reflection record contains h, k, l, Fo², σ(Fo²),
and (optionally) a batch number. This file should be terminated by a record
with all items zero; individual data sets within the file should NOT be
separated from one another - the batch numbers serve to distinguish between
groups of reflections for which separate scale factors are to be refined
(see the BASF instruction). The reflection order and the batch number order
are unimportant. This '.hkl' file is read each time the program
is run; unlike SHELX-76, there is no facility for intermediate storage
of binary data. This enhances computer independence and eliminates several
possible sources of confusion. The .hkl file is read when the HKLF
instruction (which terminates the .ins file) is encountered. The
HKLF instruction specifies the format of the .hkl file, and allows
scale factors and a reorientation matrix to be applied. Lorentz, polarization
and absorption corrections are assumed to have been applied to the data
in the .hkl file. Note that there are special extensions to the
.hkl
format for Laue and powder data, as well as for twinned crystals that cannot
be handled by a TWIN instruction alone.
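A reader for this file can be sketched in a few lines of Python. The sketch assumes the usual fixed-width layout for this kind of file (three 4-character integers, two 8-character reals, and an optional 4-character batch number); that layout is an assumption here, since the format statement is not reproduced in this section. The function name read_hkl is hypothetical, and the special extensions for Laue, powder and twinned data are not handled.

```python
def read_hkl(lines):
    """Parse fixed-width reflection records into tuples.

    Assumed column layout: h, k, l in columns 1-12 (4 characters each),
    Fo^2 in columns 13-20, sigma(Fo^2) in columns 21-28 and an optional
    batch number in columns 29-32.
    Returns a list of (h, k, l, fo2, sigma, batch); batch is None if absent.
    """
    reflections = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        h, k, l = int(line[0:4]), int(line[4:8]), int(line[8:12])
        fo2 = float(line[12:20])
        sigma = float(line[20:28])
        batch_field = line[28:32].strip()
        batch = int(batch_field) if batch_field else None
        # A record with all items zero terminates the file.
        if h == 0 and k == 0 and l == 0 and fo2 == 0.0 and sigma == 0.0:
            break
        reflections.append((h, k, l, fo2, sigma, batch))
    return reflections
```

Typical use would be `with open("name.hkl") as fh: data = read_hkl(fh)`. Negative Fo² values are kept as they are, in line with the recommendation to refine against all data.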
In general the .hkl file should contain all
measured reflections without rejection of systematic absences or merging
of equivalents. The systematic absences and Rint for
equivalents provide an excellent check on the space group assignment and
consistency of the input data. Since complex scattering factors are used
throughout by SHELXL, Friedel opposites should normally not be averaged
in preparing this file; an exception can be made for macromolecules without
significant anomalous scatterers. Note that SHELXS always merges Friedel
opposites.
[...] σ(F)].
More experimental information is incorporated (suitably weighted) and the
chance of getting stuck in a local minimum is reduced. In pseudo-symmetry
cases it is very often the weak reflections that can discriminate between
alternative potential solutions. It is difficult to refine against ALL
F-values because of the difficulty of estimating σ(F) from σ(F²)
when F² is zero or (as a result of experimental error) negative.
The diffraction experiment measures intensities and
their standard deviations, which after the various corrections give Fo²
and σ(Fo²). If your data reduction program only outputs Fo and σ(Fo),
you should correct your data reduction program, not simply write a routine
to square the Fo values! It is also legal to use HKLF 3 to input Fo and
σ(Fo) to SHELXL. Note that if an Fo² value is too
large to fit format F8.2, then format F8.0 may be used instead; the decimal
point overrides the FORTRAN format specification.
The use of a threshold for ignoring weak reflections may introduce bias
which primarily affects the atomic displacement parameters; it is only
justified to speed up the early stages of refinement. In the final refinement
ALL DATA should be used except for reflections known to suffer from systematic
error (i.e. in the final refinement the OMIT instruction may be used to
omit specific reflections - although not without good reason - but not
ALL reflections below a given threshold). Anyone planning to ignore this
advice should read Hirshfeld & Rabinovich (1973) and Arnberg, Hovmöller
& Westman (1979) first. Refinement against F² also
facilitates the treatment of twinned and powder data, and the determination
of absolute structure.
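$$R_{\mathrm{int}} = \frac{\sum \left| F_o^2 - F_o^2(\mathrm{mean}) \right|}{\sum \left[ F_o^2 \right]}$$
where both summations involve all input reflections for which more than one symmetry equivalent is averaged, and:
$$R_{\mathrm{sigma}} = \frac{\sum \left[ \sigma(F_o^2) \right]}{\sum \left[ F_o^2 \right]}$$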
over all reflections in the merged list. Since these R-indices
are based on F², they will tend to be about twice as
large as the corresponding indices based on F. The 'esd of the mean'
(in the table of inconsistent equivalents) is the rms deviation from the
mean divided by the square root of (n-1), where n equivalents
are combined for a given reflection. In estimating the σ(F²)
of a merged reflection, the program uses the value obtained by combining
the σ(F²) values of the individual contributors, unless the esd of the
mean is larger, in which case it is used instead.
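The merging statistics defined above can be illustrated with a short Python sketch. It is not the SHELXL implementation. The function name merge_equivalents and the input layout are hypothetical, the merged value is taken as an unweighted mean, and the individual σ(F²) values are combined by simple error propagation for that mean. The last two choices are assumptions, since the text does not say how the program combines them.

```python
import math


def merge_equivalents(groups):
    """Merge symmetry equivalents and compute Rint and Rsigma.

    groups: dict mapping a unique hkl to a list of (fo2, sigma) observations.
    Returns (merged, r_int, r_sigma); merged maps hkl to (mean fo2, sigma).
    """
    merged = {}
    sum_abs_dev = 0.0    # numerator of Rint
    sum_fo2_multi = 0.0  # denominator of Rint (multiply measured only)
    for hkl, obs in groups.items():
        n = len(obs)
        mean = sum(f for f, _ in obs) / n
        # Assumption: sigma of an unweighted mean by error propagation.
        sigma = math.sqrt(sum(s * s for _, s in obs)) / n
        if n > 1:
            sum_abs_dev += sum(abs(f - mean) for f, _ in obs)
            sum_fo2_multi += sum(f for f, _ in obs)
            rms_dev = math.sqrt(sum((f - mean) ** 2 for f, _ in obs) / n)
            esd_of_mean = rms_dev / math.sqrt(n - 1)
            # Use the esd of the mean if it is the larger estimate.
            sigma = max(sigma, esd_of_mean)
        merged[hkl] = (mean, sigma)
    r_int = sum_abs_dev / sum_fo2_multi if sum_fo2_multi else 0.0
    sum_fo2 = sum(m for m, _ in merged.values())
    r_sigma = sum(s for _, s in merged.values()) / sum_fo2 if sum_fo2 else 0.0
    return merged, r_int, r_sigma
```

Note that Rint only includes reflections with more than one contributor, whereas Rsigma runs over the whole merged list.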
For some refinements of twinned crystals, and for least-squares refinement
of batch scale factors, it is necessary to suppress the merging of equivalent
reflections with MERG 0.
For both L.S. and CGLS options, it is possible to block the refinement
so that a different combination of parameters is refined each cycle. For
example after a large structure has been refined using CGLS (without BLOC),
a final job may be run with L.S. 1, DAMP 0 0 and BLOC 1 (or e.g. BLOC N_1
> LAST for a protein) to obtain esds on all geometric parameters; the anisotropic
displacement parameters are held fixed, reducing the number of parameters
by a factor of three and the cycle time by an order of magnitude.
[...] σ(Fo)
is also printed.
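$$wR_2 = \left\{ \frac{\sum \left[ w (F_o^2 - F_c^2)^2 \right]}{\sum \left[ w (F_o^2)^2 \right]} \right\}^{1/2}$$
$$R_1 = \frac{\sum \left| \, |F_o| - |F_c| \, \right|}{\sum |F_o|}$$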
The Goodness of Fit is always based on F²:
$$\mathrm{GooF} = S = \left\{ \frac{\sum \left[ w (F_o^2 - F_c^2)^2 \right]}{n - p} \right\}^{1/2}$$
where n is the number of reflections and p is the total number of parameters refined.
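The three indices above translate directly into code. The following Python sketch is for illustration only; the function name r_indices is hypothetical, and deriving |Fo| as the square root of Fo² with negative Fo² set to zero is an assumption made here so that R1 can be computed from the same F² lists.

```python
import math


def r_indices(fo2, fc2, weights, n_params):
    """Return (wR2, R1, GooF) for lists of Fo^2, Fc^2 and weights.

    n_params is p, the total number of refined parameters; n is len(fo2).
    Fc^2 values are assumed to be non-negative.
    """
    n = len(fo2)
    num = sum(w * (o - c) ** 2 for o, c, w in zip(fo2, fc2, weights))
    den = sum(w * o ** 2 for o, w in zip(fo2, weights))
    wr2 = math.sqrt(num / den)

    # Assumption: |Fo| = sqrt(max(Fo^2, 0)) so that negative Fo^2 give |Fo| = 0.
    f_obs = [math.sqrt(max(o, 0.0)) for o in fo2]
    f_calc = [math.sqrt(c) for c in fc2]
    r1 = sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)

    goof = math.sqrt(num / (n - n_params))
    return wr2, r1, goof
```

With unit weights, as in a quick check, the GooF is not expected to be near unity; it only becomes meaningful with a sensible weighting scheme.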
The WGHT instruction allows considerable flexibility, but in practice it is a good idea to leave the weights at the default setting (WGHT 0.1) until the refinement is essentially complete, and then to use the scheme recommended by the program. These parameters should give a flat analysis of variance and a GooF close to unity [there was a bug in SHELXL-93 that could occasionally cause the program to abort when trying to estimate the new weighting parameters, though it appeared to happen only with poor quality data or the wrong solution]. If the weights are varied too soon, the convergence may be impaired, because features such as missing atoms are 'weighted down'. For macromolecules it may be advisable to leave the weights at the default settings, and to accept a GooF greater than one as an admission of inadequacies in the model.
When not more than two WGHT parameters are specified, the weighting scheme simplifies to:
$$w = \frac{1}{\sigma^2(F_o^2) + (aP)^2 + bP}$$
where P is [ 2Fc² + Max(Fo²,0) ] / 3. The use of this combination of Fo² and Fc² was shown by Wilson (1976) to reduce statistical bias.
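As a worked illustration of the two-parameter scheme, the Python sketch below evaluates the weight for a single reflection. The function name shelxl_weight is hypothetical; the defaults a = 0.1 and b = 0 correspond to the default setting WGHT 0.1 mentioned above.

```python
def shelxl_weight(fo2, fc2, sigma, a=0.1, b=0.0):
    """Weight w = 1 / [sigma^2(Fo^2) + (a*P)^2 + b*P] for one reflection.

    P = [2*Fc^2 + max(Fo^2, 0)] / 3, so a negative Fo^2 contributes zero to P.
    """
    p = (2.0 * fc2 + max(fo2, 0.0)) / 3.0
    return 1.0 / (sigma ** 2 + (a * p) ** 2 + b * p)
```

For Fo² = 90, Fc² = 105 and σ(Fo²) = 4, P is 100 and the default scheme gives w = 1/(16 + 100) = 1/116. Strong reflections are therefore dominated by the (aP)² term and weak ones by σ²(Fo²).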
It may be desirable to use a scheme that does not
give a flat analysis of variance to emphasize particular features in the
refinement, for example by weighting up the high angle data to remove bias
caused by bonding electron density (Dunitz & Seiler, 1973).
[...] σ(Fo)
are down-weighted in the Fourier synthesis. The rms density is calculated
to give an estimate of the 'noise level' of the map.
The HTAB instruction has been introduced in SHELXL-97 to analyze the hydrogen bonding in the structure. A search is made over all hydrogen atoms to find possible hydrogen bonds. This is a convenient way of finding the symmetry operations necessary for the second form of HTAB instructions (needed to obtain esds and CIF output), and also reveals potentially misplaced hydrogens, e.g. because they do not make any hydrogen bonds, or because the automatic placing of hydrogen atoms has assigned the hydrogens of two different O-H or N-H groups to the same hydrogen bond. In the second form of the HTAB instruction, HTAB is followed by the names of the donor atom D and the acceptor atom A; for the latter a symmetry operation may also be specified. The program then finds the most suitable hydrogen atom to form the hydrogen bond D-H···A, and outputs the geometric data for this hydrogen bond to the .lst file and the .cif file (if ACTA is present).