
phase identifiers

Dear Colleagues,


               Discussion paper number 5

     The principal difficulty facing the phase identifier project
at the moment is how to identify the different isomers of organic
crystals that have the same sum formula.  This requires some way
of differentiating between the topologies of different isomers.
Organic chemists have devised elaborate procedures for describing
the topologies of organic molecules, but we do not need anything
as complex for several reasons.  Firstly, the chemical formula
will provide a first level of discrimination on the basis of the
numbers of different atoms present.  Secondly, we are not using
the phase identifier to locate a particular molecule or fragment
in the crystal, and thirdly, we are not interested in how many
distinct molecules or moieties the crystal may contain.  The only
test we are interested in is whether the topology of two
crystalline phases is the same or different.

     Sam Motherwell and I have discussed this problem and looked
at some of the methods used by the Cambridge Structural Database.
I have also tried out a number of simple descriptors.  The one
that seems to work best consists of a string of five numbers
showing how many carbon atoms in the formula unit of the crystal
(given by the sum formula) have 0, 1, 2, 3 and 4 attached
hydrogen atoms respectively.  There are very few organic
compounds where this spectrum of numbers cannot be uniquely
determined and, combined with the sum formula, it appears to
discriminate remarkably well.

     The five numbers that make up the spectrum need to be
separated, say, by a period since each number may run into more
than one digit.  The presentation can be simplified by omitting
all the numbers that are zero (their place is identified by the
periods), and since the only molecule or fragment that has four
attached hydrogen atoms is methane, the last number can also be
omitted (the presence of methane is indicated by a discrepancy
between the carbon content of the sum formula and the sum of the
numbers in the spectrum).  This results in a symbol that contains
three periods and up to four numbers.
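
As an illustration only, the construction of the symbol can be
sketched in Python.  The function name `h_spectrum` and the input
format (one attached-H count per carbon atom in the formula unit)
are assumptions made for this sketch, not part of the proposal.

```python
from collections import Counter


def h_spectrum(h_counts):
    """Return (full, abbreviated) H spectra for one formula unit.

    h_counts holds one integer per carbon atom: the number of
    hydrogen atoms attached to that carbon (0 to 4).
    """
    if any(h not in range(5) for h in h_counts):
        raise ValueError("each carbon must carry 0 to 4 hydrogens")
    tally = Counter(h_counts)
    # Full symbol: five numbers for 0, 1, 2, 3 and 4 attached H.
    full = ".".join(str(tally[n]) for n in range(5))
    # Abbreviated symbol: drop the methane count and blank out zeros,
    # leaving the three periods as place markers.
    abbreviated = ".".join(str(tally[n]) if tally[n] else "" for n in range(4))
    return full, abbreviated
```

For toluene (one C with no H, five CH and one CH3),
`h_spectrum([0, 1, 1, 1, 1, 1, 3])` gives `('1.5.0.1.0', '1.5..1')`.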

     The table below shows the full and abbreviated H spectra
around C for a number of simple organic molecules.  For
comparison I also give the spectra using the count of attached C
or attached C and H neighbours.

     The carbon spectrum will not distinguish optical isomers, a
problem with which I am not sufficiently familiar.  Does anyone
have suggestions for how optical isomers can be uniquely identified?

;  A sequence of four numbers separated by periods, indicating
the number of C atoms in the formula unit (defined by
_chemical_formula_sum) that have zero, one, two, and three
attached H atoms.  In the abbreviated form of this spectrum,
zeros are omitted.  Each symbol contains three periods and
between zero and four numbers whose total equals the carbon atom
count minus the number of methane molecules present.

                        Full Symbol
Compound         C only    C+H     H      Abbrev    total C
methane                                   ...       1

ethane                                    ...2      2
ethylene                                  ..2.      2
acetylene                                 .2..      2
ethanol                                   ..1.1     2
                                          .1..1     2+Cl
                                          ..2.      2+Cl

propane                                   ..1.2     3

n-butane                                  ..2.2     4
i-butane                                  .1..3     4
cyclobutane                               ..4.      4

pyridine                                  .5..      5+N
cyclopentane                              ..5.      5
n-pentane                                 ..3.2     5
                                          .5..      5

benzene                                   .6..      6
cyclohexane                               ..6.      6
n-hexane                                  ..4.2     6
phenol                                    1.5..     6+O

toluene                                   1.5..1    7

naphthalene                               2.8..     10
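
The abbreviated symbols in the table can be read back mechanically.
The following Python sketch splits a symbol into its four counts and
infers the number of methane molecules from the shortfall against the
total carbon count.  The function names are hypothetical, and the
total C is assumed to be taken from _chemical_formula_sum.

```python
def parse_abbreviated(symbol):
    """Split an abbreviated H spectrum into counts for 0, 1, 2, 3 attached H."""
    fields = symbol.split(".")
    if len(fields) != 4:
        raise ValueError("an abbreviated symbol has exactly three periods")
    # An empty field stands for an omitted zero.
    return [int(field) if field else 0 for field in fields]


def methane_count(symbol, total_carbon):
    """Carbon atoms not covered by the spectrum are taken to be methane."""
    missing = total_carbon - sum(parse_abbreviated(symbol))
    if missing < 0:
        raise ValueError("spectrum counts more carbon than the sum formula")
    return missing
```

For example, `parse_abbreviated("1.5..1")` returns `[1, 5, 0, 1]`, and
`methane_count("...", 1)` returns `1`, the methane entry in the table.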

Other keys that we have identified are:
# Note that all keys are optional and not all are appropriate
# (carbon spectra only appear if there is C in the formula), but
# the more keys that can be given, the better the chance of
# identifying materials with the same phase.

_chemical_formula_sum (existing item)
# This item identifies the formula unit which is important if the
# carbon spectrum is given.
# For inorganic compounds, only the relative abundance of
# the elements is important since the choice of the size of
# the formula unit is not always obvious.
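
Comparing only the relative abundance of the elements could be done
along the lines of the following Python sketch.  It assumes the
formula has already been parsed into positive integer counts per
element; the function name is hypothetical, and non-integer
compositions are not handled.

```python
from functools import reduce
from math import gcd


def relative_abundance(counts):
    """Reduce integer element counts to their smallest whole-number ratio."""
    # gcd(0, n) == n, so 0 is a safe starting value for the reduction.
    divisor = reduce(gcd, counts.values(), 0)
    return {element: n // divisor for element, n in counts.items()}
```

With this, `{"Fe": 4, "O": 6}` and `{"Fe": 2, "O": 3}` reduce to the
same dictionary, so the choice of formula unit no longer matters.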

     _enumeration   gas    gas phase
                    liq    liquid phase
                    sol    solid phase of unknown form
                    xtl    crystalline solid
                    am     amorphous solid
                    lx     liquid crystal

_space_group_IT_number (existing item)
; The space group number as given in International Tables.

     _enumeration   aP, mP, mE, oP, oE, oF, oI, tP, tI, hP, hR,
                    cP, cF, cI
# Redundant if the space group is known, but in any case useful
# in case comparison is being made with a material whose space
# group is not fully determined.  This symbol should correspond
# to the standard setting given in International Tables.

; The sequence of Wyckoff letters in alphabetical order
describing the occupied sites in the space group.  Where there is
a choice of sequences, the sequence lowest in the alphabetic
order should be used.  E.g., in P-1 the general position is i,
but any of the letters a - h could be chosen for an atom on a
center of symmetry, so a-d-i6 should be chosen rather than b-f-

The following keys are more problematic:
;    The code used in the Pauling file.  This is useful for those
structure types which have been assigned a type code in the
Pauling file.
# Strukturbericht has names for many structure types that have
# often been used in the condensed matter physics community.
# They have been around for many years and should be
# straightforward to apply.

_chemical_name_mineral (existing item)
; The name assigned by the International Mineralogical
Association(?) for natural minerals.  Should these names be used
for their synthetic analogues, given that the synthetic analogues
often have ideal compositions that are unknown in nature?

# This has been recommended as a good key for identifying
# identical materials, but it is subject to experimental
# uncertainty and so will rarely give an exact match.  We should
# consider including this as a secondary key that can be tested
# to see if two cells lie within prescribed tolerances.
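
A tolerance test of this kind might look like the following Python
sketch.  It assumes the key is a set of six cell parameters (a, b, c
in angstroms; alpha, beta, gamma in degrees) already expressed in a
common reduced setting.  The function name and both default
tolerances are placeholders, not recommended values.

```python
def cells_match(cell_a, cell_b, length_tol=0.01, angle_tol=1.0):
    """Return True if two cells agree within the given tolerances.

    Each cell is a tuple (a, b, c, alpha, beta, gamma).
    length_tol is a relative tolerance on the cell edges;
    angle_tol is an absolute tolerance in degrees on the angles.
    """
    lengths_ok = all(
        abs(x - y) <= length_tol * max(x, y)
        for x, y in zip(cell_a[:3], cell_b[:3])
    )
    angles_ok = all(
        abs(x - y) <= angle_tol
        for x, y in zip(cell_a[3:], cell_b[3:])
    )
    return lengths_ok and angles_ok
```

Treating the cell as a secondary key in this way gives a yes/no
answer for a stated tolerance rather than requiring an exact match.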

The following keys identify the conditions under which the
material was characterized:
_temperature (_diffrn_ambient_temperature)
; The temperature in K at which the material was characterized.

_pressure  (_diffrn_ambient_pressure)
; The pressure in kPa at which the material was characterized.

     Please post your comments to the phase-identifiers discussion
group (by replying to this email) by August 1.

               Best wishes


Dr. I. David Brown,  Professor Emeritus
Brockhouse Institute for Materials Research,
McMaster University, Hamilton, Ontario, Canada
Tel: 1-(905)-525-9140 ext 24710
Fax: 1-(905)-521-2773

Copyright © International Union of Crystallography