
Towards the definition of a phase identifier

David Brown

     There have been some helpful (if not very numerous) comments
on the creation of an unambiguous phase identifier, and since the
flow of comments has stopped it is time to summarize and see
where we are heading.

     It is first useful to repeat the criteria a phase identifier
must obey and, just as importantly, to mention a few commonly
assumed restrictions that do not apply.

1. The identifier must be unambiguous.  This means that the
identifier must be the same whoever creates it.

2. It must be unique.  No two different phases should have the
same identifier.

3. The identifier may be composed of several components.  As
Brian points out, at this stage we need to define the components
but we do not need to worry about how to link them or the best
way of making them look pretty.  He recommends that we use a CIF-
like pattern to represent them since this will allow us to focus
on the suitability of individual components.  Thus in a CIF-like
format the Pearson symbol would be represented by:

_crystal_system    m
_crystal_centring  C
_atom_count        28

It is easy enough to see that this can be written as mC28 and we
can argue later whether this is the best format or, say, m28C or
m-C-28.  I use the Pearson symbol here as an example, not as a
recommendation, for the phase identifier.  Brian gives an example
of a chemical identifier given in XML, but the flexibility built
into XML leads to a verbosity that unfortunately obscures the
values being assigned.
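As a rough illustration of how separately defined components could
later be joined into a compact string, here is a minimal Python
sketch.  The function name pearson_symbol and the fixed field order
are assumptions made for this example only; they are not part of any
proposal in this discussion.

```python
def pearson_symbol(fields):
    """Join CIF-like component fields into a compact Pearson-style string."""
    # The order of the components is an assumption; the text leaves it open.
    order = ("_crystal_system", "_crystal_centring", "_atom_count")
    # A component that has not been supplied is written as '?'.
    return "".join(str(fields.get(name, "?")) for name in order)


fields = {"_crystal_system": "m", "_crystal_centring": "C", "_atom_count": 28}
print(pearson_symbol(fields))  # mC28
```

Changing the tuple "order" or the joining string is all that would be
needed to produce m28C or m-C-28 instead.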

4. Brian also points out that it is not necessary that all
components be given if one of them is not known, e.g.

_crystal_system    ?
_crystal_centring  C
_atom_count        28

could be assigned even if the crystal system were not known.  The
question mark, ?, in CIF indicates that the information is not
known or is not available.  A search using such a partial
identifier may not be unique, but the target would be among those
phases retrieved, and the assignment of such a symbol would
ensure that it would be retrieved by a search for mC28 as well as
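A search that tolerates unknown components might be sketched as
follows.  This is only an illustration in Python: the function name
matches and the rule that "?" on either side agrees with anything are
assumptions, not an agreed behaviour.

```python
UNKNOWN = "?"


def matches(query, stored):
    """Return True if no known component of query contradicts stored."""
    for name, wanted in query.items():
        found = stored.get(name, UNKNOWN)
        # '?' means "not known", so it cannot contradict any value.
        if UNKNOWN in (wanted, found):
            continue
        if wanted != found:
            return False
    return True


partial = {"_crystal_system": "?", "_crystal_centring": "C", "_atom_count": "28"}
query = {"_crystal_system": "m", "_crystal_centring": "C", "_atom_count": "28"}
print(matches(query, partial))  # True
```

A phase stored with the partial identifier is returned by a search for
mC28, but it would equally be returned by a search with any other
crystal system, which is why such a search may not be unique.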

5. While this system would work well for internal identifiers,
there is no reason why an external identifier (e.g. REFCODE or
CAS number) should not be incorporated in the same format.

6. As far as possible, the components that make up the identifier
should not be numbers that are subject to experimental
uncertainty since such an identifier would not be unambiguous
(criterion 1).   Integers (as in the Pearson symbol) are fine,
but quantities such as densities, for example, would need to be
handled very carefully because different people will assign
slightly different values.

     If we are willing to accept these criteria for the
construction of an identifier we should next turn our attention
to the fields that need to be defined.

7.1 Jean-Claude, supported by Sidney, states that the composition
is essential, but it is not clear how this is to be represented.
For organic (and some inorganic) crystals, the CAS number might
be used:

_CAS_number   137892

There might be problems in the way the CAS number is defined,
particularly if the same compound is listed under two different
CAS numbers (e.g. for different phases) since this might violate
criterion 1.

A sum formula would be needed if the formula is to be
unambiguous, but it must also be normalized to prevent
ambiguities such as

_composition_formula   O5 P2

and

_composition_formula   O10 P4

which refer to different perceptions of the same material.

One way would be to normalize the composition to the largest
component and list the components either alphabetically or in
order of decreasing composition (with an alphabetical order used
for atoms present in the same proportions).

_composition_formula   O1 P0.4

In this case we could decide to omit the value of 1, as it will
always be present for the most common element(s).  Decimal values
would, however, run into difficulties with numbers like
0.33333333, so perhaps one could give reduced fractions instead:

_composition_formula   O1 P2/5
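The normalization to the largest component, with exact reduced
fractions, can be sketched in Python using the standard fractions
module.  The function name normalise_formula and the choice of
alphabetical ordering are assumptions for this example; the ordering
convention is still open in the discussion above.

```python
from fractions import Fraction


def normalise_formula(counts):
    """Scale integer atom counts so that the largest component becomes 1."""
    largest = max(counts.values())
    parts = []
    for element in sorted(counts):  # alphabetical order (an assumption)
        # Fraction reduces automatically, so 4/10 and 2/5 print alike.
        ratio = Fraction(counts[element], largest)
        parts.append(f"{element}{ratio}")
    return " ".join(parts)


print(normalise_formula({"P": 2, "O": 5}))    # O1 P2/5
print(normalise_formula({"P": 4, "O": 10}))   # O1 P2/5
```

Both perceptions of the same material give the same string, which is
what criterion 1 requires.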

In the case of partial occupancy, Sidney's example could be given
as:

_composition_formula   O1 Pb1/3 Ti* Zr*
     Ti   0.17 0.20
     Zr   0.13 0.17

The composition range loop violates criterion 6 above, but there
seems to be no way around this.  On the other hand, the item
_composition_formula could be searched with * acting as a wild
card, and the matching entries could then be checked, if this
seemed desirable, to see whether the value standing for * lay in
the right range.  This method does not, however, indicate that
the sum of the Ti and Zr occupancies must be 1/3.
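The check that a candidate falls inside the ranges, together with the
sum constraint that the loop itself does not express, might look like
the following Python sketch.  The ranges are those of the example
above; the function name candidate_ok and the tolerance of 0.01 are
assumptions, and the need for a tolerance at all is exactly the
difficulty raised under criterion 6.

```python
from fractions import Fraction

# Composition ranges taken from the example loop above.
RANGES = {"Ti": (0.17, 0.20), "Zr": (0.13, 0.17)}
SITE_TOTAL = Fraction(1, 3)  # Ti + Zr must fill one third, like Pb


def candidate_ok(values, tol=0.01):
    """Check measured Ti/Zr values against the ranges and the sum rule."""
    in_range = all(lo <= values[el] <= hi for el, (lo, hi) in RANGES.items())
    total = sum(values[el] for el in RANGES)
    # tol is an assumed allowance for experimental uncertainty.
    sum_ok = abs(total - float(SITE_TOTAL)) <= tol
    return in_range and sum_ok


print(candidate_ok({"Ti": 0.18, "Zr": 0.15}))  # True
print(candidate_ok({"Ti": 0.20, "Zr": 0.17}))  # False
```

The second candidate lies inside both ranges but fails the sum rule,
which shows why the ranges alone are not sufficient.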

Some of the ambiguities in the Pearson symbol could be removed by
giving the International Tables space group number, which is
known in many cases.  This could be used to augment the Pearson
symbol.  However, the crystal centring has to be used with care.
A rhombohedral crystal might be expressed in a hexagonal setting
(with R centring) or in a rhombohedral setting (P centring).
Centred monoclinic cells are also a problem, since the cell
centring can be described as either B or C depending on the
setting used.  Centred settings should never be used for
triclinic crystals.  The atom count should therefore apply to the
primitive unit cell in order that the symbol be unambiguous.  The
crystal system is implicit in the space group, but may usefully
be carried in cases where the space group is not known.

_crystal_system          m
_atom_count              14
_space_group_number      15
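The conversion from the atom count of a centred conventional cell to
that of the primitive cell is a division by the number of lattice
points per cell.  A minimal Python sketch follows; the function name
primitive_atom_count is an assumption for this example.

```python
# Lattice points per conventional cell for each centring symbol.
# R refers to the hexagonal setting; the rhombohedral setting is P.
CENTRING_MULTIPLICITY = {"P": 1, "A": 2, "B": 2, "C": 2, "I": 2, "F": 4, "R": 3}


def primitive_atom_count(conventional_count, centring):
    """Return the number of atoms in the primitive unit cell."""
    multiplicity = CENTRING_MULTIPLICITY[centring]
    count, remainder = divmod(conventional_count, multiplicity)
    if remainder:
        raise ValueError("atom count is not a multiple of the centring multiplicity")
    return count


print(primitive_atom_count(28, "C"))  # 14
```

This is how the 28 atoms of mC28 become the _atom_count of 14 given
above.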

There are still some ambiguities even with the space group number
since P31 (144) and P32 (145) are enantiomorphs and a given
structure might be arbitrarily assigned to either.  We would have
to ensure that only one of these numbers, say 144, was allowed to
be used.  A different field should be used to define the
chirality if it is important and known because one would not
otherwise know whether the space group assignment reflected the
true chirality or not.
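The rule of allowing only one number of each enantiomorphic pair
could be applied mechanically, since there are only eleven such pairs
among the 230 space groups.  The Python sketch below maps each pair
to its lower number, following the "say 144" suggestion; the function
name canonical_space_group and the choice of the lower number are
assumptions for this example.

```python
# The eleven enantiomorphic space group pairs (International Tables numbers),
# e.g. P41/P43 = 76/78, P31/P32 = 144/145, P4332/P4132 = 212/213.
ENANTIOMORPHIC_PAIRS = [
    (76, 78), (91, 95), (92, 96), (144, 145), (151, 153), (152, 154),
    (169, 170), (171, 172), (178, 179), (180, 181), (212, 213),
]
_CANONICAL = {high: low for low, high in ENANTIOMORPHIC_PAIRS}


def canonical_space_group(number):
    """Replace the higher member of an enantiomorphic pair by the lower."""
    if not 1 <= number <= 230:
        raise ValueError("space group number must lie between 1 and 230")
    return _CANONICAL.get(number, number)


print(canonical_space_group(145))  # 144
print(canonical_space_group(15))   # 15
```

The chirality itself, when known, would then be carried in the
separate field suggested above.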

As Jean-Claude points out, the Pearson symbol can be used for
crystalline systems but not for quasi- or non-crystalline
systems.  Do we need a flag to indicate what kind of material we
are identifying: crystal, glass, liquid crystal, quasicrystal,
etc.?  What sort of identifier would be useful in non-crystalline
systems?

     To summarize, I propose an identifier composed of a number
of fields, some of which may be defaulted if the information is
not available.  Among these are:

_crystal_system          m
_space_group_number      15
_atom_count              14
_composition_formula     Ca1/5 Cr1/5 F
_CAS_number         ?
     ?    ?    ?

While these fields should produce a unique identifier for quite a
number of compounds, they do not cover all possibilities and we
should identify other fields that would be more useful for other


Dr. I. David Brown, Professor Emeritus
Brockhouse Institute for Materials Research,
McMaster University, Hamilton, Ontario, Canada
Tel: 1-(905)-525-9140 ext 24710
Fax: 1-(905)-521-2773


Copyright © International Union of Crystallography
