The conventional representation of a periodic crystal by a primitive unit cell and a motif is well known to be ambiguous: any crystal can be generated from infinitely many primitive unit cells and motifs containing differently located atoms. Niggli's reduced cell is unique but discontinuous under perturbations. Continuity of crystal representations is important for filtering out near-duplicates in big datasets [1, Fig. 2d] of simulated crystals in Crystal Structure Prediction (CSP). Symmetry groups and many other descriptors change discontinuously under perturbations, so CSP landscapes are plotted using only two coordinates: lattice energy and density. We describe a new geometric approach to generating a unique code (called a crystal isoset) of any periodic crystal, which changes continuously under perturbations of atoms [2-3]. This isoset is a material genome, or a DNA-type code, that allows an inverse design of new periodic crystals. Using these complete isosets, one can compute invariants via density functions [4] and interatomic distances [5]. For any crystal dataset, irrespective of symmetries or chemical compositions, invariants of crystals can be joined in a minimum spanning tree via continuous distances that quantify crystal similarities. Our Python code for distance-based invariants produced a map of all 229K organic crystals in the Cambridge Structural Database overnight on a modest desktop [6 (appendix D), 7].
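To make the ambiguity of unit cells and the idea of a continuous distance-based invariant concrete, the following is a minimal sketch, not the authors' released code. It computes a vector of averaged k-th nearest-neighbour distances in the spirit of the Average Minimum Distances named in the reference list. The function names `amd` and `amd_distance`, the brute-force `reach` parameter, and the choice of a maximum absolute difference for comparison are all assumptions made for illustration; the sketch is only valid when `reach` is large enough that the k nearest neighbours lie inside the finite patch.

```python
import itertools
import math


def amd(cell, motif, k, reach=3):
    """Averaged distances to the 1st..k-th nearest neighbours (illustrative sketch).

    cell   -- list of basis vectors (rows), e.g. [[1, 0], [0, 1]]
    motif  -- list of points in fractional coordinates of that cell
    reach  -- hypothetical brute-force parameter: periodic copies are taken
              for integer shifts in [-reach, reach] along every basis vector
    """
    dim = len(cell)

    def to_cartesian(frac):
        # Linear combination of basis vectors with fractional coefficients.
        return [sum(frac[i] * cell[i][j] for i in range(dim)) for j in range(dim)]

    shifts = list(itertools.product(range(-reach, reach + 1), repeat=dim))
    # Finite patch of the periodic point set around the central cell.
    cloud = [
        to_cartesian([p[i] + s[i] for i in range(dim)])
        for p in motif
        for s in shifts
    ]
    if len(cloud) < k + 1:
        raise ValueError("reach is too small for the requested k")

    rows = []
    for p in motif:
        origin = to_cartesian(p)
        dists = sorted(math.dist(origin, q) for q in cloud)
        rows.append(dists[1:k + 1])  # dists[0] is the point's distance to itself

    # Average each neighbour rank over all motif points.
    return [sum(column) / len(rows) for column in zip(*rows)]


def amd_distance(u, v):
    """Maximum absolute difference of two invariant vectors (assumed metric)."""
    return max(abs(x - y) for x, y in zip(u, v))


# Two different primitive cells of the same square lattice.
square = [[1.0, 0.0], [0.0, 1.0]]
sheared = [[1.0, 0.0], [1.0, 1.0]]
# A slightly perturbed lattice.
perturbed = [[1.0, 0.0], [0.0, 1.01]]

same = amd_distance(amd(square, [[0.0, 0.0]], 8), amd(sheared, [[0.0, 0.0]], 8))
near = amd_distance(amd(square, [[0.0, 0.0]], 8), amd(perturbed, [[0.0, 0.0]], 8))
print(same, near)
```

In this toy case the two cells of the same lattice yield the same vector up to rounding, while the perturbed lattice yields a vector that differs by an amount comparable to the perturbation. That is the behaviour required for filtering near-duplicates, although this sketch makes no claim to be a complete invariant.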
[1] H. Edelsbrunner, et al. The Density Fingerprint of a Periodic Point Set. SoCG, 2021.
[2] V. Kurlin, et al. Introduction to Periodic Geometry and Topology. ArXiv, 2021.
[3] A. Cooper, et al. Average Minimum Distances of periodic point sets. 2020.
[4] Christopher M. Kane, et al. Functional materials discovery using energy–structure–function maps. Nature, 2017.
[5] Vitaliy Kurlin, et al. Pointwise distance distributions of periodic sets. ArXiv, 2021.
[6] Vitaliy Kurlin, et al. An Isometry Classification of Periodic Point Sets. DGMM, 2021.