On pattern matching of X-ray powder diffraction data.

We introduce a novel pattern-matching algorithm optimized for X-ray powder diffraction (XRPD) data that is also applicable to data from other analytical techniques (e.g., Raman and IR spectroscopy). The algorithm is based on hierarchical clustering with a similarity metric that compares peak positions using the full peak profile. It incorporates heuristics developed from years of experience in manually matching XRPD data, together with preprocessing algorithms that reduce the effects of common problems associated with XRPD (e.g., preferred orientation and poor particle statistics). The algorithm is immediately applicable to automated polymorph screening and salt selection, both common tasks in pharmaceutical development.
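To make the general idea concrete, the following is a minimal sketch of hierarchical clustering driven by a full-profile similarity score. It is not the algorithm described here: the similarity metric, heuristics, and preprocessing steps are not specified in this abstract, so the sketch substitutes a plain Pearson correlation between patterns sampled on a shared 2-theta grid, and average-linkage merging with a fixed cutoff. All names (`gaussian_pattern`, `profile_similarity`, `cluster_patterns`), the peak lists, and the `threshold` value are hypothetical and chosen only for illustration.

```python
import math


def gaussian_pattern(peaks, grid, width=0.1):
    """Synthetic pattern: sum of Gaussian peaks given as (position, height) pairs."""
    return [
        sum(h * math.exp(-0.5 * ((x - p) / width) ** 2) for p, h in peaks)
        for x in grid
    ]


def profile_similarity(a, b):
    """Pearson correlation of two patterns on the same grid (an assumed stand-in metric)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat pattern carries no peak information
    return cov / math.sqrt(var_a * var_b)


def cluster_patterns(patterns, threshold=0.8):
    """Average-linkage agglomerative clustering; merging stops below `threshold`."""
    n = len(patterns)
    sim = [[profile_similarity(patterns[i], patterns[j]) for j in range(n)]
           for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        best, pair = -2.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Mean pairwise similarity between the members of two clusters.
                total = sum(sim[p][q] for p in clusters[i] for q in clusters[j])
                score = total / (len(clusters[i]) * len(clusters[j]))
                if score > best:
                    best, pair = score, (i, j)
        if best < threshold:
            break
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical data: two "forms", each measured twice with small peak shifts
# and altered relative intensities (mimicking preferred orientation).
grid = [5 + 0.02 * k for k in range(1751)]  # 5 to 40 degrees 2-theta
patterns = [
    gaussian_pattern([(10.00, 100), (15.50, 60), (22.00, 80)], grid),
    gaussian_pattern([(10.02, 70), (15.52, 90), (22.02, 50)], grid),
    gaussian_pattern([(8.30, 100), (12.70, 50), (19.40, 70), (25.10, 40)], grid),
    gaussian_pattern([(8.32, 80), (12.72, 65), (19.42, 60), (25.12, 45)], grid),
]
groups = cluster_patterns(patterns)
print(groups)
```

With these synthetic inputs, the two measurements of each form are grouped together and the two forms remain separate. Because the comparison uses the whole sampled profile rather than a list of picked peak positions, a small peak shift lowers the score gradually instead of causing an outright mismatch. A plain correlation is still sensitive to intensity changes, which is one reason a practical method would need preprocessing of the kind mentioned above.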