Constraint Reasoning and Kernel Clustering for Pattern Decomposition with Scaling

Motivated by an important and challenging task encountered in material discovery, we consider the problem of finding K basis patterns of numbers that jointly compose N observed patterns while enforcing additional spatial and scaling constraints. We propose a Constraint Programming (CP) model which captures the exact problem structure yet fails to scale in the presence of noisy data about the patterns. We alleviate this issue by employing Machine Learning (ML) techniques, namely kernel methods and clustering, to decompose the problem into smaller ones based on a global data-driven view, and then stitch the partial solutions together using a global CP model. Combining the complementary strengths of CP and ML techniques yields a more accurate and scalable method than the few found in the literature for this complex problem.
