Information Extraction and Graph Representation for the Design of Formulated Products

Formulated products such as cosmetics, personal and household care products, and pharmaceuticals are ubiquitous in everyday life. The multi-billion-dollar formulated products industry depends primarily on experiential knowledge for the design of new products. Vast knowledge of formulation ingredients and recipes exists in offline and online resources, yet experts often rely on rudimentary searches over this data to find ingredients and construct recipes. This practice leads to long time to market and high cost. We present an approach for formulated product design that enables extraction, storage, and non-trivial search of the details required for product variant generation. Our contributions are threefold. First, we show how various information extraction techniques can be used to extract ingredients and recipe actions from textual sources. Second, we describe how to store this highly connected information as a graph database with an extensible domain model. Third, we demonstrate how non-trivial search over this graph can aid experts in putting together a new product. In an ongoing proof of concept, we use 410 formulations of various cosmetic creams to demonstrate these capabilities, with promising results.
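To make the idea of storing formulations as a graph and searching over it concrete, the following is a minimal in-memory sketch in Python. It is not the system described above: the class `FormulationGraph`, the node labels (`Formulation`, `Ingredient`, `Function`), the relation names (`HAS_INGREDIENT`, `HAS_FUNCTION`), and the two example creams are all hypothetical and chosen only for illustration. A real deployment would presumably use a graph database rather than Python dictionaries.

```python
from collections import defaultdict


class FormulationGraph:
    """Tiny property graph: labeled nodes plus typed, directed edges."""

    def __init__(self):
        self.nodes = {}                     # node id -> {"label": ..., **properties}
        self.out_edges = defaultdict(list)  # source id -> [(relation, target id)]

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, **props}

    def add_edge(self, source, relation, target):
        self.out_edges[source].append((relation, target))

    def neighbors(self, source, relation):
        """Targets reachable from `source` over edges of type `relation`."""
        return [t for r, t in self.out_edges[source] if r == relation]

    def formulations_with(self, ingredient_id):
        """All formulations that list the given ingredient."""
        return sorted(
            n for n, data in self.nodes.items()
            if data["label"] == "Formulation"
            and ingredient_id in self.neighbors(n, "HAS_INGREDIENT")
        )

    def same_function_alternatives(self, ingredient_id):
        """Other ingredients sharing at least one function with the given one."""
        functions = set(self.neighbors(ingredient_id, "HAS_FUNCTION"))
        return sorted(
            n for n, data in self.nodes.items()
            if data["label"] == "Ingredient"
            and n != ingredient_id
            and functions & set(self.neighbors(n, "HAS_FUNCTION"))
        )


# Hypothetical example data (illustrative only, not taken from the 410 formulations).
g = FormulationGraph()
for name in ("cream_a", "cream_b"):
    g.add_node(name, "Formulation")
for name in ("glycerin", "propylene_glycol", "cetyl_alcohol"):
    g.add_node(name, "Ingredient")
for name in ("humectant", "emollient"):
    g.add_node(name, "Function")

g.add_edge("glycerin", "HAS_FUNCTION", "humectant")
g.add_edge("propylene_glycol", "HAS_FUNCTION", "humectant")
g.add_edge("cetyl_alcohol", "HAS_FUNCTION", "emollient")

g.add_edge("cream_a", "HAS_INGREDIENT", "glycerin")
g.add_edge("cream_a", "HAS_INGREDIENT", "cetyl_alcohol")
g.add_edge("cream_b", "HAS_INGREDIENT", "propylene_glycol")
g.add_edge("cream_b", "HAS_INGREDIENT", "cetyl_alcohol")

print(g.formulations_with("cetyl_alcohol"))
print(g.same_function_alternatives("glycerin"))
```

The second query is the kind of search that goes beyond keyword lookup: it traverses from an ingredient to its function and back to other ingredients, which is one plausible way to suggest substitutes when generating a product variant.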
