Comparing mathematical and heuristic approaches for scientific data analysis

Abstract: Scientific data is often analyzed in the context of domain-specific problems such as failure diagnostics, predictive analysis, and computational estimation. These problems can be solved with mathematical models or with heuristic methods. In this paper we compare the two on an estimation problem: a heuristic approach based on mining stored data, and a mathematical approach based on applying state-of-the-art formulae. The goal is to estimate the results of scientific experiments given their input conditions. We present a comparative study based on sample space, time complexity, and data storage with respect to a real application in materials science, together with a performance evaluation on real materials science data that considers accuracy and efficiency. We find that both approaches have advantages and drawbacks in computational estimation, and that for the estimation problem studied in this paper the heuristic methods outperform the mathematical models. Similar arguments can be applied to other scientific problems such as failure diagnostics and predictive analysis.
