A Data Mining Environment for Modeling the Performance of Scientific Software

Complex scientific and engineering problems are today most often solved using public-domain or commercial libraries, or some form of problem-solving environment. Selecting the best software for a targeted application or computation is often difficult and sometimes intractable. We have proposed addressing this issue by “mining” the performance data of scientific software to generate knowledge that can be used to select software for a particular scientific problem, given some computational objectives. In this chapter we describe a framework, together with its software implementation, for mining such performance data and using the results to generate the knowledge needed to solve the software selection problem.
