Learning in the “Real World”

In this paper we define and characterize the process of developing a “real-world” Machine Learning application, together with its difficulties and relevant issues, and distinguish it from the popular practice of exploiting ready-to-use data sets. To this end, we analyze and summarize the lessons learned from applying Machine Learning techniques to a variety of problems. We believe that these lessons, though based primarily on our personal experience, generalize to a wider range of situations and are supported by the reported experiences of other researchers.
