Monotone models for prediction in data mining

This dissertation studies the incorporation of monotonicity constraints, as a type of domain knowledge, into the data mining process. Monotonicity constraints are enforced at two stages: data preparation and data modeling. The main contributions of the research are a novel procedure to test the degree of monotonicity of a real data set, a greedy algorithm to transform non-monotone data into monotone data, and extended and novel approaches for building monotone decision models. The results from simulation and real-world case studies show that enforcing monotonicity can considerably improve knowledge discovery and facilitate decision-making for end users by yielding more accurate, stable, and plausible decision models.
