Paper information - Learning understandable classifier models. - 字舞流文

Learning understandable classifier models.

Jan Chorowski | J. Chorowski

[1] Robert A. Lordo,et al. Learning from Data: Concepts, Theory, and Methods , 2001, Technometrics.

[2] Wee Kheng Leow,et al. FERNN: An Algorithm for Fast Extraction of Rules from Neural Networks , 2004, Applied Intelligence.

[3] David J. Field,et al. Sparse coding with an overcomplete basis set: A strategy employed by V1? , 1997, Vision Research.

[4] Bart Baesens,et al. Decompositional Rule Extraction from Support Vector Machines by Active Learning , 2009, IEEE Transactions on Knowledge and Data Engineering.

[5] Martin Fodslette Møller,et al. A scaled conjugate gradient algorithm for fast supervised learning , 1993, Neural Networks.

[6] Chih-Jen Lin,et al. LIBSVM: A library for support vector machines , 2011, TIST.

[7] Yann LeCun,et al. Dimensionality Reduction by Learning an Invariant Mapping , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[8] Pedro M. Domingos. Knowledge Acquisition from Examples Via Multiple Models , 1997 .

[9] Donald C. Wunsch,et al. Neural network explanation using inversion , 2007, Neural Networks.

[10] Alberto Maria Segre,et al. Programs for Machine Learning , 1994 .

[11] Tin Kam Ho,et al. The Random Subspace Method for Constructing Decision Forests , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[12] Randal E. Bryant. Binary decision diagrams and beyond: enabling technologies for formal verification , 1995, ICCAD.

[13] Christel Baier,et al. A uniform framework for weighted decision diagrams and its implementation , 2008, International Journal on Software Tools for Technology Transfer.

[14] Ron Kohavi,et al. Bottom-Up Induction of Oblivious Read-Once Decision Graphs: Strengths and Limitations , 1994, AAAI.

[15] Marc'Aurelio Ranzato,et al. Sparse Feature Learning for Deep Belief Networks , 2007, NIPS.

[16] Bart Baesens,et al. ITER: An Algorithm for Predictive Regression Rule Extraction , 2006, DaWaK.

[17] Jude W. Shavlik,et al. Extracting refined rules from knowledge-based neural networks , 2004, Machine Learning.

[18] Nitesh V. Chawla,et al. SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..

[19] Olcay Boz,et al. Extracting decision trees from trained neural networks , 2002, KDD.

[20] Ryszard S. Michalski,et al. Knowledge acquisition by encoding expert rules versus computer induction from examples: a case study involving soybean pathology , 1999, Int. J. Hum. Comput. Stud..

[21] Jacek M. Zurada,et al. Top-Down Induction of Reduced Ordered Decision Diagrams from Neural Networks , 2011, ICANN.

[22] Bart Baesens,et al. Using Rule Extraction to Improve the Comprehensibility of Predictive Models , 2006 .

[23] Vladimir Vapnik,et al. An overview of statistical learning theory , 1999, IEEE Trans. Neural Networks.

[24] Daniel Rivero,et al. A New Approach to the Extraction of ANN Rules and to Their Generalization Capacity Through GP , 2004, Neural Computation.

[25] Donald F. Specht,et al. Probabilistic neural networks , 1990, Neural Networks.

[26] R. Tibshirani,et al. Least angle regression , 2004, math/0406456.

[27] Matti Pietikäinen,et al. A comparative study of texture measures with classification based on featured distributions , 1996, Pattern Recognit..

[28] Jenq-Neng Hwang,et al. Nonparametric multivariate density estimation: a comparative study , 1994, IEEE Trans. Signal Process..

[29] Randal E. Bryant,et al. Verification of Arithmetic Circuits with Binary Moment Diagrams , 1995, 32nd Design Automation Conference.

[30] Aapo Hyvärinen,et al. Fast and robust fixed-point algorithms for independent component analysis , 1999, IEEE Trans. Neural Networks.

[31] Wei Jia,et al. Discriminant sparse neighborhood preserving embedding for face recognition , 2012, Pattern Recognit..

[32] Johannes Fürnkranz,et al. ROC ‘n’ Rule Learning—Towards a Better Understanding of Covering Algorithms , 2005, Machine Learning.

[33] Bart Baesens,et al. Decision Diagrams in Machine Learning: An Empirical Study on Real-Life Credit-Risk Data , 2004, Diagrams.

[34] Robert K. Brayton,et al. Heuristic Minimization of BDDs Using Don't Cares , 1994, 31st Design Automation Conference.

[35] Klaus-Robert Müller,et al. Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.

[36] Randal E. Bryant,et al. Verification of arithmetic circuits using binary moment diagrams , 2001, International Journal on Software Tools for Technology Transfer.

[37] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[38] Sebastian Thrun,et al. The MONK's Problems-A Performance Comparison of Different Learning Algorithms, CMU-CS-91-197, Sch , 1991 .

[39] William W. Cohen. Fast Effective Rule Induction , 1995, ICML.

[40] Bart Baesens,et al. Recursive Neural Network Rule Extraction for Data With Mixed Attributes , 2008, IEEE Transactions on Neural Networks.

[41] Rudy Setiono. Extracting M-of-N rules from trained neural networks , 2000, IEEE Trans. Neural Networks Learn. Syst..

[42] J. Ross Quinlan,et al. Learning logical definitions from relations , 1990, Machine Learning.

[43] D. Hubel,et al. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex , 1962, The Journal of physiology.

[44] Massoud Pedram,et al. Factored Edge-Valued Binary Decision Diagrams , 1997, Formal Methods Syst. Des..

[45] Honglak Lee,et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.

[46] Nello Cristianini,et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[47] Fu Jie Huang,et al. A Tutorial on Energy-Based Learning , 2006 .

[48] M. C. Jones,et al. A Brief Survey of Bandwidth Selection for Density Estimation , 1996 .

[49] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.

[50] Jacek M. Zurada,et al. Toward Better Understanding of Protein Secondary Structure: Extracting Prediction Rules , 2011, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[51] Robert H. Halstead,et al. Matrix Computations , 2011, Encyclopedia of Parallel Computing.

[52] Bu. Park,et al. Rejoinder to "Practical performance of several data driven bandwidth selectors" , 1992 .

[53] Jacek M. Zurada,et al. Obtaining Full Regularization Paths for Robust Sparse Coding with Applications to Face Recognition , 2012, 2012 11th International Conference on Machine Learning and Applications.

[54] Joachim Diederich,et al. Eclectic Rule-Extraction from Support Vector Machines , 2005 .

[55] Yoshua Bengio,et al. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..

[56] Ian H. Witten,et al. Generating Accurate Rule Sets Without Global Optimization , 1998, ICML.

[57] Mark Craven,et al. Rule Extraction: Where Do We Go from Here? , 1999 .

[58] R. Tibshirani,et al. 1-norm Support Vector Machines , 2003 .

[59] Erkki Oja,et al. Independent component analysis: algorithms and applications , 2000, Neural Networks.

[60] Yaser S. Abu-Mostafa,et al. Learning from hints in neural networks , 1990, J. Complex..

[61] Jude W. Shavlik,et al. Extracting Tree-Structured Representations of Trained Networks , 1995 .

[62] Rich Caruana,et al. An empirical comparison of supervised learning algorithms , 2006, ICML.

[63] Ji Zhu,et al. Boosting as a Regularized Path to a Maximum Margin Classifier , 2004, J. Mach. Learn. Res..

[64] Russell Reed,et al. Pruning algorithms-a survey , 1993, IEEE Trans. Neural Networks.

[65] Jacek M. Zurada,et al. Perturbation method for deleting redundant inputs of perceptron networks , 1997, Neurocomputing.

[66] Yann LeCun,et al. Optimal Brain Damage , 1989, NIPS.

[67] Shuzo Yajima,et al. The Complexity of the Optimal Variable Ordering Problems of Shared Binary Decision Diagrams , 1993, ISAAC.

[68] Leo Breiman,et al. Random Forests , 2001, Machine Learning.

[69] Bernhard E. Boser,et al. A training algorithm for optimal margin classifiers , 1992, COLT '92.

[70] Randal E. Bryant,et al. Efficient implementation of a BDD package , 1991, DAC '90.

[71] Jim Esch. Computational Intelligence Methods For Rule-Based Data Understanding , 2004, Proc. IEEE.

[72] P. Paatero. Least squares formulation of robust non-negative factor analysis , 1997 .

[73] Masumi Ishikawa,et al. Structural learning with forgetting , 1996, Neural Networks.

[74] Ian H. Witten,et al. The WEKA data mining software: an update , 2009, SKDD.

[75] Peter Sollich,et al. Probabilistic Methods for Support Vector Machines , 1999, NIPS.

[76] John Platt,et al. Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods , 1999 .

[77] Honglak Lee,et al. Unsupervised learning of hierarchical representations with convolutional deep belief networks , 2011, Commun. ACM.

[78] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.

[79] Scott Sanner,et al. Affine Algebraic Decision Diagrams (AADDs) and their Application to Structured Probabilistic Inference , 2005, IJCAI.

[80] Jochen Bern,et al. Boolean manipulation with free BDD's. First experimental results , 1994, Proceedings of European Design and Test Conference EDAC-ETC-EUROASIC.

[81] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .

[82] Tiziano Villa,et al. Exact Minimization of Binary Decision Diagrams Using Implicit Techniques , 1998, IEEE Trans. Computers.

[83] Yoav Freund,et al. Boosting the margin: A new explanation for the effectiveness of voting methods , 1997, ICML.

[84] Guillermo Sapiro,et al. Supervised Dictionary Learning , 2008, NIPS.

[85] Henrik Reif Andersen,et al. Difference Decision Diagrams , 1999, CSL.

[86] Fabio Somenzi,et al. Symmetry detection and dynamic variable ordering of decision diagrams , 1994, ICCAD '94.

[87] Usama M. Fayyad,et al. Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning , 1993, IJCAI.

[88] R. Michalski. Attributional Calculus: A Logic and Representation Language for Natural Induction , 2004 .

[89] David Haussler,et al. Learnability and the Vapnik-Chervonenkis dimension , 1989, JACM.

[90] Yung-Te Lai,et al. Edge-valued binary decision diagrams for multi-level hierarchical verification , 1992, DAC '92.

[91] Jacek M. Zurada,et al. Review and performance comparison of SVM- and ELM-based classifiers , 2014, Neurocomputing.

[92] Daniel T. Larose,et al. Discovering Knowledge in Data: An Introduction to Data Mining , 2005 .

[93] Zhi-Hua Zhou,et al. Medical diagnosis with C4.5 rule preceded by artificial neural network ensemble , 2003, IEEE Transactions on Information Technology in Biomedicine.

[94] Enrico Macii,et al. Algebraic decision diagrams and their applications , 1993, Proceedings of 1993 International Conference on Computer Aided Design (ICCAD).

[95] H. Zou,et al. Regularization and variable selection via the elastic net , 2005 .

[96] Marc'Aurelio Ranzato,et al. Semi-supervised learning of compact document representations with deep networks , 2008, ICML '08.

[97] Paulo J. G. Lisboa,et al. Orthogonal search-based rule extraction (OSRE) for trained neural networks: a practical and efficient approach , 2006, IEEE Transactions on Neural Networks.

[98] Peter Clark,et al. The CN2 Induction Algorithm , 1989, Machine Learning.

[99] P. J. Green,et al. Density Estimation for Statistics and Data Analysis , 1987 .

[100] Vladimir Vapnik,et al. Statistical learning theory , 1998 .

[101] H. Andersen. An Introduction to Binary Decision Diagrams , 1997 .

[102] Alberto L. Sangiovanni-Vincentelli,et al. Learning Complex Boolean Functions: Algorithms and Applications , 1993, NIPS.

[103] John C. Platt,et al. Fast training of support vector machines using sequential minimal optimization, advances in kernel methods , 1999 .

[104] Yurii Nesterov,et al. Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.

[105] Chee Kheong Siew,et al. Extreme learning machine: Theory and applications , 2006, Neurocomputing.

[106] Patrik O. Hoyer,et al. Non-negative Matrix Factorization with Sparseness Constraints , 2004, J. Mach. Learn. Res..

[107] Geoffrey E. Hinton,et al. Simplifying Neural Networks by Soft Weight-Sharing , 1992, Neural Computation.

[108] D K Smith,et al. Numerical Optimization , 2001, J. Oper. Res. Soc..

[109] Mee Young Park,et al. L1‐regularization path algorithm for generalized linear models , 2007 .

[110] Bart Baesens,et al. Using Neural Network Rule Extraction and Decision Tables for Credit-Risk Evaluation , 2003, Manag. Sci..

[111] Honglak Lee,et al. Sparse deep belief net model for visual area V2 , 2007, NIPS.

[112] Giovanna Castellano,et al. An iterative pruning algorithm for feedforward neural networks , 1997, IEEE Trans. Neural Networks.

[113] Peter A. Beerel,et al. Safe BDD minimization using don't cares , 1997, DAC.

[114] Jude Shavlik,et al. Refinement of Approximate Domain Theories by Knowledge-Based Neural Networks , 1990, AAAI.

[115] Marc'Aurelio Ranzato,et al. A Unified Energy-Based Framework for Unsupervised Learning , 2007, AISTATS.

[116] Ron Kohavi,et al. Oblivious Decision Trees, Graphs, and Top-Down Pruning , 1995, IJCAI.

[117] S. Rosset,et al. Piecewise linear regularized solution paths , 2007, 0708.2197.

[118] Marc'Aurelio Ranzato,et al. Unsupervised Learning of Feature Hierarchies , 2009 .

[119] Ke Huang,et al. Sparse Representation for Signal Classification , 2006, NIPS.

[120] Marek A. Perkowski,et al. Multi-valued functional decomposition as a machine learning method , 1998, Proceedings. 1998 28th IEEE International Symposium on Multiple- Valued Logic (Cat. No.98CB36138).

[121] Joydeep Ghosh,et al. Symbolic Interpretation of Artificial Neural Networks , 1999, IEEE Trans. Knowl. Data Eng..

[122] Jude W. Shavlik,et al. Using Sampling and Queries to Extract Rules from Trained Neural Networks , 1994, ICML.

[123] M. R. Osborne,et al. On the LASSO and its Dual , 2000 .

[124] Jochen Bern,et al. Some heuristics for generating tree-like FBDD types , 1996, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..

[125] Yoav Freund,et al. Experiments with a New Boosting Algorithm , 1996, ICML.

[126] David G. Lowe,et al. Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[127] Ying Wu,et al. Robust 3D Action Recognition with Random Occupancy Patterns , 2012, ECCV.

[128] Randal E. Bryant. Graph-Based Algorithms for Boolean Function Manipulation , 1986 .

[129] Robert Tibshirani,et al. 1-norm Support Vector Machines , 2003, NIPS.

[130] Madhuri Jha. ANN-DT : An Algorithm for Extraction of Decision Trees from Artificial Neural Networks , 2013 .

[131] Guillermo Sapiro,et al. Discriminative learned dictionaries for local image analysis , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[132] Marc'Aurelio Ranzato,et al. Efficient Learning of Sparse Representations with an Energy-Based Model , 2006, NIPS.

[133] H. Sebastian Seung,et al. Learning the parts of objects by non-negative matrix factorization , 1999, Nature.

[134] J. Ross Quinlan,et al. C4.5: Programs for Machine Learning , 1992 .

[135] Fabio Somenzi,et al. Efficient manipulation of decision diagrams , 2001, International Journal on Software Tools for Technology Transfer.

[136] Bart Baesens,et al. Minerva: Sequential Covering for Rule Extraction , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[137] T.,et al. Training Feedforward Networks with the Marquardt Algorithm , 2004 .

[138] Hendrik Blockeel,et al. Seeing the Forest Through the Trees: Learning a Comprehensible Model from an Ensemble , 2007, ECML.

[139] LiMin Fu,et al. Rule Generation from Neural Networks , 1994, IEEE Trans. Syst. Man Cybern. Syst..

[140] Jacek M. Zurada,et al. Introduction to artificial neural systems , 1992 .

[141] Ingo Wegener,et al. On the complexity of minimizing the OBDD size for incompletely specified functions , 1996, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..

[142] Gregory J. Wolff,et al. Optimal Brain Surgeon and general network pruning , 1993, IEEE International Conference on Neural Networks.

[143] Richard Rudell. Dynamic variable ordering for ordered binary decision diagrams , 1993, ICCAD.

[144] Joachim Diederich,et al. Survey and critique of techniques for extracting rules from trained artificial neural networks , 1995, Knowl. Based Syst..

[145] J. Simonoff. Multivariate Density Estimation , 1996 .

[146] Xiaoyang Tan,et al. Pattern Recognition , 2016, Communications in Computer and Information Science.

[147] Brian R. Gaines,et al. Transforming Rules and Trees into Comprehensible Knowledge Structures , 2000 .

[148] Jacek M. Zurada,et al. Extracting Rules From Neural Networks as Decision Diagrams , 2011, IEEE Transactions on Neural Networks.

[149] Jorge Nocedal,et al. Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization , 1997, TOMS.

[150] Allen Y. Yang,et al. Robust Face Recognition via Sparse Representation , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[151] Leslie G. Valiant,et al. A theory of the learnable , 1984, STOC '84.

[152] Alberto L. Sangiovanni-Vincentelli,et al. Using the minimum description length principle to infer reduced ordered decision graphs , 1996, Machine Learning.

[153] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.

[154] Vladimir Cherkassky,et al. Learning from Data: Concepts, Theory, and Methods , 1998 .