The role of classifiers in feature selection: number vs. nature

Wrapper feature selection approaches are widely used to select a small subset of relevant features from a dataset. However, wrappers typically rely on a single classifier when selecting features. Because each classifier is of a different nature and carries its own biases, different classifiers select different feature subsets. To address this problem, this thesis investigates the effects of using different classifiers for wrapper feature selection; more specifically, it investigates the effects of using different numbers of classifiers and classifiers of different natures.

This aim is achieved by proposing a new data mining method called Wrapper-based Decision Trees (WDT). WDT combines multiple classifiers from four families, namely Bayesian Network, Decision Tree, Nearest Neighbour and Support Vector Machine, to select relevant features, and visualises the relationships among the selected features using decision trees. The WDT method is applied to investigate the three research questions of this thesis: (1) the effect of the number of classifiers on feature selection results; (2) the effect of the nature of classifiers on feature selection results; and (3) which of the two, number or nature of classifiers, has the greater effect on feature selection results. Two types of user preference datasets derived from Human-Computer Interaction (HCI) are used with WDT to answer these three research questions.

The investigation revealed that both the number and the nature of classifiers greatly affect feature selection results. In terms of the number of classifiers, using few classifiers selected many relevant features, whereas using many classifiers selected few; in addition, using three classifiers resulted in highly accurate feature subsets. In terms of the nature of classifiers, the Decision Tree, Bayesian Network and Nearest Neighbour classifiers caused significant differences in both the number of features selected and the accuracy of those features. A comparison of the two factors revealed that the number of classifiers has a greater effect on feature selection than their nature.

The thesis makes contributions to three communities: data mining, feature selection, and HCI. For the data mining community, it proposes WDT, a new method that integrates multiple classifiers for feature selection with decision trees to select and visualise the most relevant features in a dataset. For the feature selection community, the results show that the number and nature of classifiers substantially affect the feature selection process, and the accompanying suggestions provide useful insight into choosing classifiers when performing feature selection. For the HCI community, the thesis demonstrates the usefulness of feature selection for identifying a small number of highly relevant features that determine the preferences of different users.
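To make the multi-classifier wrapper idea concrete, the sketch below shows one way such a scheme could be assembled with scikit-learn: each of four classifiers (one per family named in the abstract) runs its own wrapper-style sequential selection, the per-classifier subsets are combined by majority vote, and a decision tree is fitted on the consensus features so their relationships can be inspected. This is only a minimal illustration under stated assumptions, not the thesis's WDT implementation; the dataset, the naive Bayes stand-in for a Bayesian Network classifier, the subset size of 5 and the voting threshold are all illustrative choices.

```python
# Hypothetical sketch of consensus wrapper feature selection across classifier
# families, in the spirit of the WDT idea described above (not the thesis code).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True)  # illustrative dataset

# One classifier per family: Bayesian, Decision Tree, Nearest Neighbour, SVM.
# GaussianNB is used here as a simple stand-in for a Bayesian Network classifier.
classifiers = {
    "bayes": GaussianNB(),
    "tree": DecisionTreeClassifier(random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
    "svm": SVC(kernel="linear"),
}

# Wrapper selection: each classifier greedily builds its own feature subset,
# scored by cross-validated accuracy on that same classifier.
votes = np.zeros(X.shape[1], dtype=int)
for name, clf in classifiers.items():
    selector = SequentialFeatureSelector(
        clf, n_features_to_select=5, direction="forward", cv=5
    )
    selector.fit(X, y)
    votes += selector.get_support().astype(int)
    print(name, "selected features:", np.flatnonzero(selector.get_support()))

# Consensus: keep features chosen by at least half of the classifiers, then
# fit a shallow decision tree on them so the relationships among the selected
# features can be visualised.
consensus = np.flatnonzero(votes >= len(classifiers) // 2)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[:, consensus], y)
print("Consensus features:", consensus)
print(export_text(tree))
```

Varying the number of classifiers in the ensemble (and the families they are drawn from) in a sketch like this is one way to probe the number-versus-nature question the abstract poses, since the consensus subset shrinks or grows with the voting pool.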
