Random Forest Model Building Using A Priori Information for Diagnosis

The problem of inductive model building from precedents for biomedical applications is considered. The model paradigm is a random forest: a set of decision tree classifiers working as an ensemble. The proposed method of random forest model building uses a priori information extracted from the training data set to obtain a more accurate model while preserving the generally random character of the method. The resulting random forest is more accurate than a single decision tree; moreover, compared with known methods of random forest construction, the proposed method yields more accurate models.
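The abstract does not specify how the a priori information enters the forest construction, so the sketch below is only one plausible reading, not the paper's algorithm: per-feature informativeness estimated from the training set (here, mutual information with the class) biases the random feature-subspace sampling of each tree, while bootstrap sampling of precedents keeps the method's random character. All names (build_forest, predict, subspace_size) are hypothetical, and scikit-learn's DecisionTreeClassifier and mutual_info_classif merely stand in for the base learner and the informativeness measure.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.tree import DecisionTreeClassifier

def build_forest(X, y, n_trees=100, subspace_size=None, seed=None):
    """Bagged forest whose per-tree feature subspaces are drawn with
    probabilities proportional to an a priori informativeness estimate
    taken from the training set (hypothetical reading of the method)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    if subspace_size is None:
        subspace_size = max(1, int(np.sqrt(n_features)))

    # A priori information: mutual information of each feature with the class.
    weights = mutual_info_classif(X, y, random_state=seed) + 1e-9
    probs = weights / weights.sum()

    forest = []
    for _ in range(n_trees):
        # Bootstrap sample of precedents preserves the random character.
        rows = rng.integers(0, n_samples, size=n_samples)
        # Informativeness-biased random subspace of features.
        cols = rng.choice(n_features, size=subspace_size, replace=False, p=probs)
        tree = DecisionTreeClassifier(random_state=0)
        tree.fit(X[np.ix_(rows, cols)], y[rows])
        forest.append((tree, cols))
    return forest

def predict(forest, X):
    """Majority vote of the trees; assumes integer class labels."""
    votes = np.stack([tree.predict(X[:, cols]) for tree, cols in forest])
    return np.apply_along_axis(
        lambda v: np.bincount(v.astype(int)).argmax(), 0, votes)
```

Usage on a data set with integer class labels: forest = build_forest(X_train, y_train, n_trees=100); y_pred = predict(forest, X_test).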
