Bias in random forest variable importance measures: Illustrations, sources and a solution


[1] Leo Breiman, et al. Classification and Regression Trees, 1984.

[2] Johannes Gehrke, et al. Bias Correction in Classification Tree Construction, 2001, ICML.

[3] Hyunjoong Kim, et al. Classification Trees With Unbiased Multiway Splits, 2001.

[4] Cesare Furlanello, et al. GIS and the Random Forest Predictor: Integration in R for Tick-Borne Disease Risk Assessment, 2003.

[5] Robert P. Sheridan, et al. Random Forest: A Classification and Regression Tool for Compound Classification and QSAR Modeling, 2003, J. Chem. Inf. Comput. Sci.

[6] A. Luczak, et al. Estimating neuronal variable importance with Random Forest, 2003, 2003 IEEE 29th Annual Proceedings of Bioengineering Conference.

[7] Rajarshi Guha, et al. Development of Linear, Ensemble, and Nonlinear Models for the Prediction and Interpretation of the Biological Activity of a Set of PDGFR Inhibitors, 2004, J. Chem. Inf. Model.

[8] Mark R. Segal, et al. Few amino acid positions in rpoB are associated with most of the rifampin resistance in Mycobacterium tuberculosis, 2004, BMC Bioinformatics.

[9] K. Lunetta, et al. Screening large-scale association study data: exploiting interactions using random forests, 2004, BMC Genetics.

[10] M. Segal, et al. Relating HIV-1 Sequence Variation to Replication Capacity via Trees and Forests, 2004, Statistical applications in genetics and molecular biology.

[11] Wei Pan, et al. A comparative study of discriminating human heart failure etiology using gene expression profiles, 2005, BMC Bioinformatics.

[12] Ramón Díaz-Uriarte, et al. Gene selection and classification of microarray data using random forest, 2006, BMC Bioinformatics.

[13] K. Lunetta, et al. Identifying SNPs predictive of phenotype using random forests, 2005, Genetic epidemiology.

[14] M. J. Laan. Statistical Inference for Variable Importance, 2006.

[15] K. Hornik, et al. Unbiased Recursive Partitioning: A Conditional Inference Framework, 2006.

[16] Achim Zeileis, et al. Bias in random forest variable importance measures: Illustrations, sources and a solution, 2007, BMC Bioinformatics.

[17] Christopher James Langmead, et al. Structure-Based Chemical Shift Prediction Using Random Forests Non-Linear Regression, 2005, APBC.

[18] Chih-Jen Lin, et al. Combining SVMs with Various Feature Selection Strategies, 2006, Feature Extraction.

[19] Sinisa Pajevic, et al. Short-term prediction of mortality in patients with systemic lupus erythematosus: classification of outcomes using random forests, 2006, Arthritis and rheumatism.

[20] Andy Liaw, et al. Classification and Regression by randomForest, 2007.

[21] Carolin Strobl, et al. Unbiased split selection for classification trees based on the Gini Index, 2007, Comput. Stat. Data Anal.