vtreat: a data.frame Processor for Predictive Modeling
Nina Zumel | John Mount
[1] R. Cody. Cody's Data Cleaning Techniques Using SAS, 2015.
[2] Trevor Hastie, et al. The Elements of Statistical Learning, 2001.
[3] Daniele Micci-Barreca, et al. A preprocessing scheme for high-cardinality categorical attributes in classification and prediction problems, 2001, SKDD.
[4] Alan C Elliott, et al. Preparing Data for Analysis Using Microsoft Excel, 2006, Journal of Investigative Medicine.
[5] Nina Zumel, et al. vtreat: A Statistically Sound 'data.frame' Processor/Conditioner, 2015.
[6] Bernd Bischl, et al. mlr: Machine Learning in R, 2016, J. Mach. Learn. Res.
[7] Ralph Kimball, et al. The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data, 2004.
[8] R Core Team, et al. R: A language and environment for statistical computing, 2014.
[9] Dorian Pyle, et al. Data Preparation for Data Mining, 1999.
[10] S. Geer, et al. General oracle inequalities for model selection, 2009.
[11] M. J. van der Laan, et al. Super Learner, 2010, Statistical Applications in Genetics and Molecular Biology.
[12] Grzegorz Swirszcz, et al. On cross-validation and stacking: building seemingly predictive models on random data, 2011, SKDD.
[13] Theodore Johnson, et al. Exploratory Data Mining and Data Cleaning, 2003.
[14] Nina Zumel, et al. Practical Data Science with R, 2014.
[15] Robert E. Sweeney, et al. A Transformation for Simplifying the Interpretation of Coefficients of Binary Variables in Regression Analysis, 1972.
[16] D. Freedman. A Note on Screening Regression Equations, 1983.
[17] Gary King, et al. Amelia II: A Program for Missing Data, 2011.
[18] Jean-Michel Poggi, et al. Variable selection using random forests, 2010, Pattern Recognit. Lett.
[19] Stef van Buuren, et al. MICE: Multivariate Imputation by Chained Equations in R, 2011.
[20] J. Tukey. The Future of Data Analysis, 1962.