Outside the Machine Learning Blackbox: Supporting Analysts Before and After the Learning Algorithm
[1] Donato Malerba,et al. A Comparative Analysis of Methods for Pruning Decision Trees , 1997, IEEE Trans. Pattern Anal. Mach. Intell..
[2] J. Ross Quinlan,et al. C4.5: Programs for Machine Learning , 1992 .
[3] Iain E. Buchan,et al. A unified modeling approach to data-intensive healthcare , 2009, The Fourth Paradigm.
[4] K. Abazajian,et al. The Seventh Data Release of the Sloan Digital Sky Survey , 2008, 0812.0649.
[5] A. Townsend Peterson,et al. Novel methods improve prediction of species' distributions from occurrence data , 2006 .
[6] Thomas Hofmann,et al. Learning to Rank with Nonsmooth Cost Functions , 2006, NIPS.
[7] N. Draper,et al. Applied Regression Analysis. , 1967 .
[8] Tim Oates,et al. The Effects of Training Set Size on Decision Tree Complexity , 1997, ICML.
[9] M. Knutson,et al. Scaling Local Species-habitat Relations to the Larger Landscape with a Hierarchical Spatial Count Model , 2007, Landscape Ecology.
[10] K. Pollock,et al. Experimental Analysis of the Auditory Detection Process on Avian Point Counts , 2007 .
[11] Yiming Yang,et al. A study of thresholding strategies for text categorization , 2001, SIGIR '01.
[12] Rich Caruana,et al. An empirical evaluation of supervised learning in high dimensions , 2008, ICML '08.
[13] Pedro M. Domingos. A Unified Bias-Variance Decomposition and its Applications , 2000, ICML.
[14] Michael J. Pazzani,et al. Reducing Misclassification Costs , 1994, ICML.
[15] Remco R. Bouckaert. Practical Bias Variance Decomposition , 2008, Australasian Conference on Artificial Intelligence.
[16] D. Fink,et al. Spatiotemporal exploratory models for broad-scale survey data. , 2010, Ecological applications : a publication of the Ecological Society of America.
[17] J. Michael Scott,et al. Predicting Species Occurrences: Issues of Accuracy and Scale , 2002 .
[18] Charles Elkan,et al. The Foundations of Cost-Sensitive Learning , 2001, IJCAI.
[19] Yoram Singer,et al. An Efficient Boosting Algorithm for Combining Preferences by , 2013 .
[20] Thomas G. Dietterich. An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization , 2000, Machine Learning.
[21] Aaron M Ellison,et al. Observer bias and the detection of low-density populations. , 2009, Ecological applications : a publication of the Ecological Society of America.
[22] Wynne Hsu,et al. Intuitive Representation of Decision Trees Using General Rules and Exceptions , 2000, AAAI/IAAI.
[23] J. Hanley,et al. The meaning and use of the area under a receiver operating characteristic (ROC) curve. , 1982, Radiology.
[24] Wesley M. Hochachka,et al. Sources of Variation in Singing Probability of Florida Grasshopper Sparrows, and Implications for Design and Analysis of Auditory Surveys , 2009 .
[25] Thomas G. Dietterich,et al. Error-Correcting Output Coding Corrects Bias and Variance , 1995, ICML.
[26] Maarten van Someren,et al. A Bias-Variance Analysis of a Real World Learning Problem: The CoIL Challenge 2000 , 2004, Machine Learning.
[27] Courtney J. Conway,et al. Progress toward developing field protocols for a North American marsh bird monitoring program , 2005 .
[28] Rich Caruana,et al. Greedy Attribute Selection , 1994, ICML.
[29] C. S. Robbins,et al. The Breeding Bird Survey: Its First Fifteen Years, 1965-1979 , 1987 .
[30] Eric Bauer,et al. An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants , 1999, Machine Learning.
[31] Hwee Tou Ng,et al. A Machine Learning Approach to Coreference Resolution of Noun Phrases , 2001, CL.
[32] Luis M. Carrascal,et al. Bias in Avian Sampling Effort Due to Human Preferences: An Analysis with Catalonian Birds (1900 - 2002) , 2006 .
[33] Rich Caruana,et al. Model compression , 2006, KDD '06.
[34] Salvatore J. Stolfo,et al. AdaCost: Misclassification Cost-Sensitive Boosting , 1999, ICML.
[35] Steve R. Gunn,et al. Result Analysis of the NIPS 2003 Feature Selection Challenge , 2004, NIPS.
[36] M. Yuan,et al. Model selection and estimation in regression with grouped variables , 2006 .
[37] Catherine S. Jarnevich,et al. Ensemble Habitat Mapping of Invasive Plant Species , 2010, Risk analysis : an official publication of the Society for Risk Analysis.
[38] Steve Kelling,et al. Data-Intensive Science: A New Paradigm for Biodiversity Studies , 2009 .
[39] Carolina Tovar,et al. Using Spatial Models to Predict Areas of Endemism and Gaps in the Protection of Andean Slope Birds , 2009 .
[40] Wray L. Buntine,et al. Learning classification trees , 1992 .
[41] C. Thomas,et al. Birds extend their ranges northwards , 1999, Nature.
[42] Stan Matwin,et al. Addressing the Curse of Imbalanced Training Sets: One-Sided Selection , 1997, ICML.
[43] Francis K. H. Quek,et al. Attribute bagging: improving accuracy of classifier ensembles by using random feature subsets , 2003, Pattern Recognit..
[44] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .
[45] Rich Caruana,et al. Ensemble selection from libraries of models , 2004, ICML.
[46] Pinar Donmez,et al. On the local optimality of LambdaRank , 2009, SIGIR.
[47] John Langford,et al. An iterative method for multi-class cost-sensitive learning , 2004, KDD.
[48] Walter Daelemans,et al. TiMBL: Tilburg Memory-Based Learner, version 2.0, Reference guide , 1998 .
[49] John Loughrey,et al. Using Early-Stopping to Avoid Overfitting in Wrapper-Based Feature Selection Employing Stochastic Search , 2005 .
[50] Peter J. Blancher,et al. Setting numerical population objectives for priority landbird species , 2005 .
[51] Rich Caruana,et al. Introduction to IND and recursive partitioning, version 1.0 , 1991 .
[52] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[53] Bernd Markert,et al. Biomonitoring with birds. , 2003 .
[54] Walter Daelemans,et al. Evaluation of Machine Learning Methods for Natural Language Processing Tasks , 2002, LREC.
[55] Ivanoe De Falco,et al. An evolutionary approach for automatically extracting intelligible classification rules , 2005, Knowledge and Information Systems.
[56] Wray L. Buntine,et al. A Further Comparison of Splitting Rules for Decision-Tree Induction , 1992, Machine Learning.
[57] Masoud Nikravesh,et al. Feature Extraction - Foundations and Applications , 2006, Feature Extraction.
[58] David B. Roy,et al. A northward shift of range margins in British Odonata , 2005 .
[59] J. Bart,et al. Reliability of Singing Bird Surveys: Changes in Observer Efficiency with Avian Density , 1984 .
[60] Filip Radlinski,et al. A support vector method for optimizing average precision , 2007, SIGIR.
[61] Lynette Hirschman,et al. A Model-Theoretic Coreference Scoring Scheme , 1995, MUC.
[62] Leo Breiman,et al. Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author) , 2001, Statistical Science.
[63] Susan Ratcliffe,et al. The Oxford dictionary of quotations by subject , 2010 .
[64] Thomas G. Dietterich,et al. Pruning Adaptive Boosting , 1997, ICML.
[65] Ben Shneiderman,et al. The healthcare singularity and the age of semantic medicine , 2009, The Fourth Paradigm.
[66] J. Langford,et al. FeatureBoost: A Meta-Learning Algorithm that Improves Model Robustness , 2000, ICML.
[67] Bhavani Raskutti,et al. Optimising area under the ROC curve using gradient descent , 2004, ICML.
[68] W. Thuiller,et al. Predicting species distribution: offering more than simple habitat models. , 2005, Ecology letters.
[69] Jude W. Shavlik,et al. in Advances in Neural Information Processing , 1996 .
[70] Gareth M. James,et al. Generalizations of the Bias/Variance Decomposition for Prediction Error , 1997 .
[71] Dale Schuurmans,et al. Boosting in the Limit: Maximizing the Margin of Learned Ensembles , 1998, AAAI/IAAI.
[72] Wesley M. Hochachka,et al. Data-Mining Discovery of Pattern and Process in Ecological Systems , 2007 .
[73] Denis Couvet,et al. Thermal range predicts bird population resilience to extreme high temperatures. , 2006, Ecology letters.
[74] Bernd Markert,et al. Chapter 1 Definitions, strategies and principles for bioindication/biomonitoring of the environment , 2003 .
[75] Veronique Hoste,et al. Optimization issues in machine learning of coreference resolution , 2005 .
[76] Kai Ming Ting,et al. Inducing Cost-Sensitive Trees via Instance Weighting , 1998, PKDD.
[77] Tom Fawcett,et al. An introduction to ROC analysis , 2006, Pattern Recognit. Lett..
[78] Thore Graepel,et al. Large Margin Rank Boundaries for Ordinal Regression , 2000 .
[79] Kai Ming Ting,et al. Boosting Trees for Cost-Sensitive Classifications , 1998, ECML.
[80] Miroslav Dudík,et al. Maximum Entropy Density Estimation with Generalized Regularization and an Application to Species Distribution Modeling , 2007, J. Mach. Learn. Res..
[81] Michael J. Pazzani,et al. Knowledge discovery from data? , 2000, IEEE Intell. Syst..
[82] R. Real,et al. AUC: a misleading measure of the performance of predictive distribution models , 2008 .
[83] Claire Cardie,et al. Recognizing and Organizing Opinions Expressed in the World Press , 2003, New Directions in Question Answering.
[84] Claire Cardie,et al. Improving Machine Learning Approaches to Noun Phrase Coreference Resolution , 2004 .
[85] Thomas G. Dietterich. Multiple Classifier Systems , 2000, Lecture Notes in Computer Science.
[86] Robert Tibshirani,et al. Bias, Variance and Prediction Error for Classification Rules , 1996 .
[87] Eugene Tuv,et al. Feature Selection Using Ensemble Based Ranking Against Artificial Contrasts , 2006, The 2006 IEEE International Joint Conference on Neural Network Proceedings.
[88] Juha Reunanen,et al. Overfitting in Making Comparisons Between Variable Selection Methods , 2003, J. Mach. Learn. Res..
[89] Pedro M. Domingos. A Unified Bias-Variance Decomposition and its Applications , 2000, ICML.
[90] Bianca Zadrozny,et al. Learning and making decisions when costs and probabilities are both unknown , 2001, KDD '01.
[91] J. Nichols,et al. Monitoring for conservation. , 2006, Trends in ecology & evolution.
[92] Catharine van Ingen,et al. Redefining ecological science using data , 2009, The Fourth Paradigm.
[93] Isabelle Guyon,et al. Winning the KDD Cup Orange Challenge with Ensemble Selection , 2009 .
[94] C. S. Wallace,et al. Coding Decision Trees , 1993, Machine Learning.
[95] Walter Daelemans,et al. Parameter optimization for machine-learning of word sense disambiguation , 2002, Natural Language Engineering.
[96] Falk Huettmann,et al. Current State of the Art for Statistical Modelling of Species Distributions , 2010 .
[97] R. Stolzenberg,et al. Multiple Regression Analysis , 2004 .
[98] Pedro M. Domingos. Knowledge Discovery Via Multiple Models , 1998, Intell. Data Anal..
[99] Rich Caruana,et al. Benefitting from the Variables that Variable Selection Discards , 2003, J. Mach. Learn. Res..
[100] Bogdan E. Popescu,et al. Predictive Learning via Rule Ensembles , 2008, 0811.1679.
[101] Tom Bylander,et al. Estimating Generalization Error on Two-Class Datasets Using Out-of-Bag Estimates , 2002, Machine Learning.
[102] John Platt,et al. Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods , 1999 .
[103] W. Koenig,et al. Spatial Autocorrelation and Local Disappearances in Wintering North American Birds , 2001 .
[104] R. Bonney,et al. Citizen Science: A Developing Tool for Expanding Science Knowledge and Scientific Literacy , 2009 .
[105] P. Daszak,et al. Predicting the global spread of H5N1 avian influenza , 2006, Proceedings of the National Academy of Sciences.
[106] Yvan Saeys,et al. New challenges for feature selection in data mining and knowledge discovery , 2008 .
[107] Jason Weston,et al. Gene Selection for Cancer Classification using Support Vector Machines , 2002, Machine Learning.
[108] J. Friedman. Greedy function approximation: A gradient boosting machine. , 2001 .
[109] G. Hooker. Generalized Functional ANOVA Diagnostics for High-Dimensional Functions of Dependent Variables , 2007 .
[110] J. Ross Quinlan,et al. Induction of Decision Trees , 1986, Machine Learning.
[111] Rich Caruana,et al. Data mining in metric space: an empirical analysis of supervised learning performance criteria , 2004, ROCAI.
[112] Niklaus E. Zimmermann,et al. Predicting tree species presence and basal area in Utah: A comparison of stochastic gradient boosting, generalized additive models, and tree-based methods , 2006 .
[113] Jerome H. Friedman,et al. On Bias, Variance, 0/1—Loss, and the Curse-of-Dimensionality , 2004, Data Mining and Knowledge Discovery.
[114] J. Heckman. Sample selection bias as a specification error , 1979 .
[115] G. De’ath,et al. Classification and Regression Trees: A Powerful Yet Simple Technique for Ecological Data Analysis , 2000 .
[116] Jaana Kekäläinen,et al. Cumulated gain-based evaluation of IR techniques , 2002, TOIS.
[117] Ron Kohavi,et al. Bias Plus Variance Decomposition for Zero-One Loss Functions , 1996, ICML.
[118] D. McClish. Analyzing a Portion of the ROC Curve , 1989, Medical decision making : an international journal of the Society for Medical Decision Making.
[119] Thorsten Joachims,et al. A support vector method for multivariate performance measures , 2005, ICML.
[120] David D. Lewis,et al. Applying Support Vector Machines to the TREC-2001 Batch Filtering and Routing Tasks , 2001, TREC.
[121] Igor Kononenko,et al. Cost-Sensitive Learning with Neural Networks , 1998, ECAI.
[122] Roger Sauter,et al. Introduction to Probability and Statistics for Engineers and Scientists , 2005, Technometrics.
[123] M. Fireman,et al. Multiple Regression Analysis of Soil Data , 1954 .
[124] S. Manel,et al. Evaluating presence-absence models in ecology: the need to account for prevalence , 2001 .
[125] Michael J. Pazzani,et al. Error reduction through learning multiple descriptions , 2004, Machine Learning.
[126] Isabelle Guyon,et al. An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res..
[127] Jacob Cohen. A Coefficient of Agreement for Nominal Scales , 1960 .
[128] C. Marshall. Encyclopedia of Life , 2008 .
[129] B. V. Horne,et al. Density as a Misleading Indicator of Habitat Quality , 1983 .
[130] Larry A. Rendell,et al. The Feature Selection Problem: Traditional Methods and a New Algorithm , 1992, AAAI.
[131] P. van der Putten,et al. A Bias-Variance Analysis of a Real World Learning Problem: The CoIL Challenge 2000 , 2004 .
[132] Yvan Saeys,et al. Robust Feature Selection Using Ensemble Feature Selection Techniques , 2008, ECML/PKDD.
[133] Stephen R. Baillie,et al. Migration Watch: an Internet survey to monitor spring migration in Britain and Ireland , 2006, Journal of Ornithology.
[134] A. Townsend Peterson,et al. Rethinking receiver operating characteristic analysis applications in ecological niche modeling , 2008 .
[135] W. Hochachka,et al. Density-dependent decline of host abundance resulting from a new infectious disease. , 2000, Proceedings of the National Academy of Sciences of the United States of America.
[136] Pat Langley,et al. Selection of Relevant Features and Examples in Machine Learning , 1997, Artif. Intell..
[137] N. Gotelli. Predicting Species Occurrences: Issues of Accuracy and Scale , 2003 .
[138] Aiko M. Hormann,et al. Programs for Machine Learning. Part I , 1962, Inf. Control..
[139] M. Pazzani. Influence of prior knowledge on concept acquisition: Experimental and computational results. , 1991 .
[140] John Bell,et al. A review of methods for the assessment of prediction errors in conservation presence/absence models , 1997, Environmental Conservation.
[141] Cândida Ferreira,et al. Gene Expression Programming: A New Adaptive Algorithm for Solving Problems , 2001, Complex Syst..
[142] Michael C. Mozer,et al. Optimizing Classifier Performance via an Approximation to the Wilcoxon-Mann-Whitney Statistic , 2003, ICML.
[143] Leo Breiman,et al. Random Forests , 2001, Machine Learning.
[144] Tom M. Mitchell,et al. Using the Future to Sort Out the Present: Rankprop and Multitask Learning for Medical Risk Evaluation , 1995, NIPS.
[145] Elie Bienenstock,et al. Neural Networks and the Bias/Variance Dilemma , 1992, Neural Computation.
[146] Claire Gardent,et al. Improving Machine Learning Approaches to Coreference Resolution , 2002, ACL.
[147] David W. Opitz,et al. Feature Selection for Ensembles , 1999, AAAI/IAAI.
[148] G. J. Niemi,et al. A comparison of on- and off-road bird counts: Do you need to go off road to count birds accurately? , 1995 .
[149] John Mingers,et al. An empirical comparison of selection measures for decision-tree induction , 2004, Machine Learning.
[150] Stephen D. Bay. Combining Nearest Neighbor Classifiers Through Multiple Feature Subsets , 1998, ICML.
[151] Yi Lin,et al. Support Vector Machines for Classification in Nonstandard Situations , 2002, Machine Learning.
[152] W. Kendall,et al. First-Time Observer Effects in the North American Breeding Bird Survey , 1996 .
[153] Rich Caruana,et al. Predicting good probabilities with supervised learning , 2005, ICML.
[154] Steve Kelling,et al. Mining citizen science data to predict prevalence of wild bird species , 2006, KDD '06.
[155] Carla E. Brodley,et al. Pruning Decision Trees with Misclassification Costs , 1998, ECML.
[156] John Mingers,et al. An Empirical Comparison of Pruning Methods for Decision Tree Induction , 1989, Machine Learning.
[157] Thorsten Joachims,et al. Making large scale SVM learning practical , 1998 .
[158] Claire Cardie,et al. Playing the Telephone Game: Determining the Hierarchical Structure of Perspective and Speech Expressions , 2004, COLING.
[159] H. Zou,et al. Regularization and variable selection via the elastic net , 2005 .
[160] Simon Ferrier,et al. Evaluating the predictive performance of habitat models developed using logistic regression , 2000 .
[161] Les G. Underhill,et al. The seminal legacy of the Southern African Bird Atlas Project , 2008 .
[162] Yoav Freund,et al. Experiments with a New Boosting Algorithm , 1996, ICML.
[163] D. Billman. Structural Biases in Concept Learning: Influences from Multiple Functions , 1996 .
[164] John Langford,et al. Cost-sensitive learning by cost-proportionate example weighting , 2003, Third IEEE International Conference on Data Mining.
[165] Alain Rakotomamonjy,et al. Optimizing Area Under Roc Curve with SVMs , 2004, ROCAI.
[166] Rich Caruana,et al. An empirical comparison of supervised learning algorithms , 2006, ICML.
[167] Sunil J Rao,et al. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis , 2003 .
[168] Curtis Wong,et al. Bringing the night sky closer: discoveries in the data deluge , 2009, The Fourth Paradigm.
[169] D. MacKenzie. Modeling the Probability of Resource Use: The Effect of, and Dealing with, Detecting a Species Imperfectly , 2006 .
[170] William W. Cohen. Fast Effective Rule Induction , 1995, ICML.
[171] Michael J. Pazzani,et al. Beyond Concise and Colorful: Learning Intelligible Rules , 1997, KDD.
[172] Heikki Mannila,et al. Principles of Data Mining , 2001, Undergraduate Topics in Computer Science.
[173] Tony Hey,et al. The Fourth Paradigm: Data-Intensive Scientific Discovery , 2009 .
[174] Peter D. Turney. Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm , 1994, J. Artif. Intell. Res..
[175] Jonathan Bart,et al. Reliability of the Breeding Bird Survey: Effects of restricting surveys to roads , 1995 .
[176] Brian L. Sullivan,et al. eBird: A citizen-based bird observation network in the biological sciences , 2009 .
[177] W. Link,et al. Observer differences in the North American Breeding Bird Survey , 1994 .
[178] George C. Runger,et al. Feature Selection with Ensembles, Artificial Variables, and Redundancy Elimination , 2009, J. Mach. Learn. Res..
[179] Pedro M. Domingos. MetaCost: a general method for making classifiers cost-sensitive , 1999, KDD '99.
[180] Peter A. Flach,et al. Learning Decision Trees Using the Area Under the ROC Curve , 2002, ICML.
[181] Lori E. Dodd,et al. Partial AUC Estimation and Regression , 2003, Biometrics.
[182] Rich Caruana,et al. Getting the Most Out of Ensemble Selection , 2006, Sixth International Conference on Data Mining (ICDM'06).
[183] Nathan Intrator,et al. Interpreting neural-network results: a simulation study , 2001 .
[184] Ron Kohavi,et al. Wrappers for Feature Subset Selection , 1997, Artif. Intell..
[185] D. Bystrak,et al. The role of observer bias in the North American Breeding Bird Survey , 1981 .
[186] Tin Kam Ho,et al. The Random Subspace Method for Constructing Decision Forests , 1998, IEEE Trans. Pattern Anal. Mach. Intell..
[187] Leo Breiman,et al. Classification and Regression Trees , 1984 .
[188] Nikunj C. Oza,et al. Online Ensemble Learning , 2000, AAAI/IAAI.