Machine Learning Based on Attribute Interactions
[1] G. Faè,et al. The physical review , 1895 .
[2] R. Fisher. On an Absolute Criterion for Fitting Frequency Curves. , 1912 .
[3] R. Fisher. On the Interpretation of χ2 from Contingency Tables, and the Calculation of P , 2018, Journal of the Royal Statistical Society Series A (Statistics in Society).
[4] M. Bartlett. Contingency Table Interactions , 1935 .
[5] R. Fisher. THE FIDUCIAL ARGUMENT IN STATISTICAL INFERENCE , 1935 .
[6] R. Tolman,et al. The Principles of Statistical Mechanics. By R. C. Tolman. Pp. xix, 661. 40s. 1938. International series of monographs on physics. (Oxford) , 1939, The Mathematical Gazette.
[7] J. Kirkwood,et al. The Radial Distribution Function in Liquids , 1942 .
[8] G. Brier. VERIFICATION OF FORECASTS EXPRESSED IN TERMS OF PROBABILITY , 1950 .
[9] R. Kikuchi. A Theory of Cooperative Phenomena , 1951 .
[10] R. A. Leibler,et al. On Information and Sufficiency , 1951 .
[11] William J. McGill. Multivariate information transmission , 1954, Trans. IRE Prof. Group Inf. Theory.
[12] G. A. Miller,et al. Note on the bias of information estimates , 1955 .
[13] Marvin A. Kastenbaum,et al. On the Hypothesis of No "Interaction" In a Multi-way Contingency Table , 1956 .
[14] E. Jaynes. Information Theory and Statistical Mechanics , 1957 .
[15] Michael Satosi Watanabe,et al. Information Theoretical Analysis of Multivariate Correlation , 1960, IBM J. Res. Dev..
[16] C. Rajski,et al. A Metric Space of Discrete Probability Distributions , 1961, Inf. Control..
[17] G. A. Barnard,et al. Transmission of Information: A Statistical Theory of Communications. , 1961 .
[18] J. Morgan,et al. Problems in the Analysis of Survey Data, and a Proposal , 1963 .
[19] Irving John Good,et al. The Estimation of Probabilities: An Essay on Modern Bayesian Methods , 1965 .
[20] Lotfi A. Zadeh,et al. Fuzzy Sets , 1996, Inf. Control..
[21] S. Kullback. Probability Densities with Given Marginals , 1968 .
[22] C. N. Liu,et al. Approximating discrete probability distributions with dependence trees , 1968, IEEE Trans. Inf. Theory.
[23] H. O. Lancaster. The chi-squared distribution , 1971 .
[24] W. Harris,et al. Traumatic arthritis of the hip after dislocation and acetabular fractures: treatment by mold arthroplasty. An end-result study using a new method of result evaluation. , 1969, The Journal of bone and joint surgery. American volume.
[25] M. Hinich,et al. An Expository Development of a Mathematical Model of the Electoral Process , 1970, American Political Science Review.
[26] J. J. Freeman. Note on approximating discrete probability distributions (Corresp.) , 1971, IEEE Trans. Inf. Theory.
[27] J. Darroch,et al. Generalized Iterative Scaling for Log-Linear Models , 1972 .
[28] A. H. Murphy. A New Vector Partition of the Probability Score , 1973 .
[29] J. Darroch. Multiplicative and additive interaction in contingency tables , 1974 .
[30] M. Stone. Cross‐Validatory Choice and Assessment of Statistical Predictions , 1976 .
[31] Te Sun Han,et al. Linear Dependence Structure of the Entropy Space , 1975, Inf. Control..
[32] Andrew K. C. Wong,et al. Typicality, Diversity, and Feature Pattern of an Ensemble , 1975, IEEE Transactions on Computers.
[33] M. Degroot,et al. Probability and Statistics , 2021, Examining an Operational Approach to Teaching Probability.
[34] A. N. Tikhonov,et al. Solutions of ill-posed problems , 1977 .
[35] Te Sun Han. Nonnegative Entropy Measures of Multivariate Symmetric Correlations , 1978, Inf. Control..
[36] T. Speed,et al. Markov Fields and Log-Linear Interaction Models for Contingency Tables , 1980 .
[37] Te Sun Han,et al. Multiple Mutual Informations and Multiple Interactions in Frequency Data , 1980, Inf. Control..
[38] D. Rubin. The Bayesian Bootstrap , 1981 .
[39] I. Good. Good Thinking: The Foundations of Probability and Its Applications , 1983 .
[40] B. Efron,et al. A Leisurely Look at the Bootstrap, the Jackknife, and , 1983 .
[41] Leslie G. Valiant,et al. A theory of the learnable , 1984, CACM.
[42] A. P. Dawid,et al. Present position and potential developments: some personal views , 1984 .
[43] S. Salthe. Evolving Hierarchical Systems: Their Structure and Representation , 1985 .
[44] David L. Waltz,et al. Toward memory-based reasoning , 1986, CACM.
[45] J. Rissanen. Stochastic Complexity and Modeling , 1986 .
[46] D. A. Kenny,et al. The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. , 1986, Journal of personality and social psychology.
[47] Klaus Krippendorff,et al. Information Theory: Structural Models for Qualitative Data. , 1988 .
[48] Alen D. Shapiro,et al. Structured induction in expert systems , 1987 .
[49] Herman Rubin,et al. A WEAK SYSTEM OF AXIOMS FOR "RATIONAL" BEHAVIOR AND THE NONSEPARABILITY OF UTILITY FROM PRIOR , 1987 .
[50] Judea Pearl,et al. Probabilistic reasoning in intelligent systems , 1988 .
[51] J. N. Kapur. Maximum-entropy models in science and engineering , 1992 .
[52] Steven W. Norton. Generating Better Decision Trees , 1989, IJCAI.
[53] Bojan Cestnik,et al. Estimating Probabilities: A Crucial Task in Machine Learning , 1990, ECAI.
[54] Igor Kononenko,et al. Semi-Naive Bayesian Classifier , 1991, EWSL.
[55] A. Agresti,et al. Categorical Data Analysis , 1991, International Encyclopedia of Statistical Science.
[56] Wray L. Buntine. Classifiers: A Theoretical and Empirical Study , 1991, IJCAI.
[57] Thomas M. Cover,et al. Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing) , 2006 .
[58] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[59] I. Csiszár. Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems , 1991 .
[60] Raymond W. Yeung,et al. A new outlook of Shannon's information measures , 1991, IEEE Trans. Inf. Theory.
[61] Donald Michie,et al. Use of sequential Bayes with class probability trees , 1991 .
[62] Ali S. Hadi,et al. Finding Groups in Data: An Introduction to Cluster Analysis , 1991 .
[63] Larry A. Rendell,et al. A Practical Approach to Feature Selection , 1992, ML.
[64] J. Ross Quinlan,et al. C4.5: Programs for Machine Learning , 1992 .
[65] Elie Bienenstock,et al. Neural Networks and the Bias/Variance Dilemma , 1992, Neural Computation.
[66] David Haussler,et al. Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications , 1992, Inf. Comput..
[67] Larry A. Rendell,et al. Lookahead Feature Construction for Learning Hard Concepts , 1993, International Conference on Machine Learning.
[68] Usama M. Fayyad,et al. Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning , 1993, IJCAI.
[69] Stanley N. Salthe,et al. Development and Evolution: Complexity and Change in Biology , 1993 .
[70] C. Judd,et al. Statistical difficulties of detecting interactions and moderator effects. , 1993, Psychological bulletin.
[71] T. Tsujishita,et al. On Triple Mutual Information , 1994 .
[72] David H. Wolpert,et al. Estimating Functions of Distributions from A Finite Set of Samples, Part 2: Bayes Estimators for Mutual Information, Chi-Squared, Covariance and other Statistics , 1994, comp-gas/9403002.
[73] Malcolm R. Forster,et al. How to Tell When Simpler, More Unified, or Less Ad Hoc Theories will Provide More Accurate Predictions , 1994, The British Journal for the Philosophy of Science.
[74] Pat Langley,et al. Induction of Selective Bayesian Classifiers , 1994, UAI.
[75] F. Kianifard. Applied Multivariate Data Analysis: Volume II: Categorical and Multivariate Methods , 1994 .
[76] Michael J. Pazzani,et al. Searching for Dependencies in Bayesian Classifiers , 1995, AISTATS.
[77] David R. Wolf,et al. Estimating functions of probability distributions from a finite set of samples. , 1994, Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics.
[78] Ron Kohavi,et al. Wrappers for performance enhancement and oblivious decision graphs , 1995 .
[79] David B. Dunson,et al. Bayesian Data Analysis , 2010 .
[80] Carl M. Kadie,et al. SEER: maximum likelihood regression for learning-speed curves , 1995 .
[81] Christopher M. Bishop,et al. Current address: Microsoft Research, , 2022 .
[82] Y. Benjamini,et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing , 1995 .
[83] Robert Tibshirani,et al. Discriminant Adaptive Nearest Neighbor Classification , 1995, IEEE Trans. Pattern Anal. Mach. Intell..
[84] Wray L. Buntine. A Guide to the Literature on Learning Probabilistic Networks from Data , 1996, IEEE Trans. Knowl. Data Eng..
[85] Bernhard Sendhoff,et al. How to Determine the Redundancy of Noisy Chaotic Time Series , 1996 .
[86] David Heckerman,et al. Knowledge Representation and Inference in Similarity Networks and Bayesian Multinets , 1996, Artif. Intell..
[87] Ron Kohavi,et al. Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid , 1996, KDD.
[88] Daphne Koller,et al. Toward Optimal Feature Selection , 1996, ICML.
[89] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .
[90] P. Groenen,et al. Modern multidimensional scaling , 1996 .
[91] Adam L. Berger,et al. A Maximum Entropy Approach to Natural Language Processing , 1996, CL.
[92] Mia Hubert,et al. Integrating robust clustering techniques in S-PLUS , 1997 .
[93] Jonathan Baxter,et al. The Canonical Distortion Measure for Vector Quantization and Function Approximation , 1997, ICML.
[94] Paul M. B. Vitányi,et al. The miraculous universal distribution , 1997 .
[95] Larry A. Rendell,et al. Global Data Analysis and the Fragmentation Problem in Decision Tree Induction , 1997, ECML.
[96] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.
[97] Trevor J. Hastie,et al. Discriminative vs Informative Learning , 1997, KDD.
[98] Anthony C. Davison,et al. Bootstrap Methods and Their Application , 1998 .
[99] N. J. Cerf,et al. Entropic Bell inequalities , 1997 .
[100] Eduardo Perez. Learning despite complex attribute interaction: an approach based on relational operators , 1997 .
[101] I. Csiszár. Information theoretic methods in probability and statistics , 1997, Proceedings of IEEE International Symposium on Information Theory.
[102] John D. Lafferty,et al. Inducing Features of Random Fields , 1995, IEEE Trans. Pattern Anal. Mach. Intell..
[103] Andrew McCallum,et al. A comparison of event models for naive bayes text classification , 1998, AAAI 1998.
[104] P. Grünwald. The Minimum Description Length Principle and Reasoning under Uncertainty , 1998 .
[105] Dunja Mladenic,et al. Machine Learning on non-homogeneous, distributed text data , 1998 .
[106] H. Joe. Multivariate models and dependence concepts , 1998 .
[107] M. Studený,et al. The Multiinformation Function as a Tool for Measuring Stochastic Dependence , 1998, Learning in Graphical Models.
[108] Thorsten Joachims,et al. Making large scale SVM learning practical , 1998 .
[109] Ian H. Witten,et al. Using a Permutation Test for Attribute Selection in Decision Trees , 1998, ICML.
[110] Geoffrey E. Hinton,et al. A View of the Em Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.
[111] David R. Anderson,et al. Model Selection and Multimodel Inference , 2003 .
[112] David A. Bell,et al. Learning Bayesian networks from data: An information-theory based approach , 2002, Artif. Intell..
[113] Huan Liu,et al. Fragmentation problem and automated feature construction , 1998, Proceedings Tenth IEEE International Conference on Tools with Artificial Intelligence (Cat. No.98CH36294).
[114] Alexander J. Smola,et al. Learning with kernels , 1998 .
[115] D. Margolis,et al. Validation of a melanoma prognostic model. , 1998, Archives of dermatology.
[116] Gene H. Golub,et al. Tikhonov Regularization and Total Least Squares , 1999, SIAM J. Matrix Anal. Appl..
[117] Gregory F. Cooper,et al. A Bayesian Network Classifier that Combines a Finite Mixture Model and a Naive Bayes Model , 1999, UAI.
[118] B. Streitberg. Exploring interactions in high-dimensional tables: a bootstrap alternative to log-linear models , 1999 .
[119] Y. Benjamini,et al. Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics , 1999 .
[120] Michel Grabisch,et al. An axiomatic approach to the concept of interaction among players in cooperative games , 1999, Int. J. Game Theory.
[121] Martin Theus,et al. Visualizing Loglinear Models , 1999 .
[122] Charles R. Meyer,et al. Multi-variate Mutual Information for Registration , 1999, MICCAI.
[123] F. Matúš,et al. Conditional Independences among Four Random Variables III: Final Conclusion , 1999 .
[124] Adrian E. Raftery,et al. Bayesian model averaging: a tutorial (with comments by M. Clyde, David Draper and E. I. George, and a rejoinder by the authors) , 1999 .
[125] Murray A. Jorgensen,et al. Theory & Methods: Mixture model clustering using the MULTIMIX program , 1999 .
[126] Ivan Bratko,et al. Learning by Discovering Concept Hierarchies , 1999, Artif. Intell..
[127] K. T. Poole,et al. Nonparametric Unfolding of Binary Choice Data , 2000, Political Analysis.
[128] Michael I. Jordan,et al. Learning with Mixtures of Trees , 2001, J. Mach. Learn. Res..
[129] J. Leeuw. Applications of Convex Analysis to Multidimensional Scaling , 2000 .
[130] Shun-ichi Amari,et al. Methods of information geometry , 2000 .
[131] J. Pearl. Causality: Models, Reasoning and Inference , 2000 .
[132] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[133] G Tononi,et al. Theoretical neuroanatomy: relating anatomical and functional connectivity in graphs and cortical connection matrices. , 2000, Cerebral cortex.
[134] Tommi S. Jaakkola,et al. Tractable Bayesian learning of tree belief networks , 2000, Stat. Comput..
[135] Mark A. Hall,et al. Correlation-based Feature Selection for Discrete and Numeric Class Machine Learning , 1999, ICML.
[136] M. Escobar,et al. Markov Chain Sampling Methods for Dirichlet Process Mixture Models , 2000 .
[137] W. Freeman,et al. Generalized Belief Propagation , 2000, NIPS.
[138] Naftali Tishby,et al. The information bottleneck method , 2000, ArXiv.
[139] Samuel Kaski,et al. Metrics that learn relevance , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.
[140] Matsuda,et al. Physical nature of higher-order mutual information: intrinsic correlations and frustration , 2000, Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics.
[141] William Bialek,et al. Synergy in a Neural Code , 2000, Neural Computation.
[142] Stephen J. Roberts,et al. Maximum certainty data partitioning , 2000, Pattern Recognit..
[143] Geoffrey J. McLachlan,et al. Finite Mixture Models , 2019, Annual Review of Statistics and Its Application.
[144] Rish,et al. An analysis of data characteristics that affect naive Bayes performance , 2001 .
[145] Gal Chechik,et al. Group Redundancy Measures Reveal Redundancy Reduction in the Auditory Pathway , 2001, NIPS.
[146] Nolan McCarty,et al. The Hunt for Party Discipline in Congress , 2001, American Political Science Review.
[147] Peter Harremoës,et al. Maximum Entropy Fundamentals , 2001, Entropy.
[148] Francesco M. Malvestuto,et al. An implementation of the iterative proportional fitting procedure by propagation trees , 2001 .
[149] Koby Crammer,et al. On the Algorithmic Implementation of Multiclass Kernel-based Vector Machines , 2002, J. Mach. Learn. Res..
[150] J R Beck,et al. Predicting Patient’s Long-Term Clinical Status after Hip Arthroplasty Using Hierarchical Decision Modelling and Data Mining , 2001, Methods of Information in Medicine.
[151] Ariel Caticha. Maximum entropy, fluctuations and priors , 2001 .
[152] Stephen D. Bay. Multivariate Discretization for Set Mining , 2001, Knowledge and Information Systems.
[153] Andrew McCallum,et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.
[154] Aki Vehtari. Discussion to "Bayesian measures of model complexity and fit" by Spiegelhalter, D.J., Best, N.G., Carlin, B.P., and van der Linde, A. , 2002 .
[155] Naftali Tishby,et al. Unsupervised document classification using sequential information maximization , 2002, SIGIR '02.
[156] Wray L. Buntine. Variational Extensions to EM and Multinomial PCA , 2002, ECML.
[157] H. Belloc. The Free Press , 2002 .
[158] Charles X. Ling,et al. The Representational Power of Discrete Bayesian Networks , 2002, J. Mach. Learn. Res..
[159] Eamonn J. Keogh,et al. Learning the Structure of Augmented Bayesian Classifiers , 2002, Int. J. Artif. Intell. Tools.
[160] Aleks Jakulin. Attribute interactions in machine learning : master's thesis , 2002 .
[161] Raymond W. Yeung,et al. A First Course in Information Theory , 2002 .
[162] Henry Tirri,et al. B-Course: A Web-Based Tool for Bayesian and Causal Data Analysis , 2002, Int. J. Artif. Intell. Tools.
[163] Rob Malouf,et al. A Comparison of Algorithms for Maximum Entropy Parameter Estimation , 2002, CoNLL.
[164] X. Jin. Factor graphs and the Sum-Product Algorithm , 2002 .
[165] Rosa Meo. Maximum independence and mutual information , 2002, IEEE Trans. Inf. Theory.
[166] Henry Etzkowitz,et al. The Triple Helix of University - Industry - Government , 2002 .
[167] William H. Press,et al. Numerical recipes in C , 2002 .
[168] Bradley P. Carlin,et al. Bayesian measures of model complexity and fit , 2002 .
[169] László Orlóci,et al. Biodiversity analysis: issues, concepts, techniques , 2002 .
[170] V. Vedral. The role of relative entropy in quantum information theory , 2001, quant-ph/0102094.
[171] H. Lynn. Suppression and Confounding in Action , 2003 .
[172] Matjaz Kukar,et al. Drifting Concepts as Hidden Factors in Clinical Studies , 2003, AIME.
[173] Nicolette de Keizer,et al. Integrating classification trees with local logistic regression in Intensive Care prognosis , 2003, Artif. Intell. Medicine.
[174] Thomas Wennekers,et al. Spatial and Temporal Stochastic Interaction in Neuronal Assemblies , 2003 .
[175] A. J. Bell. THE CO-INFORMATION LATTICE , 2003 .
[176] Ivan Bratko,et al. Attribute Interactions in Medical Data Analysis , 2003, AIME.
[177] Marina Meila,et al. Comparing Clusterings by the Variation of Information , 2003, COLT.
[178] Ramón López de Mántaras,et al. Tractable Bayesian Learning of Tree Augmented Naive Bayes Models , 2003, ICML.
[179] Interaktivna interakcijska analiza , 2003 .
[180] Ivan Kojadinovic,et al. Modeling interaction phenomena using fuzzy measures: on the notions of interaction and independence , 2003, Fuzzy Sets Syst..
[181] Aleks Jakulin,et al. Attribute Interactions in Machine Learning , 2003 .
[182] Ivan Bratko,et al. Analyzing Attribute Dependencies , 2003, PKDD.
[183] Eibe Frank,et al. Logistic Model Trees , 2003, ECML.
[184] John D. Storey. The positive false discovery rate: a Bayesian interpretation and the q-value , 2003 .
[185] Ricardo Vilalta,et al. A Decomposition of Classes via Clustering to Explain and Improve Naive Bayes , 2003, ECML.
[186] Michael J. Berry,et al. Network information and connected correlations. , 2003, Physical review letters.
[187] Ivan Bratko,et al. Quantifying and Visualizing Attribute Interactions: An Approach Based on Entropy , 2003 .
[188] Cynthia A. Brewer,et al. ColorBrewer in Print: A Catalog of Color Schemes for Maps , 2003 .
[189] Nello Cristianini,et al. Learning the Kernel Matrix with Semidefinite Programming , 2002, J. Mach. Learn. Res..
[190] Contents , 2015, Neurobiology of Aging.
[191] Xiaojin Zhu,et al. Kernel conditional random fields: representation and clique selection , 2004, ICML.
[192] David W. Aha,et al. A Review and Empirical Evaluation of Feature Weighting Methods for a Class of Lazy Learning Algorithms , 1997, Artificial Intelligence Review.
[193] Bin Ma,et al. The similarity metric , 2001, IEEE Transactions on Information Theory.
[194] Samuel Kaski,et al. Sequential information bottleneck for finite data , 2004, ICML.
[195] Pedro Larrañaga,et al. Learning Recursive Bayesian Multinets for Data Clustering by Means of Constructive Induction , 2002, Machine Learning.
[196] Herbert K. H. Lee,et al. Lossless Online Bayesian Bagging , 2004, J. Mach. Learn. Res..
[197] Thomas D. Nielsen,et al. Latent variable discovery in classification models , 2004, Artif. Intell. Medicine.
[198] Vasant Honavar,et al. Generation of attribute value taxonomies from data for data-driven construction of accurate and compact classifiers , 2004, Fourth IEEE International Conference on Data Mining (ICDM'04).
[199] Shaul Markovitch,et al. Lookahead-based algorithms for anytime induction of decision trees , 2004, ICML.
[200] D. Ruppert. The Elements of Statistical Learning: Data Mining, Inference, and Prediction , 2004 .
[201] Aleks Jakulin,et al. Applying Discrete PCA in Data Analysis , 2004, UAI.
[202] L. Leydesdorff,et al. The Triple Helix of university-industry-government relations , 2003, Scientometrics.
[203] Alex Alves Freitas,et al. Understanding the Crucial Role of Attribute Interaction in Data Mining , 2001, Artificial Intelligence Review.
[204] P. Kantor. Foundations of Statistical Natural Language Processing , 2001, Information Retrieval.
[205] Pedro M. Domingos,et al. Learning Bayesian network classifiers by maximizing conditional likelihood , 2004, ICML.
[206] Rich Caruana,et al. Ensemble selection from libraries of models , 2004, ICML.
[207] Pedro M. Domingos,et al. On the Optimality of the Simple Bayesian Classifier under Zero-One Loss , 1997, Machine Learning.
[208] David R. Brillinger,et al. Some data analyses using mutual information , 2004 .
[209] Hanna M. Wallach,et al. Conditional Random Fields: An Introduction , 2004 .
[210] Blaz Zupan,et al. Orange: From Experimental Machine Learning to Interactive Data Mining , 2004, PKDD.
[211] Marko Robnik-Sikonja,et al. Overcoming the Myopia of Inductive Learning Algorithms with RELIEFF , 2004, Applied Intelligence.
[212] Ilya Nemenman. Information theory, multivariate dependence, and genetic network inference , 2004, ArXiv.
[213] Thomas Hofmann,et al. Support vector machine learning for interdependent and structured output spaces , 2004, ICML.
[214] Stefan Kramer,et al. Ensembles of nested dichotomies for multi-class problems , 2004, ICML.
[215] Y. Mansour,et al. Generalization bounds for averaged classifiers , 2004, math/0410092.
[216] A. Dawid,et al. Game theory, maximum entropy, minimum discrepancy and robust Bayesian decision theory , 2004, math/0410076.
[217] Gerhard Widmer,et al. Learning in the presence of concept drift and hidden contexts , 2004, Machine Learning.
[218] Ivo Düntsch,et al. On Model Evaluation, Indexes of Importance, and Interaction Values in Rough Set Analysis , 2004, Rough-Neural Computing: Techniques for Computing with Words.
[219] Robert Tibshirani,et al. The Entire Regularization Path for the Support Vector Machine , 2004, J. Mach. Learn. Res..
[220] J. Ross Quinlan,et al. Induction of Decision Trees , 1986, Machine Learning.
[221] R. Baierlein. Probability Theory: The Logic of Science , 2004 .
[222] Thomas Hofmann,et al. Exponential Families for Conditional Random Fields , 2004, UAI.
[223] Nir Friedman,et al. Bayesian Network Classifiers , 1997, Machine Learning.
[224] F. Fleuret. Fast Binary Feature Selection with Conditional Mutual Information , 2004, J. Mach. Learn. Res..
[225] Anna Goldenberg,et al. Tractable learning of large Bayes net structures from sparse data , 2004, ICML.
[226] Marko Robnik-Sikonja,et al. Theoretical and Empirical Analysis of ReliefF and RReliefF , 2003, Machine Learning.
[227] I. Bratko,et al. Information-based evaluation criterion for classifier's performance , 2004, Machine Learning.
[228] Takeo Kanade,et al. Maximum Entropy for Collaborative Filtering , 2004, UAI.
[229] Aleks Jakulin. Modelling Modelled , 2004 .
[230] Ivan Bratko,et al. Testing the significance of attribute interactions , 2004, ICML.
[231] E. Luciano,et al. Copula methods in finance , 2004 .
[232] David J. C. MacKay,et al. Information Theory, Inference, and Learning Algorithms , 2004, IEEE Transactions on Information Theory.
[233] Andrew Gelman,et al. Standard Voting Power Indexes Do Not Work: An Empirical Analysis , 2002, British Journal of Political Science.
[234] Peter Cheeseman,et al. On The Relationship between Bayesian and Maximum Entropy Inference , 2004 .
[235] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[236] Marco Zaffalon,et al. Distribution of mutual information from complete and incomplete data , 2004, Comput. Stat. Data Anal..
[237] D. Haussler,et al. Boolean Feature Discovery in Empirical Learning , 1990, Machine Learning.
[238] Douglas H. Fisher,et al. Knowledge Acquisition Via Incremental Conceptual Clustering , 1987, Machine Learning.
[239] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[240] William T. Freeman,et al. Constructing free-energy approximations and generalized belief propagation algorithms , 2005, IEEE Transactions on Information Theory.
[241] Ramón López de Mántaras,et al. A distance-based attribute selection measure for decision tree induction , 1991, Machine Learning.
[242] Tony Jebara,et al. Machine learning: Discriminative and generative , 2006 .
[243] Sang Joon Kim,et al. A Mathematical Theory of Communication , 2006 .
[244] Nicholas Eriksson,et al. Polyhedral conditions for the nonexistence of the MLE for hierarchical log-linear models , 2006, J. Symb. Comput..
[245] Emden R. Gansner,et al. Drawing graphs with dot , 2006 .
[246] Persi Diaconis,et al. Dynamical Bias in the Coin Toss , 2007 .
[247] Flemming Topsøe,et al. Information Theory at the Service of Science , 2007 .