Journeys to Data Mining: Experiences from 15 Renowned Researchers
[1] Geoffrey J. McLachlan,et al. A note on the choice of a weighting function to give an efficient method for estimating the probability of misclassification , 1977, Pattern Recognit..
[2] G. J. McLachlan,et al. The classification and mixture maximum likelihood approaches to cluster analysis , 1982, Classification, Pattern Recognition and Reduction of Dimensionality.
[3] Geoffrey J. McLachlan,et al. Analyzing Microarray Gene Expression Data , 2004 .
[4] Ran Wolff,et al. Distributed Decision-Tree Induction in Peer-to-Peer Systems , 2008 .
[5] Yehuda Koren,et al. The BellKor solution to the Netflix Prize , 2007 .
[6] Geoffrey J. McLachlan,et al. Extension of the mixture of factor analyzers model to incorporate the multivariate t-distribution , 2007, Comput. Stat. Data Anal..
[7] J. S. Marron,et al. Geometric representation of high dimension, low sample size data , 2005 .
[8] G. McLachlan. Estimation of the Errors of Misclassification on the Criterion of Asymptotic Mean Square Error , 1974 .
[9] Geoffrey J. McLachlan,et al. Mixtures of Factor Analyzers , 2000, International Conference on Machine Learning.
[10] Qi Wang,et al. On the privacy preserving properties of random data perturbation techniques , 2003, Third IEEE International Conference on Data Mining.
[11] Geoffrey J. McLachlan,et al. On a general method for matrix factorisation applied to supervised classification , 2009, 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshop.
[12] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .
[13] Trevor Hastie,et al. Regularized linear discriminant analysis and its application in microarrays. , 2007, Biostatistics.
[14] Gene H. Golub,et al. Matrix computations , 1983 .
[15] Kui Wang,et al. A Mixture model with random-effects components for clustering correlated gene-expression profiles , 2006, Bioinform..
[16] G. McLachlan. Iterative Reclassification Procedure for Constructing An Asymptotically Optimal Rule of Allocation in Discriminant-Analysis , 1975 .
[17] G. McLachlan. Estimating the Linear Discriminant Function from Initial Samples Containing a Small Number of Unclassified Observations , 1977 .
[18] G. McLachlan,et al. The efficiency of a linear discriminant function based on unclassified initial samples , 1978 .
[19] The errors of allocation and their estimators in the two-population discrimination problem , 1973, Bulletin of the Australian Mathematical Society.
[20] Geoffrey J. McLachlan,et al. Modelling high-dimensional data by mixtures of factor analyzers , 2003, Comput. Stat. Data Anal..
[21] Geoffrey J. McLachlan,et al. A simple implementation of a normal mixture approach to differential gene expression in multiclass microarrays , 2006, Bioinform..
[22] Wojtek Kowalczyk,et al. Towards Data Mining in Large and Fully Distributed Peer-to-Peer Overlay Networks , 2003 .
[23] G. J. McLachlan,et al. Correcting for selection bias via cross-validation in the classification of microarray data , 2008, 0805.2501.
[24] R. Tibshirani,et al. Least angle regression , 2004, math/0406456.
[25] G. McLachlan. The relationship in terms of asymptotic mean square error between the separate problems of estimating each of the three types of error rate of the linear discriminant function , 1974 .
[26] F. Marriott. The interpretation of multiple observations , 1974 .
[27] Geoffrey E. Hinton,et al. Modeling the manifolds of images of handwritten digits , 1997, IEEE Trans. Neural Networks.
[28] Jason Weston,et al. Gene Selection for Cancer Classification using Support Vector Machines , 2002, Machine Learning.
[29] Mark J. van der Laan,et al. Statistics Ready for a Revolution: Next Generation of Statisticians Must Build Tools for Massive Data Sets , 2010 .
[30] Theofanis Sapatinas,et al. Discriminant Analysis and Statistical Pattern Recognition , 2005 .
[31] Tom M Mitchell,et al. Mining Our Reality , 2009, Science.
[32] Vladimir Nikulin,et al. Penalized Principal Component Analysis of Microarray Data , 2009, CIBB.
[33] Hillol Kargupta,et al. Distributed probabilistic inferencing in sensor networks using variational approximation , 2008, J. Parallel Distributed Comput..
[34] Ilker Hamzaoglu,et al. Scalable, Distributed Data Mining - An Agent Architecture , 1997, KDD.
[35] Yoshua Bengio,et al. Pattern Recognition and Neural Networks , 1995 .
[36] I. Johnstone,et al. On Consistency and Sparsity for Principal Components Analysis in High Dimensions , 2009, Journal of the American Statistical Association.
[37] G. McLachlan. The bias of the apparent error rate in discriminant analysis , 1976 .
[38] Kun Liu,et al. Distributed Identification of Top-l Inner Product Elements and its Application in a Peer-to-Peer Network , 2008, IEEE Transactions on Knowledge and Data Engineering.
[39] Elie Bienenstock,et al. Neural Networks and the Bias/Variance Dilemma , 1992, Neural Computation.
[40] Jianqing Fan,et al. Sure independence screening for ultrahigh dimensional feature space , 2006, math/0612857.
[41] D. Hand,et al. Idiot's Bayes—Not So Stupid After All? , 2001 .
[42] R. Tibshirani,et al. Empirical bayes methods and false discovery rates for microarrays , 2002, Genetic epidemiology.
[43] Kirk D. Borne,et al. PADMINI: A Peer-to-Peer Distributed Astronomy Data Mining System and a Case Study , 2010, CIDU.
[44] G. McLachlan. Asymptotic Results for Discriminant Analysis When the Initial Samples are Misclassified , 1972 .
[45] Ujjwal Maulik,et al. Clustering distributed data streams in peer-to-peer environments , 2006, Inf. Sci..
[46] Kun Liu,et al. A Survey of Attack Techniques on Privacy-Preserving Data Perturbation Methods , 2008, Privacy-Preserving Data Mining.
[47] Kun Liu,et al. Random projection-based multiplicative data perturbation for privacy preserving distributed data mining , 2006, IEEE Transactions on Knowledge and Data Engineering.
[48] Geoffrey J. McLachlan,et al. A Very Fast Algorithm for Matrix Factorization , 2010, ArXiv.
[49] M. Kenward,et al. Contribution to the discussion of the paper by Diggle, Tawn and Moyeed , 1998 .
[50] Hillol Kargupta,et al. A Scalable Local Algorithm for Distributed Multivariate Regression , 2008, Stat. Anal. Data Min..
[51] G T Toussaint,et al. An efficient method for estimating the probability of misclassification applied to a problem in medical diagnosis. , 1975, Computers in biology and medicine.
[52] J. Friedman. Regularized Discriminant Analysis , 1989 .
[53] Hillol Kargupta,et al. A Game Theoretic Approach toward Multi-Party Privacy-Preserving Distributed Data Mining (TR-CS_01_07) , 2007 .
[54] J. Ross Quinlan,et al. Induction of Decision Trees , 1986, Machine Learning.
[55] Philip K. Chan,et al. Advances in Distributed and Parallel Knowledge Discovery , 2000 .
[56] R. Fisher. THE USE OF MULTIPLE MEASUREMENTS IN TAXONOMIC PROBLEMS , 1936 .
[57] M. R. Mickey,et al. Estimation of Error Rates in Discriminant Analysis , 1968 .
[58] Hillol Kargupta,et al. Toward ubiquitous mining of distributed data , 2001, SPIE Defense + Commercial Sensing.
[59] Christophe Ambroise,et al. Selection bias in working with the top genes in supervised classification of tissue samples , 2006 .
[60] Terence J. O'Neill. Normal Discrimination with Unclassified Observations , 1978 .
[61] G. McLachlan,et al. The EM algorithm and extensions , 1996 .
[62] P. Hall,et al. Tilting methods for assessing the influence of components in a classifier , 2009 .
[63] Trevor Hastie,et al. Neural Networks and Related Methods for Classification - Discussion , 1994 .
[64] R. Tibshirani,et al. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. , 2009, Biostatistics.
[65] H. Sebastian Seung,et al. Learning the parts of objects by non-negative matrix factorization , 1999, Nature.
[66] Jianqing Fan,et al. High Dimensional Classification Using Features Annealed Independence Rules. , 2007, Annals of statistics.
[67] Michael F Ochs,et al. Matrix factorization for recovery of biological processes from microarray data. , 2009, Methods in enzymology.
[68] D. N. Geary. Mixture Models: Inference and Applications to Clustering , 1989 .
[69] G. McLachlan. An Asymptotic Unbiased Technique for Estimating the Error Rates in Discriminant Analysis , 1974 .
[70] Shili Lin,et al. Feature selection in finite mixture of sparse normal linear models in high-dimensional feature space. , 2011, Biostatistics.
[71] A. Raftery,et al. Variable Selection for Model-Based Clustering , 2006 .
[72] Yelena Yesha,et al. Data Mining: Next Generation Challenges and Future Directions , 2004 .
[73] Geoffrey J. McLachlan,et al. Robust Cluster Analysis via Mixtures of Multivariate t-Distributions , 1998, SSPR/SPR.
[74] M. Hills. Allocation Rules and Their Error Rates , 1966 .
[75] Peter Adams,et al. The EMMIX software for the fitting of mixtures of normal and t-components , 1999 .
[76] Daniel Q. Naiman,et al. Classifying Gene Expression Profiles from Pairwise mRNA Comparisons , 2004, Statistical applications in genetics and molecular biology.
[77] Jianqing Fan,et al. A Selective Overview of Variable Selection in High Dimensional Feature Space. , 2009, Statistica Sinica.
[78] Geoffrey J. McLachlan,et al. Using the EM algorithm to train neural networks: misconceptions and a new algorithm for multiclass classification , 2004, IEEE Transactions on Neural Networks.
[79] Geoffrey J. McLachlan,et al. Robust mixture modelling using the t distribution , 2000, Stat. Comput..
[80] Jill P. Mesirov,et al. Automated High-Dimensional Flow Cytometric Data Analysis , 2010, RECOMB.
[81] Trevor Hastie,et al. Class Prediction by Nearest Shrunken Centroids, with Applications to DNA Microarrays , 2003 .
[82] Haimonti Dutta,et al. TagLearner: A P2P Classifier Learning System from Collaboratively Tagged Text Documents , 2009, 2009 IEEE International Conference on Data Mining Workshops.
[83] Hillol Kargupta,et al. Approximate Distributed K-Means Clustering over a Peer-to-Peer Network , 2009, IEEE Transactions on Knowledge and Data Engineering.