Machine Learning and Data Mining in Pattern Recognition

The paper centers on an algebraic approach to the notion of entropy and to several related concepts: conditional entropy, gain, and the metrics on sets of partitions that entropic approaches produce. Partitions arise naturally in major data mining problems such as classification, clustering, data quality evaluation, and data preparation. These areas benefit from an algebraic and geometric study of metric structures defined on partitions. We discuss data mining techniques that use metrics defined on sets of partitions of finite sets, such as decision tree induction, feature selection, and data discretization.
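As a rough illustration of the quantities named above, the following sketch computes the entropy of a partition, the conditional entropy of one partition given another, the gain, and an entropy-based distance between two partitions of the same finite set. It assumes Shannon entropy with base-2 logarithms, which may be narrower than the algebraic definition the paper develops. Representing a partition as a list of block labels (element `i` belongs to block `labels[i]`) and all function names are assumptions made for this example, not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of the partition encoded by block labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def conditional_entropy(a, b):
    """H(a | b) = H(a ^ b) - H(b), where a ^ b is the common refinement."""
    joint = list(zip(a, b))  # blocks of the common refinement of a and b
    return entropy(joint) - entropy(b)


def gain(target, attribute):
    """Reduction in the entropy of `target` once `attribute` is known."""
    return entropy(target) - conditional_entropy(target, attribute)


def partition_distance(a, b):
    """Entropy-based distance d(a, b) = H(a | b) + H(b | a)."""
    return conditional_entropy(a, b) + conditional_entropy(b, a)


# Two partitions of a four-element set that cut it in "orthogonal" ways.
p = [0, 0, 1, 1]
q = [0, 1, 0, 1]
print(entropy(p), gain(p, q), partition_distance(p, q))
```

In a decision-tree setting, `target` would be the partition induced by the class attribute and `attribute` the partition induced by a candidate splitting attribute; a split could then be chosen by largest gain or by smallest distance to the class partition.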
