SUPERVISED CLUSTERING WITH STRUCTURAL

[1]  S. C. Johnson.  Hierarchical clustering schemes , 1967, Psychometrika.

[2]  Bogdan Gabrys,et al.  Combining labelled and unlabelled data in the design of pattern classification systems , 2004, Int. J. Approx. Reason..

[3]  Adrian E. Raftery,et al.  Model-Based Clustering, Discriminant Analysis, and Density Estimation , 2002 .

[4]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[5]  Peter Haider,et al.  Supervised clustering of streaming data for email batch detection , 2007, ICML '07.

[6]  Claudio Gentile,et al.  Hierarchical classification: combining Bayes with SVM , 2006, ICML.

[7]  Rahul Gupta,et al.  Accurate max-margin training for structured output spaces , 2008, ICML '08.

[8]  Philip S. Yu,et al.  On the merits of building categorization systems by supervised clustering , 1999, KDD '99.

[9]  Dale Schuurmans,et al.  Semi-Supervised Conditional Random Fields for Improved Sequence Segmentation and Labeling , 2006, ACL.

[10]  Ben Taskar,et al.  An End-to-End Discriminative Approach to Machine Translation , 2006, ACL.

[11]  Thomas Hofmann,et al.  Large Margin Methods for Structured and Interdependent Output Variables , 2005, J. Mach. Learn. Res..

[12]  Mikhail Belkin,et al.  Maximum Margin Semi-Supervised Learning for Structured Variables , 2005, NIPS 2005.

[13]  Dan Roth,et al.  The Use of Classifiers in Sequential Inference , 2001, NIPS.

[14]  R. Zemel,et al.  Multiscale conditional random fields for image labeling , 2004, CVPR 2004.

[15]  Hwee Tou Ng,et al.  A Machine Learning Approach to Coreference Resolution of Noun Phrases , 2001, CL.

[16]  Andrew McCallum,et al.  Maximum Entropy Markov Models for Information Extraction and Segmentation , 2000, ICML.

[17]  William W. Cohen,et al.  Learning to Match and Cluster Entity Names , 2001 .

[18]  Daniel Marcu,et al.  A Bayesian Model for Supervised Clustering with the Dirichlet Process Prior , 2005, J. Mach. Learn. Res..

[19]  Ben Taskar,et al.  Learning associative Markov networks , 2004, ICML.

[20]  Daniel Marcu,et al.  Practical structured learning techniques for natural language processing , 2006 .

[21]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[22]  Michael I. Jordan,et al.  Learning Spectral Clustering, With Application To Speech Separation , 2006, J. Mach. Learn. Res..

[23]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[24]  Jason Weston,et al.  A kernel method for multi-labelled classification , 2001, NIPS.

[25]  Claire Cardie,et al.  Noun Phrase Coreference as Clustering , 1999, EMNLP.

[26]  Toshihiro Kamishima,et al.  Learning from Cluster Examples , 2003, Machine Learning.

[27]  Martial Hebert,et al.  Discriminative random fields: a discriminative framework for contextual interaction in classification , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[28]  Thorsten Joachims,et al.  Predicting diverse subsets using structural SVMs , 2008, ICML '08.

[29]  Yiming Yang,et al.  RCV1: A New Benchmark Collection for Text Categorization Research , 2004, J. Mach. Learn. Res..

[30]  J. C. Dunn,et al.  A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters , 1973 .

[31]  Thorsten Joachims,et al.  Training linear SVMs in linear time , 2006, KDD '06.

[32]  Andrew McCallum,et al.  Fast, Piecewise Training for Discriminative Finite-state and Parsing Models , 2005 .

[33]  Dan Roth,et al.  Probabilistic Reasoning for Entity & Relation Recognition , 2002, COLING.

[34]  Endre Boros,et al.  Pseudo-Boolean optimization , 2002, Discret. Appl. Math..

[35]  Philippe Rigollet,et al.  Generalization Error Bounds in Semi-supervised Classification Under the Cluster Assumption , 2006, J. Mach. Learn. Res..

[36]  Filip Radlinski,et al.  A support vector method for optimizing average precision , 2007, SIGIR.

[37]  Inderjit S. Dhillon,et al.  Information-theoretic metric learning , 2007, ICML '07.

[38]  Thorsten Joachims,et al.  Learning to Align Sequences: A Maximum-Margin Approach , 2006 .

[39]  Gurmeet Singh,et al.  MRF's for MRI's: Bayesian Reconstruction of MR Images via Graph Cuts , 2006, CVPR.

[40]  Dan Roth,et al.  Integer linear programming inference for conditional random fields , 2005, ICML.

[41]  Chaitanya Swamy,et al.  Correlation Clustering: maximizing agreements via semidefinite programming , 2004, SODA '04.

[42]  Andrew McCallum,et al.  First-Order Probabilistic Models for Coreference Resolution , 2007, NAACL.

[43]  Jon M Kleinberg,et al.  Hubs, authorities, and communities , 1999, CSUR.

[44]  Nello Cristianini,et al.  Efficiently Learning the Metric with Side-Information , 2003, ALT.

[45]  Marcel Worring,et al.  The challenge problem for automated detection of 101 semantic concepts in multimedia , 2006, MM '06.

[46]  Ben Taskar,et al.  Word Alignment via Quadratic Assignment , 2006, NAACL.

[47]  Thomas Hofmann,et al.  Support vector machine learning for interdependent and structured output spaces , 2004, ICML.

[48]  Jitendra Malik,et al.  A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[49]  Raymond J. Mooney,et al.  Integrating constraints and metric learning in semi-supervised clustering , 2004, ICML.

[50]  Inderjit S. Dhillon,et al.  Semi-supervised graph clustering: a kernel approach , 2005, ICML '05.

[51]  Thorsten Joachims,et al.  Support Vector Training of Protein Alignment Models , 2007, RECOMB.

[52]  Peter E. Hart,et al.  Nearest neighbor pattern classification , 1967, IEEE Trans. Inf. Theory.

[53]  Geoffrey E. Hinton.  Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.

[54]  Surajit Ray,et al.  A Nonparametric Statistical Approach to Clustering via Mode Identification , 2007, J. Mach. Learn. Res..

[55]  Ivor W. Tsang,et al.  Distance metric learning with kernels , 2003 .

[56]  Martial Hebert,et al.  Discriminative Fields for Modeling Spatial Dependencies in Natural Images , 2003, NIPS.

[57]  Michael I. Jordan,et al.  Learning Spectral Clustering , 2003, NIPS.

[58]  Thorsten Joachims,et al.  Supervised clustering with support vector machines , 2005, ICML.

[59]  Jiebo Luo,et al.  Learning multi-label scene classification , 2004, Pattern Recognit..

[60]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .

[61]  Thorsten Joachims,et al.  Error bounds for correlation clustering , 2005, ICML.

[62]  Alexander Zien,et al.  Semi-Supervised Classification by Low Density Separation , 2005, AISTATS.

[63]  Thorsten Joachims,et al.  Cutting-plane training of structural SVMs , 2009, Machine Learning.

[64]  Nicole Immorlica,et al.  Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques , 2003, Lecture Notes in Computer Science.

[65]  Vladimir Kolmogorov,et al.  An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision , 2004, IEEE Trans. Pattern Anal. Mach. Intell..

[66]  Fernando Pereira,et al.  Structured Learning with Approximate Inference , 2007, NIPS.

[67]  Thorsten Joachims,et al.  Training structural svms with kernels using sampled cuts , 2008, KDD.

[68]  Thomas Hofmann,et al.  Hidden Markov Support Vector Machines , 2003, ICML.

[69]  Raymond J. Mooney,et al.  A probabilistic framework for semi-supervised clustering , 2004, KDD.

[70]  Thorsten Joachims,et al.  Supervised k-Means Clustering , 2008 .

[71]  Michael Collins,et al.  Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms , 2002, EMNLP.

[72]  Nathan Srebro,et al.  SVM optimization: inverse dependence on training set size , 2008, ICML '08.

[73]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[74]  Hinrich Schütze,et al.  Book Reviews: Foundations of Statistical Natural Language Processing , 1999, CL.

[75]  Ulf Brefeld,et al.  Semi-supervised learning for structured output variables , 2006, ICML.

[76]  G. Rota.  The Number of Partitions of a Set , 1964 .

[77]  Tom M. Mitchell,et al.  Learning to Extract Symbolic Knowledge from the World Wide Web , 1998, AAAI/IAAI.

[78]  Andrew McCallum,et al.  Efficient clustering of high-dimensional data sets with application to reference matching , 2000, KDD '00.

[79]  Dan Roth.  Reasoning with Classifiers , 2002, ECML.

[80]  Lynette Hirschman,et al.  A Model-Theoretic Coreference Scoring Scheme , 1995, MUC.

[81]  Andrew McCallum,et al.  Semi-Supervised Clustering with User Feedback , 2003 .

[82]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[83]  Thorsten Joachims,et al.  A support vector method for multivariate performance measures , 2005, ICML.

[84]  Dan Roth,et al.  The Necessity of Syntactic Parsing for Semantic Role Labeling , 2005, IJCAI.

[85]  Ben Taskar,et al.  Max-Margin Markov Networks , 2003, NIPS.

[86]  Matthias Seeger,et al.  Learning from Labeled and Unlabeled Data , 2010, Encyclopedia of Machine Learning.

[87]  Dale Schuurmans,et al.  Maximum Margin Bayesian Networks , 2005, UAI.

[88]  Vladimir Kolmogorov,et al.  Minimizing Nonsubmodular Functions with Graph Cuts-A Review , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[89]  Inderjit S. Dhillon,et al.  Iterative clustering of high dimensional text data augmented by local search , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[90]  Claire Gardent,et al.  Improving Machine Learning Approaches to Coreference Resolution , 2002, ACL.

[91]  James C. Bezdek,et al.  Pattern Recognition with Fuzzy Objective Function Algorithms , 1981, Advanced Applications in Pattern Recognition.

[92]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[93]  Mark W. Schmidt,et al.  Accelerated training of conditional random fields with stochastic gradient methods , 2006, ICML.

[94]  Pierre Hansen,et al.  Roof duality, complementation and persistency in quadratic 0–1 optimization , 1984, Math. Program..

[95]  Ben Taskar,et al.  Discriminative learning of Markov random fields for segmentation of 3D scan data , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[96]  Nikhil Bansal,et al.  Correlation Clustering , 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings..

[97]  Ben Taskar,et al.  Alignment by Agreement , 2006, NAACL.

[98]  Andrew McCallum,et al.  Toward Conditional Models of Identity Uncertainty with Application to Proper Noun Coreference , 2003, IIWeb.

[99]  Claire Cardie,et al.  Proceedings of the Eighteenth International Conference on Machine Learning, 2001, p. 577–584. Constrained K-means Clustering with Background Knowledge , 2022 .

[100]  John Langford,et al.  Search-based structured prediction , 2009, Machine Learning.

[101]  Dan Roth,et al.  Learning and Inference over Constrained Output , 2005, IJCAI.

[102]  William M. Rand,et al.  Objective Criteria for the Evaluation of Clustering Methods , 1971 .

[103]  Nello Cristianini,et al.  Learning the Kernel Matrix with Semidefinite Programming , 2002, J. Mach. Learn. Res..