Modern aspects of unsupervised learning

[1]  Yi Li,et al.  Improved bounds on the sample complexity of learning , 2000, SODA '00.

[2]  Dan Feldman,et al.  An effective coreset compression algorithm for large scale sensor networks , 2012, 2012 ACM/IEEE 11th International Conference on Information Processing in Sensor Networks (IPSN).

[3]  Andrea Lancichinetti,et al.  Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. , 2009, Physical review. E, Statistical, nonlinear, and soft matter physics.

[4]  Shi Li,et al.  Approximating k-median via pseudo-approximation , 2012, STOC '13.

[5]  Sariel Har-Peled,et al.  Smaller Coresets for k-Median and k-Means Clustering , 2005, SCG.

[6]  Maria-Florina Balcan,et al.  Distributed Learning, Communication Complexity and Privacy , 2012, COLT.

[7]  Peter H. A. Sneath,et al.  Numerical Taxonomy: The Principles and Practice of Numerical Classification , 1973 .

[8]  Sanjeev Arora,et al.  Finding overlapping communities in social networks: toward a rigorous approach , 2011, EC '12.

[9]  Shai Ben-David,et al.  Measures of Clustering Quality: A Working Set of Axioms for Clustering , 2008, NIPS.

[10]  T. Snijders,et al.  Settings in Social Networks: A Measurement Model , 2003 .

[11]  M. Cosentino Lagomarsino,et al.  Hierarchy and feedback in the evolution of the Escherichia coli transcription network , 2007, Proceedings of the National Academy of Sciences.

[12]  Chris H. Q. Ding,et al.  K-means clustering via principal component analysis , 2004, ICML.

[13]  Maria-Florina Balcan,et al.  Efficient Semi-supervised and Active Learning of Disjunctions , 2013, ICML.

[14]  Jure Leskovec,et al.  Latent Multi-group Membership Graph Model , 2012, ICML.

[15]  Yingyu Liang,et al.  Distributed k-Means and k-Median Clustering on General Topologies , 2013, NIPS 2013.

[16]  Franklin T. Luk,et al.  Principal Component Analysis for Distributed Data Sets with Updating , 2005, APPT.

[17]  Benjamin King.  Step-Wise Clustering Procedures , 1967 .

[18]  Jeffrey Considine,et al.  Approximate aggregation techniques for sensor databases , 2004, Proceedings. 20th International Conference on Data Engineering.

[19]  Samir Khuller,et al.  Greedy strikes back: improved facility location algorithms , 1998, SODA '98.

[20]  Maria-Florina Balcan,et al.  Distributed Frank-Wolfe Algorithm: A Unified Framework for Communication-Efficient Sparse Learning , 2014, ArXiv.

[21]  Aravindan Vijayaraghavan,et al.  Bilu-Linial Stable Instances of Max Cut , 2013, arXiv.org.

[22]  Matús Mihalák,et al.  On the Complexity of the Metric TSP under Stability Considerations , 2011, SOFSEM.

[23]  Maria-Florina Balcan,et al.  Modeling and Detecting Community Hierarchies , 2013, SIMBAD.

[24]  Nathan Linial,et al.  Are Stable Instances Easy? , 2009, Combinatorics, Probability and Computing.

[25]  Maria-Florina Balcan,et al.  Clustering under Perturbation Resilience , 2011, SIAM J. Comput..

[26]  N. Samatova,et al.  Principal Component Analysis for Dimension Reduction in Massive Distributed Data Sets , 2002 .

[27]  Svetha Venkatesh,et al.  Distributed query processing for mobile surveillance , 2007, ACM Multimedia.

[28]  M. Newman,et al.  Mixing Patterns and Community Structure in Networks , 2002, cond-mat/0210146.

[29]  Le Song,et al.  Budgeted Influence Maximization for Multiple Products , 2013, 1312.2164.

[30]  Amin Saberi,et al.  A new greedy approach for facility location problems , 2002, STOC '02.

[31]  Sylvain Raybaud,et al.  Distributed Principal Component Analysis for Wireless Sensor Networks , 2008, Sensors.

[32]  John E. Hopcroft,et al.  Detecting the Structure of Social Networks Using (α, β)-Communities , 2011, WAW.

[33]  Maria-Florina Balcan,et al.  Distributed PCA and k-Means Clustering , 2013 .

[34]  Amit Kumar,et al.  Clustering with Spectral Norm and the k-Means Algorithm , 2010, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science.

[35]  Santosh S. Vempala,et al.  Nimble Algorithms for Cloud Computing , 2013, ArXiv.

[36]  Santosh S. Vempala,et al.  On clusterings-good, bad and spectral , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[37]  Kamesh Munagala,et al.  Local Search Heuristics for k-Median and Facility Location Problems , 2004, SIAM J. Comput..

[38]  Edoardo M. Airoldi,et al.  Mixed Membership Stochastic Blockmodels , 2007, NIPS.

[39]  Sanjeev Khanna,et al.  Power-conserving computation of order-statistics over sensor networks , 2004, PODS.

[40]  Sudipto Guha,et al.  A constant-factor approximation algorithm for the k-median problem (extended abstract) , 1999, STOC '99.

[41]  Hillol Kargupta,et al.  Distributed Clustering Using Collective Principal Component Analysis , 2001, Knowledge and Information Systems.

[42]  M E J Newman,et al.  Community structure in social and biological networks , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[43]  Shai Ben-David.  A Framework for Statistical Clustering with a Constant Time Approximation Algorithms for K-Median Clustering , 2004, COLT.

[44]  W. Zachary,et al.  An Information Flow Model for Conflict and Fission in Small Groups , 1977, Journal of Anthropological Research.

[45]  Mark Braverman,et al.  Finding Endogenously Formed Communities , 2012, SODA.

[46]  Maria-Florina Balcan,et al.  Robust hierarchical clustering , 2013, J. Mach. Learn. Res..

[47]  Pranjal Awasthi,et al.  Improved Spectral-Norm Bounds for Clustering , 2012, APPROX-RANDOM.

[48]  George Karypis,et al.  A Comparison of Document Clustering Techniques , 2000 .

[49]  Jennifer Widom,et al.  Adaptive filters for continuous queries over distributed data streams , 2003, SIGMOD '03.

[50]  Jon M. Kleinberg,et al.  An Impossibility Theorem for Clustering , 2002, NIPS.

[51]  M. Newman,et al.  Finding community structure in very large networks. , 2004, Physical review. E, Statistical, nonlinear, and soft matter physics.

[52]  O. de Weck,et al.  Overview of metrics and their correlation patterns for multiple-metric topology analysis on heterogeneous graph ensembles. , 2012, Physical review. E, Statistical, nonlinear, and soft matter physics.

[53]  Bin Zhang,et al.  Distributed data clustering can be efficient and exact , 2000, SKDD.

[54]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[55]  S. P. Lloyd,et al.  Least squares quantization in PCM , 1982, IEEE Trans. Inf. Theory.

[56]  Alexander J. Smola,et al.  Learning with kernels , 1998 .

[57]  A. Dress,et al.  Weak hierarchies associated with similarity measures--an additive clustering technique. , 1989, Bulletin of mathematical biology.

[58]  Dimitris K. Tasoulis,et al.  Unsupervised distributed clustering , 2004, Parallel and Distributed Computing and Networks.

[59]  Sanjoy Dasgupta,et al.  Learning mixtures of Gaussians , 1999, 40th Annual Symposium on Foundations of Computer Science.

[60]  A. Barabasi,et al.  Hierarchical Organization of Modularity in Metabolic Networks , 2002, Science.

[61]  Avishek Saha,et al.  Efficient Protocols for Distributed Classification and Optimization , 2012, ALT.

[62]  Jeff M. Phillips,et al.  Relative Errors for Deterministic Low-Rank Matrix Approximations , 2013, SODA.

[63]  Qi Zhang,et al.  Approximate Clustering on Distributed Data Streams , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[64]  Farhad Shahrokhi,et al.  Sparsest cuts and bottlenecks in graphs , 1990, Discret. Appl. Math..

[65]  Michael E. Saks,et al.  On the practically interesting instances of MAXCUT , 2012, STACS.

[66]  Sergio Valcarcel Macua,et al.  Consensus-based distributed principal component analysis in wireless sensor networks , 2010, 2010 IEEE 11th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC).

[67]  Mark Newman,et al.  Detecting community structure in networks , 2004 .

[68]  Avrim Blum,et al.  Stability Yields a PTAS for k-Median and k-Means Clustering , 2010, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science.

[69]  H. Kriegel,et al.  Towards Effective and Efficient Distributed Clustering , 2003 .

[70]  L. Schulman,et al.  Universal ε-approximators for integrals , 2010, SODA '10.

[71]  M E J Newman,et al.  Modularity and community structure in networks. , 2006, Proceedings of the National Academy of Sciences of the United States of America.

[72]  Christopher Frost,et al.  Spanner: Google's Globally-Distributed Database , 2012, OSDI.

[73]  Avrim Blum,et al.  Center-based clustering under perturbation stability , 2010, Inf. Process. Lett..

[74]  Aranyak Mehta,et al.  On Stability Properties of Economic Solution Concepts , 2006 .

[75]  Albert-László Barabási,et al.  Statistical mechanics of complex networks , 2001, ArXiv.

[76]  Shalev Ben-David,et al.  Data stability in clustering: A closer look , 2011, Theor. Comput. Sci..

[77]  Richard M. Karp,et al.  Algorithms for graph partitioning on the planted partition model , 2001, Random Struct. Algorithms.

[78]  Le Song,et al.  Influence Function Learning in Information Diffusion Networks , 2014, ICML.

[79]  David M. Mount,et al.  A local search approximation algorithm for k-means clustering , 2002, SCG '02.

[80]  Andrea Lancichinetti,et al.  Detecting the overlapping and hierarchical community structure in complex networks , 2008, 0802.1218.

[81]  Maria-Florina Balcan,et al.  Center Based Clustering: A Foundational Perspective , 2014 .

[82]  Maria-Florina Balcan,et al.  Approximate clustering without the approximation , 2009, SODA.

[83]  Santo Fortunato,et al.  Community detection in graphs , 2009, ArXiv.

[84]  Sariel Har-Peled,et al.  On coresets for k-means and k-median clustering , 2004, STOC '04.

[85]  Dan Feldman,et al.  Turning big data into tiny data: Constant-size coresets for k-means, PCA and projective clustering , 2013, SODA.

[86]  Michael Langberg,et al.  A unified framework for approximating and clustering data , 2011, STOC.

[87]  Moses Charikar,et al.  Approximating min-sum k-clustering in metric spaces , 2001, STOC '01.

[88]  M. Newman,et al.  Hierarchical structure and the prediction of missing links in networks , 2008, Nature.

[89]  Mark Braverman,et al.  Approximate Nash Equilibria under Stability Conditions , 2010, ArXiv.

[90]  Niklas Carlsson,et al.  Characterizing web-based video sharing workloads , 2009, WWW '09.

[91]  H. Kargupta,et al.  K-Means Clustering over Peer-to-peer Networks , 2005 .

[92]  Marek Karpinski,et al.  Approximation schemes for clustering problems , 2003, STOC '03.

[93]  Shang-Hua Teng,et al.  Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems , 2003, STOC '04.

[94]  Sergei Vassilvitskii,et al.  Scalable K-Means++ , 2012, Proc. VLDB Endow..

[95]  Sergei Vassilvitskii,et al.  k-means++: the advantages of careful seeding , 2007, SODA '07.

[96]  Ke Chen,et al.  On k-Median clustering in high dimensions , 2006, SODA '06.