Artificial Prediction Markets for Clustering

Many clustering algorithms exist for different purposes, but no general algorithm works well without taking the context into account. In other words, clustering is not an application-independent problem, so more flexible frameworks are needed to engineer new clustering algorithms for the problem at hand. One way to do this is to combine clustering algorithms, an approach known in the literature as consensus or ensemble clustering. This paper presents a framework, based on the prediction market mechanism, for online clustering by combining different clustering algorithms. In the real world, prediction markets are used to aggregate the wisdom of the crowd in order to predict the outcome of events such as presidential elections. An artificial prediction market is designed by applying this mechanism and treating clustering algorithms as agents, or market participants; clustering is thus viewed as a prediction problem. Besides working online, the proposed method provides flexibility in combining algorithms and helps track their performance in the market. Based on this framework, an algorithm for center-based clustering algorithms (such as k-means) is proposed. The first set of experiments shows the flexibility of the algorithm on synthetic datasets. The second set of experiments shows that the algorithm also works well on real-world datasets.
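The idea of treating center-based clusterers as market participants can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the betting or settlement rules, so everything below is an assumption made for illustration. In particular, the names `beliefs` and `market_step`, the softmax belief model, the constant-fraction stake `eta`, the learning rate `lr`, and the choice to settle each bet on the market's own most likely cluster are all hypothetical. The sketch also assumes that all agents share the same cluster indexing.

```python
import numpy as np

def beliefs(centers, x, temperature=1.0):
    """Soft assignment of x to each center: softmax over negative squared distances."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    logits = -d2 / temperature
    logits -= logits.max()  # shift for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def market_step(agent_centers, budgets, x, eta=0.1, lr=0.05):
    """Process one incoming point (assumed rules, not taken from the paper)."""
    # Each agent's belief over the k clusters, shape (n_agents, k)
    P = np.array([beliefs(c, x) for c in agent_centers])
    # Market price: budget-weighted average of the agents' beliefs
    price = budgets @ P / budgets.sum()
    # Assumed outcome: the cluster the market itself considers most likely
    winner = int(np.argmax(price))
    # Each agent stakes a fraction eta of its budget and is paid back in
    # proportion to belief / price; this keeps the total budget constant
    budgets = budgets * (1 - eta + eta * P[:, winner] / price[winner])
    # Online center update: move each agent's winning center toward x
    for c in agent_centers:
        c[winner] += lr * (x - c[winner])
    return winner, price, budgets

# Toy stream: two Gaussian blobs, three agents with increasingly noisy centers
rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [5.0, 5.0]])
agents = [true_centers + rng.normal(scale=s, size=true_centers.shape)
          for s in (0.1, 0.5, 2.0)]
budgets = np.ones(len(agents))
for _ in range(200):
    x = true_centers[rng.integers(2)] + rng.normal(scale=0.5, size=2)
    label, price, budgets = market_step(agents, budgets, x)
```

Under these assumptions the budgets act as a running record of how closely each agent agrees with the market, which is one way to read the abstract's remark about tracking the performance of algorithms in the market.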
