Mining Complex Data Generated by Collaborative Platforms

In a crowdsourcing project, several participants discuss and solve one common problem, propose their ideas, and evaluate each other's ideas. We propose a novel instrument, CrowDM, for analyzing data generated by collaborative platforms. The initial version of the system combines several techniques for structured and unstructured data analysis. Formal Concept Analysis, multimodal clustering, and association rule mining are the key instruments for identifying patterns in object-oriented data. Keyword and collocation extraction methods are also included for mining unstructured texts. We first describe the overall methodology underlying CrowDM and then showcase the results of initial experiments on data obtained from the company Witology.
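To make the Formal Concept Analysis component more concrete, the following is a minimal sketch of how formal concepts (maximal groups of objects sharing a maximal set of attributes) can be enumerated from a small binary context. It is not CrowDM's implementation: the toy data (`user_a`, `idea_1`, and so on) and all function names are hypothetical, and the brute-force enumeration is chosen for readability rather than efficiency.

```python
from itertools import combinations

# Hypothetical toy context: participants (objects) and the ideas they supported (attributes).
context = {
    "user_a": {"idea_1", "idea_2"},
    "user_b": {"idea_1", "idea_2", "idea_3"},
    "user_c": {"idea_3"},
}


def common_attributes(objects, context, all_attributes):
    """Attributes shared by every object in `objects` (all attributes if `objects` is empty)."""
    result = set(all_attributes)
    for obj in objects:
        result &= context[obj]
    return result


def objects_having(attributes, context):
    """Objects that possess every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of objects.

    Exponential in the number of objects, so suitable only for toy contexts.
    """
    all_attributes = set().union(*context.values())
    objects = sorted(context)
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context, all_attributes)
            extent = objects_having(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Each printed pair is a group of participants together with exactly the ideas they all share. On real platform data, a dedicated algorithm would be needed instead of this exhaustive enumeration.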
