Collaborative Analytics with Genetic Programming for Workflow Recommendation

Formulating appropriate data analytics workflows requires the intricate knowledge and rich experience of data analytics experts. The problem is compounded by the continuous advancement of analytical algorithms. This paper proposes a generic, domain-independent solution for creating appropriate workflows for supervised learning problems. Our adaptive workflow recommendation engine, based on collaborative analytics, matches analytics needs with relevant workflows in a repository, and it selects workflows that perform better than randomly selected ones. The recommendation engine is now augmented by a workflow optimizer that applies genetic programming to further improve the recommended workflows through iterative evolution, yielding better alternative workflows. This Collaborative Analytics Recommender System is tested on seven UCI benchmark datasets. The results show that the final workflows produced by the system closely approximate, in terms of accuracy, the best workflows that analytics experts could design.
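To make the idea of improving recommended workflows through iterative evolution concrete, the following is a minimal sketch under stated assumptions, not the system described in the paper. It uses a simplified evolutionary search over fixed-length workflows rather than full tree-based genetic programming. The component names, the `SCORES` table, and the functions `fitness`, `mutate`, `crossover`, and `evolve` are all hypothetical; in a real setting the fitness would be something like cross-validated accuracy on the target dataset.

```python
import random

# Hypothetical workflow components (illustrative only, not from the paper).
PREPROCESSORS = ["none", "normalize", "discretize"]
SELECTORS = ["none", "info_gain", "pca"]
CLASSIFIERS = ["naive_bayes", "decision_tree", "knn"]
SLOTS = [PREPROCESSORS, SELECTORS, CLASSIFIERS]

# Stand-in scores so the sketch runs without data. A real fitness function
# would train and evaluate the workflow on a dataset.
SCORES = {
    "none": 0.0, "normalize": 0.2, "discretize": 0.1,
    "info_gain": 0.3, "pca": 0.15,
    "naive_bayes": 0.5, "decision_tree": 0.6, "knn": 0.55,
}


def fitness(workflow):
    """Toy fitness: sum of the stand-in scores of the workflow's steps."""
    return sum(SCORES[step] for step in workflow)


def mutate(workflow, rng):
    """Replace one randomly chosen step with another option for that slot."""
    i = rng.randrange(len(SLOTS))
    steps = list(workflow)
    steps[i] = rng.choice(SLOTS[i])
    return tuple(steps)


def crossover(a, b, rng):
    """One-point crossover: head of parent a, tail of parent b."""
    cut = rng.randrange(1, len(SLOTS))
    return a[:cut] + b[cut:]


def evolve(seeds, generations=20, pop_size=12, seed=0):
    """Evolve workflows starting from recommended seeds; return the best found."""
    rng = random.Random(seed)
    population = list(seeds)
    best = max(population, key=fitness)
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            a, b = rng.choice(population), rng.choice(population)
            children.append(mutate(crossover(a, b, rng), rng))
        # Elitism: the best workflow so far always survives, so the
        # result is never worse than the recommended seeds.
        candidates = list(dict.fromkeys(children + [best]))
        population = sorted(candidates, key=fitness, reverse=True)[:pop_size]
        best = population[0]
    return best


if __name__ == "__main__":
    recommended = [("none", "none", "naive_bayes"), ("normalize", "none", "knn")]
    print(evolve(recommended))
```

The seeds stand in for the workflows a recommender would return. Keeping the best workflow in every generation guarantees that the evolved result scores at least as well as the best seed under the chosen fitness.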
