DAME: A Web Oriented Infrastructure for Scientific Data Mining & Exploration

Many scientific fields now share the same need: to handle massive, distributed datasets and to perform complex knowledge extraction tasks on them. This need drives international efforts to build virtual organizations such as the Virtual Observatory (VObs). DAME (DAta Mining & Exploration) is an innovative, general-purpose, Web-based, VObs-compliant, distributed data mining infrastructure specialized in exploring Massive Data Sets with machine learning methods. Initially tuned to handle astronomical data only, DAME has evolved into a general-purpose platform that has also found applications in other domains. We present the products and a short outline of a science case, together with a detailed description of the main features available in the beta release of the web application.
