Intelligible machine learning with malibu

malibu is an open-source machine learning workbench developed in C/C++ for high-performance real-world applications, namely bioinformatics and medical informatics. It leverages third-party machine learning implementations to produce more robust, bug-free software. The workbench handles several well-studied supervised machine learning problems, including classification, regression, importance-weighted classification, and multiple-instance learning. The malibu interface was designed to create reproducible experiments, ideally run in a remote and/or command-line environment. The software can be found at: http://proteomics.bioengr.uic.edu/malibu/index.html
