Supervised Learning with Minimal Effort

Traditional supervised learning learns from whatever training examples are given to it. This is dramatically different from human learning: humans learn simple examples before conquering hard ones, thereby minimizing their effort. Effort can equate to energy consumption, and it is important for machine learning models to use minimal energy in real-world deployments. In this paper, we propose a novel, simple, and effective machine learning paradigm that explicitly exploits this simple-to-complex (S2C) human learning strategy, and we implement it efficiently on top of C4.5. Experimental results show that S2C has several distinctive advantages over the original C4.5. First, S2C does indeed take much less effort to learn the training examples than C4.5, which selects examples randomly. Second, with minimal effort, the learning process is much more stable. Finally, even though S2C only locally updates the model with minimal effort, we show that it is as accurate as the global learner C4.5. Applying this simple-to-complex learning strategy to real-world learning tasks, especially cognitive learning tasks, should prove fruitful.
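
To make the S2C idea concrete, below is a minimal sketch of simple-to-complex ordering in Python. It is not the paper's method: the difficulty proxy (a class-probability margin from a preliminary shallow tree) is an assumption of ours, and scikit-learn's DecisionTreeClassifier stands in for C4.5, since the paper's local, minimal-effort tree update is not reproduced here; the tree is simply refit on the growing example set at each stage.

# Sketch: order examples simple-to-complex, then learn in stages.
# Assumptions: margin-based difficulty proxy; DecisionTreeClassifier
# substitutes for C4.5; no local update is implemented.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Difficulty proxy (our assumption): examples whose class-probability
# margin under a shallow probe tree is large are treated as "simple".
probe = DecisionTreeClassifier(max_depth=2).fit(X, y)
proba = probe.predict_proba(X)
top2 = np.sort(proba, axis=1)[:, -2:]
margin = top2[:, 1] - top2[:, 0]     # large margin = "simple"
order = np.argsort(-margin)          # simple examples first

# Learn in stages, feeding in progressively harder examples.
model = DecisionTreeClassifier()
for frac in (0.25, 0.5, 0.75, 1.0):
    idx = order[: int(frac * len(order))]
    model.fit(X[idx], y[idx])        # refit on the accumulated set
    print(f"{frac:.0%} of data: train acc = {model.score(X, y):.3f}")

A curriculum-style ordering like this is the essential ingredient; the paper's contribution beyond it is updating the model locally at each stage rather than refitting globally as this sketch does.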
