The Divide-and-Conquer Manifesto

Existing machine learning theory and algorithms have focused on learning an unknown function from training examples, where that function maps a feature vector to one of a small number of classes. Emerging applications in science and industry require learning much more complex functions that map from complex input spaces (e.g., 2-dimensional maps, time series, and strings) to complex output spaces (e.g., other 2-dimensional maps, time series, and strings). Despite the lack of theory covering such cases, many practical systems have been built that work well in particular applications. These systems all employ some form of divide-and-conquer: the inputs and outputs are divided into smaller pieces (e.g., "windows"), each piece is classified, and the results are merged to produce an overall solution. This paper defines the problem of divide-and-conquer learning and identifies the key research questions that must be studied to develop practical, general-purpose learning algorithms for divide-and-conquer problems, together with an associated theory.
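The divide, classify, and merge pattern described above can be illustrated with a minimal sketch for string inputs. This is not the paper's algorithm: the function names (`sliding_windows`, `divide_and_conquer_predict`, `toy_classifier`), the padding symbol, and the merge-by-concatenation step are all assumptions chosen for illustration, and the toy classifier stands in for whatever learned model a real system would use.

```python
from typing import Callable, List, Sequence, Tuple


def sliding_windows(seq: Sequence[str], width: int, pad: str = "_") -> List[Tuple[str, ...]]:
    """Divide: build one fixed-width window centered on each position of seq."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd number")
    half = width // 2
    # Pad both ends so that every position has a full window of context.
    padded = [pad] * half + list(seq) + [pad] * half
    return [tuple(padded[i:i + width]) for i in range(len(seq))]


def divide_and_conquer_predict(
    seq: Sequence[str],
    classify: Callable[[Tuple[str, ...]], str],
    width: int = 3,
) -> str:
    """Classify each window independently, then merge the per-window labels."""
    windows = sliding_windows(seq, width)
    labels = [classify(window) for window in windows]
    # Merge (assumed here): concatenate labels in input order.
    return "".join(labels)


def toy_classifier(window: Tuple[str, ...]) -> str:
    """Hypothetical stand-in for a learned classifier: labels the center symbol."""
    center = window[len(window) // 2]
    return "V" if center.lower() in "aeiou" else "C"


if __name__ == "__main__":
    print(sliding_windows("cat", 3))
    print(divide_and_conquer_predict("cat", toy_classifier))
```

In this sketch each window is classified in isolation, so the merge step is trivial. The systems the abstract refers to may use more elaborate merging, which is one of the design choices a general theory of divide-and-conquer learning would need to address.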
