Challenges for the Application of Machine Learning

By most accounts, the applied branch of machine learning has been a clear success. Induction techniques have aided the development of fielded systems in science and industry, on a range of tasks, including me… Researchers in the area can feel genuinely proud that their algorithms have proven so robust, and developers deserve major credit for identifying promising applications and seeing them through to completion.

The basic development story should by now be quite familiar. The developer works with a domain expert or user to understand some problem, and then reformulates the problem into one that can be addressed by well-established methods for supervised learning. He then determines a set of likely features to describe the training cases, and devises an approach to collecting and preparing those data. Once these are available, he runs some induction method over the data. The developer (and possibly the expert) then evaluate the resulting knowledge base along dimensions of interest, such as accuracy, understandability, and consistency with existing domain knowledge. If the result seems acceptable, they attempt to deploy the learned knowledge in the field.¹

¹ Of course, this process is not linear but iterative. Problems at any step can lead the developer to backtrack to an earlier stage and revisit decisions made there.

Applied work in machine learning differs from academic learning research in its acknowledgement of this development process. Most research papers on learning continue to emphasize refinements in the induction technique and ignore the steps that must occur before and after its invocation. In contrast, applied efforts recognize the importance of problem formulation, representation engineering, data collection and preparation, inspection of the learned knowledge, and the fielding process itself. Within the applications community, there is an emerging consensus that these steps play a role at least as important as the induction stage itself.
Indeed, there is even a common belief that, once they are handled, the particular induction method one uses has little effect on the outcome.

2 Automating the Overall Process

Clearly, the applied induction community could continue along these lines and be quite successful. The discipline could use its established approach to develop more fielded applications and train a cadre of problem and representation engineers to become expert at the overall process. Over time, this new generation of developers would come to replace the knowledge engineers currently charged with creating knowledge-based systems. But this is a limited vision, and …