Learning figures with the Hausdorff metric by fractals—towards computable binary classification

We present a method for learning figures, i.e., nonempty compact sets in Euclidean space, based on Gold's learning model, aiming at a computable foundation for binary classification of multivariate data. Encoding real vectors with no numerical error requires infinite sequences, resulting in a gap between each real vector and the discretized representation used in the actual machine learning process. Our motivation is to provide an analysis of machine learning problems that explicitly addresses this gap, which has been glossed over in the literature on binary classification as well as in other machine learning tasks such as regression and clustering. In this paper, we amalgamate two processes: discretization and binary classification. Each learning target, the set of real vectors classified as positive, is treated as a figure. A learning machine receives discretized vectors as input data and outputs a sequence of discrete representations of the target figure in the form of self-similar sets, known as fractals. The generalization error of each output is measured by the Hausdorff metric. Using this framework, we reveal a hierarchy of learnable classes under various learning criteria, following the traditional analysis based on Gold's learning model, and we establish a mathematical connection between machine learning and fractal geometry by measuring the complexity of learning with the Hausdorff dimension and the VC dimension. Moreover, we analyze the computability aspects of learning figures using the framework of Type-2 Theory of Effectivity (TTE).
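
A minimal sketch (not from the paper) illustrating two of the ingredients above: a hypothesis figure generated as a self-similar set from an iterated function system (IFS), and the Hausdorff metric used to measure the generalization error between a finite approximation of the hypothesis and a finite sample of the target figure. The IFS maps, the target set, and the discretization depth are all hypothetical choices made for this example.

```python
import itertools
import numpy as np

def ifs_attractor(maps, depth=8):
    """Approximate the attractor (self-similar set) of an IFS by iterating
    the contraction maps on a seed point and collecting the resulting orbit."""
    points = [np.zeros(2)]
    for _ in range(depth):
        points = [f(p) for f in maps for p in points]
    return np.array(points)

def hausdorff_distance(A, B):
    """Hausdorff distance between two finite point sets A and B."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

# Hypothetical hypothesis: the Sierpinski triangle, the attractor of three
# contractions of ratio 1/2 (a standard self-similar set).
sierpinski = ifs_attractor([
    lambda p: 0.5 * p,
    lambda p: 0.5 * p + np.array([0.5, 0.0]),
    lambda p: 0.5 * p + np.array([0.25, 0.5]),
])

# Hypothetical target figure: a discretized unit square (positive examples).
grid = np.array(list(itertools.product(np.linspace(0, 1, 17), repeat=2)))

# Generalization error of the hypothesis, measured by the Hausdorff metric.
print("Hausdorff distance:", hausdorff_distance(sierpinski, grid))
```

In the paper's setting the hypothesis and the target are genuine compact sets and the error is their exact Hausdorff distance; the finite point sets above only stand in for their discretized representations.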
