Lattice Machine Classification Based on Contextual Probability

In this paper we review the Lattice Machine, a learning paradigm that “learns” by generalising data in a consistent, conservative and parsimonious way, and that has the advantage of providing additional reliability information for each classification. More specifically, we review the related concepts of hyper tuples and hyper relations; the three generalising criteria of equilabelledness, maximality and supportedness; and the modelling and classifying algorithms. In an attempt to find a better classification method for the Lattice Machine, we consider contextual probability, which was originally proposed as a measure for approximate reasoning when data are insufficient. It was later found to be a probability function with the same classification ability as the data-generating probability, called the primary probability, and to offer an alternative way of estimating the primary probability without strong model assumptions. Consequently, a Bayes classifier based on contextual probability can be designed. In this paper we present a new classifier that utilises the Lattice Machine model and generalises the contextual probability based Bayes classifier. We interpret the model as a dense set of data points in the data space and then apply the contextual probability based Bayes classifier to it. A theorem is presented that allows efficient estimation of the contextual probability under this interpretation. The proposed classifier is illustrated by examples.
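To make the above concrete, here is a minimal Python sketch, an illustration under simplifying assumptions rather than the paper's exact algorithms: a hyper tuple over numeric attributes is taken to be an axis-aligned box, equilabelledness is checked against training data, and a query is classified by the overlap volume between a small neighbourhood box around it and each class's hyper tuples, reading each hyper tuple as a dense set of points as described above. The box representation, the `radius` parameter and the volume-based score are all illustrative choices.

```python
# Illustrative sketch only: hyper tuples as axis-aligned boxes, with a
# neighbourhood-overlap score in the spirit of contextual probability.
import numpy as np

def hyper_tuple(points):
    """Smallest axis-aligned box (hyper tuple) covering the given points."""
    pts = np.asarray(points, dtype=float)
    return pts.min(axis=0), pts.max(axis=0)   # (lower corner, upper corner)

def covers(box, x):
    """True if point x lies inside the box."""
    lo, hi = box
    return bool(np.all(lo <= x) and np.all(x <= hi))

def equilabelled(box, X, y, label):
    """A hyper tuple is equilabelled if every training point it covers
    carries the given label (one of the LM generalising criteria)."""
    return all(yi == label for xi, yi in zip(X, y) if covers(box, xi))

def overlap_volume(box_a, box_b):
    """Volume of the intersection of two boxes (0 if they are disjoint)."""
    lo = np.maximum(box_a[0], box_b[0])
    hi = np.minimum(box_a[1], box_b[1])
    return float(np.prod(np.clip(hi - lo, 0.0, None)))

def classify(x, model, radius=0.5):
    """Score each class by the overlap between a neighbourhood box around
    x and that class's hyper tuples; each hyper tuple is read as a dense
    set of points, so overlap volume stands in for probability mass."""
    x = np.asarray(x, dtype=float)
    nbhd = (x - radius, x + radius)           # neighbourhood of the query
    scores = {label: sum(overlap_volume(nbhd, ht) for ht in tuples)
              for label, tuples in model.items()}
    return max(scores, key=scores.get)

# Toy data: one equilabelled hyper tuple per class.
X = [[0.1, 0.2], [0.9, 0.8], [2.5, 2.5]]
y = ["a", "a", "b"]
model = {"a": [hyper_tuple(X[:2])], "b": [hyper_tuple(X[2:])]}
print(equilabelled(model["a"][0], X, y, "a"))  # True
print(classify([0.8, 0.9], model))             # -> 'a'
print(classify([2.4, 2.1], model))             # -> 'b'
```

Here the overlap volume plays the role of the probability mass a uniform density would assign to the intersection; the theorem mentioned in the abstract is what permits an efficient and principled estimate of the contextual probability itself under the dense-set interpretation.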
