Rough Set Based Clustering Using Active Learning Approach

This paper revisits the problem of active learning and decision making when labeling is costly and unlabeled data is abundant. In many real-world applications, large amounts of data are available, but the cost of labeling it correctly prohibits its use. In such cases, active learning can be employed. The authors propose rough set based clustering using an active learning approach. They extend the basic notion of Hamming distance to propose a dissimilarity measure that helps find the approximations of clusters in a given data set. The underlying theoretical background for this approach is rough set theory. The authors evaluated their algorithm on benchmark data sets from the UCI Machine Learning Repository, where it showed promising results.
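The two building blocks named in the abstract can be illustrated with a minimal Python sketch. It shows only the basic Hamming distance between categorical records and the standard rough set lower and upper approximations of a set of records; it is not the authors' extended dissimilarity measure or their clustering algorithm, which the abstract does not specify. The function names `hamming` and `approximations` and the toy records are assumptions made for illustration.

```python
from collections import defaultdict


def hamming(a, b):
    """Count the attribute positions at which two equal-length records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def approximations(records, target, attrs):
    """Return the (lower, upper) approximation of `target`, a set of record
    indices, under indiscernibility on the attribute positions in `attrs`."""
    # Group records that agree on every chosen attribute (equivalence classes).
    classes = defaultdict(set)
    for i, record in enumerate(records):
        classes[tuple(record[j] for j in attrs)].add(i)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # class lies entirely inside the target
            lower |= block
        if block & target:    # class overlaps the target
            upper |= block
    return lower, upper


# Hypothetical toy data: (colour, size, shape)
records = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "round"),
    ("blue", "large", "round"),
]

distance = hamming(records[0], records[1])
lower, upper = approximations(records, {0, 2, 3}, attrs=(0, 1))
print(distance, sorted(lower), sorted(upper))
```

Here records 0 and 1 are indiscernible on colour and size, so a candidate cluster containing record 0 but not record 1 has a lower approximation that excludes both and an upper approximation that includes both. Records in the upper but not the lower approximation form the boundary region, which is where a labeling query would plausibly be most informative in an active learning setting.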
