Adaptive prototype-based fuzzy classification

Classifying large datasets without any a priori information is a hard problem, especially in bioinformatics. In this work, we explore the problem of classifying hundreds of thousands of cell assay images obtained by a high-throughput screening camera. The goal is to label a few selected examples by hand and then to label the remaining images automatically. Until now, such images have been classified by scripts and classification techniques designed to tackle one specific problem. We propose a new adaptive active clustering scheme based on an initial fuzzy c-means clustering and learning vector quantization. The scheme first clusters large datasets without supervision and then lets the user adjust the classification. Motivated by the concept of active learning, the learner queries the most "useful" examples during the learning process and thereby keeps the cost of supervision low. We introduce a framework for the classification of cell assay images based on this technique and compare our approach with related techniques on several datasets.
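To make the ingredients named in the abstract concrete, the following is a minimal sketch, not the method as implemented in this work. It assumes textbook fuzzy c-means for the unsupervised initialization, a plain LVQ1 update for the user-driven adjustment, and, as a stand-in for "most useful", the examples whose two largest memberships are closest. The function names, parameters, and this selection criterion are all illustrative assumptions.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Textbook fuzzy c-means; returns prototypes V (c, d) and memberships U (n, c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)           # each row of U sums to 1
    for _ in range(n_iter):
        W = U ** m                               # fuzzified memberships
        V = (W.T @ X) / W.sum(axis=0)[:, None]   # membership-weighted means
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                    # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return V, U

def most_ambiguous(U, k):
    """Assumed query criterion: the k samples with the smallest gap
    between their two largest memberships (requires at least 2 clusters)."""
    top2 = np.sort(U, axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]
    return np.argsort(margin)[:k]

def lvq1_step(V, proto_labels, x, y, lr=0.1):
    """One LVQ1 update in place: pull the nearest prototype toward x if its
    label matches the user-provided label y, otherwise push it away."""
    j = int(np.argmin(np.linalg.norm(V - x, axis=1)))
    sign = 1.0 if proto_labels[j] == y else -1.0
    V[j] += sign * lr * (x - V[j])
    return j
```

In an interactive loop one would cluster once with `fuzzy_c_means`, show the user the indices returned by `most_ambiguous`, and apply `lvq1_step` for each label the user supplies. How the scheme in this work actually selects queries and updates prototypes is not specified in the abstract.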
