Clustering and Classification Algorithms in Food and Agricultural Applications: A Survey

Data mining has become an important tool for information analysis in many disciplines. Data clustering, also known as unsupervised classification, is a popular data-mining technique. Clustering is challenging because little or no prior knowledge is available. A review of the literature reveals researchers’ interest in developing efficient clustering algorithms and applying them to a variety of real-life situations. This chapter presents the fundamental concepts of widely used clustering and classification algorithms, including k-means, k-nearest neighbor, artificial neural networks, and fuzzy c-means. We also discuss applications of these algorithms in the food and agricultural sciences, including fruit classification, machine vision, wine classification, and the analysis of remotely sensed forest images.
