Mining visual web knowledge utilizing multiple classifier architecture

Despite the huge amount of image data on the web, mining images from the web receives less attention than mining text, because the semantics of images are much harder to handle. This paper introduces a new system for mining visual knowledge on the web; it aims to build a Domain Oriented Image Directory using the Earth Mover's Distance and color signatures. Instead of using a flat classifier to combine text and image classification, the system divides the classification task into smaller classification problems corresponding to the branches of the classification hierarchy, which yields a multiple classifier system. The paper describes the proposed system and discusses each of its components. Extensive experiments were conducted to test the system and to compare it with commercial search engines. The experiments show that the proposed system is more accurate than the most widely used commercial search engines.
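
The idea of replacing one flat classifier with several small ones, one per branch of the hierarchy, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the category names, the two feature names (`text_animal`, `sky_blue`), the thresholds, and the `Node`/`classify` interface are all hypothetical. Each internal node owns a local classifier that only chooses among its own children, and an image is labeled by walking from the root to a leaf.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Features = Dict[str, float]


@dataclass
class Node:
    """One node of the class hierarchy (hypothetical structure)."""
    label: str
    # Local classifier: returns the label of one child; None for leaves.
    choose: Optional[Callable[[Features], str]] = None
    children: Dict[str, "Node"] = field(default_factory=dict)


def classify(root: Node, x: Features) -> List[str]:
    """Descend the hierarchy, asking only the local classifier at each node."""
    path = [root.label]
    node = root
    while node.children:
        node = node.children[node.choose(x)]
        path.append(node.label)
    return path


# Toy hierarchy with made-up categories and threshold rules.
# The root decides from a text-derived score; the branches decide
# from a color-derived score.
animals = Node(
    "animals",
    choose=lambda x: "birds" if x["sky_blue"] >= 0.5 else "cats",
    children={"birds": Node("birds"), "cats": Node("cats")},
)
vehicles = Node(
    "vehicles",
    choose=lambda x: "planes" if x["sky_blue"] >= 0.5 else "cars",
    children={"planes": Node("planes"), "cars": Node("cars")},
)
root = Node(
    "images",
    choose=lambda x: "animals" if x["text_animal"] >= 0.5 else "vehicles",
    children={"animals": animals, "vehicles": vehicles},
)

print(classify(root, {"text_animal": 0.8, "sky_blue": 0.7}))
print(classify(root, {"text_animal": 0.1, "sky_blue": 0.2}))
```

In this sketch each local classifier sees only the decision relevant to its branch, so a real system could train a different model, or use different features, at each node. The threshold rules stand in for whatever trained classifiers the actual system uses.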