Paper Information - Subspace-based Semantic Concept Detection and Retrieval for Multimedia Information Systems

Dissertation at the University of Miami, supervised by Professor Mei-Ling Shyu. No. of pages in text: 170.

The prevalence of digital recording devices, the low cost of data storage, and the convenience of the widely accessible Internet have created a demand for retrieving information from multimedia data sources according to users’ requests. However, multimedia information retrieval poses several challenges that need to be addressed, such as bridging the semantic gap, modeling from imbalanced data sets, and utilizing inter-concept relationships to enhance the retrieval performance of an individual concept.

To bridge the semantic gap, subspace modeling methods are proposed that treat the problem as a classification task. These methods construct a principal component (PC) subspace for each class, where the PCs are derived from the instances belonging to that class. The PCs are selected and ranked based on Fisher’s criterion to reduce the search effort, and an iterative search is used to determine the best PC set. The subspace modeling methods proposed in this dissertation include multi-class subspace modeling (MSM), binary-class subspace modeling (BSM), and subspace modeling on global and local structures (SMGL). Comparative experiments show that MSM, BSM, and SMGL can outperform some other well-known algorithms on a number of benchmark data sets.

To address the data imbalance challenge, two clustering-based subspace modeling methods are proposed: clustering-based subspace modeling (CLU-SUMO) and class selection and clustering-based subspace modeling (CSC-SUMO). K-means clustering and/or semantic concept labels are used to partition the majority class (usually the negative class) into several groups, each of which is merged with the minority class (usually the positive class) to form a much more balanced subset of the original data set. Then, the subspace model learned from the original data set is integrated with all the subspace models learned from the balanced subsets to form a classification framework. The experimental results on news and broadcast video data sets support the claim that the proposed CLU-SUMO and CSC-SUMO render better classification performance than some existing techniques that are commonly used to handle the data imbalance
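The partitioning step described for the clustering-based methods can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names `kmeans` and `balanced_subsets`, the plain Lloyd's-style K-means, the number of clusters, and the toy data are all assumptions made for illustration. The sketch only covers splitting the majority class into groups and merging each group with the minority class; training subspace models on the subsets and integrating them is not shown.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means (hypothetical helper); returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members (empty clusters keep their center).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def balanced_subsets(majority, minority, k, seed=0):
    """Split the majority class into up to k groups and merge each with the full minority class.

    Returns a list of (X, y) pairs, with y = 0 for majority and y = 1 for minority instances.
    """
    labels = kmeans(majority, k, seed=seed)
    subsets = []
    for c in range(k):
        group = [p for p, lab in zip(majority, labels) if lab == c]
        if group:  # skip clusters that ended up empty
            X = group + list(minority)
            y = [0] * len(group) + [1] * len(minority)
            subsets.append((X, y))
    return subsets


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy data (assumed): 60 negative instances, 6 positive instances, 2 features each.
    negatives = [[rng.gauss(0, 3), rng.gauss(0, 3)] for _ in range(60)]
    positives = [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(6)]
    for X, y in balanced_subsets(negatives, positives, k=5):
        print(f"subset: {y.count(0)} negative, {y.count(1)} positive")
```

Each resulting subset has a lower negative-to-positive ratio than the original data set, which is the property the abstract relies on. How many groups to use, and how semantic concept labels would replace or complement K-means in CSC-SUMO, are not specified in the abstract and are left open here.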

Chao Chen
