Integrating deep learning with correlation-based multimedia semantic concept detection

ABSTRACT OF THE DISSERTATION

INTEGRATING DEEP LEARNING WITH CORRELATION-BASED MULTIMEDIA SEMANTIC CONCEPT DETECTION

by Hsin-Yu Ha

Florida International University, 2015, Miami, Florida

Professor Shu-Ching Chen, Major Professor

Rapid advances in technology have made the explosive growth of multimedia data possible and have made that data available to the public. Multimedia data can be defined as a data collection composed of various data types and different representations. Because multimedia data carries rich information, it has been widely adopted in different application areas, such as surveillance event detection, medical abnormality detection, and many others. To fulfill the varying requirements of different applications, it is important to classify multimedia data effectively into semantic concepts across multiple domains.

In this dissertation, a correlation-based multimedia semantic concept detection framework is integrated with deep learning. The framework aims to explore implicit and explicit correlations among features and concepts while adopting different Convolutional Neural Network (CNN) architectures accordingly. First, the Feature Correlation Maximum Spanning Tree (FC-MST) is proposed to remove redundant and irrelevant features based on the correlations between the features and positive concepts. FC-MST identifies the effective features and determines the dimension of the initial layer in the CNNs. Second, the Negative-based Sampling method is proposed to alleviate the data imbalance issue by keeping only the representative negative instances in the training process. To adjust different sizes of training data, the number of iterations for the CNN
