Supporting Semantic Concept Retrieval with Negative Correlations in a Multimedia Big Data Mining System

With the widespread use of smart devices and the growing popularity of social media websites such as Flickr, YouTube, Twitter, and Facebook, we have witnessed an explosion of multimedia data. Data at this scale cannot be handled without effective big data technologies. Multimedia high-level semantic concept mining and retrieval is widely acknowledged as an important research topic, and the semantic gap (i.e., the gap between low-level features and high-level concepts) makes it even more challenging. Addressing these challenges requires joint research efforts from both the big data mining and multimedia areas. In particular, the correlations among classes can provide important context cues that help bridge the semantic gap. However, correlation discovery is computationally expensive because of the huge amount of data. In this paper, a novel multimedia big data mining system based on the MapReduce framework is proposed to discover negative correlations for semantic concept mining and retrieval. The proposed system is built on a big data processing platform that uses Mesos for efficient resource management and Cassandra for handling data across multiple data centers. Experimental results on the TRECVID benchmark datasets demonstrate the feasibility and effectiveness of the proposed system with negative correlation discovery for semantic concept mining and retrieval.
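
To make the idea of negative correlation discovery in a map/reduce style concrete, the following is a minimal, single-process sketch. It is not the system described in the paper: the choice of the phi coefficient as the correlation measure, the function names (`map_record`, `reduce_counts`, `phi`, `negative_correlations`), and the toy concept labels are all assumptions made for illustration. The map step emits counts for each concept and each co-occurring concept pair, the reduce step sums them, and concept pairs whose correlation falls below a threshold are reported as negatively correlated.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def map_record(labels):
    """Map step (hypothetical): emit (key, 1) for each concept in one
    annotated item and for each pair of concepts that co-occur in it."""
    present = sorted(set(labels))
    for concept in present:
        yield (concept,), 1
    for a, b in combinations(present, 2):
        yield (a, b), 1


def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the values emitted for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def phi(n, n_a, n_b, n_ab):
    """Phi coefficient, i.e. the Pearson correlation of two binary variables.
    Assumed here as the correlation measure; the paper may use another."""
    denom = sqrt(n_a * (n - n_a) * n_b * (n - n_b))
    if denom == 0:
        return 0.0  # a concept that is always or never present carries no signal
    return (n * n_ab - n_a * n_b) / denom


def negative_correlations(records, threshold=0.0):
    """Return {(concept_a, concept_b): phi} for pairs with phi below threshold."""
    records = list(records)
    n = len(records)
    counts = reduce_counts(kv for record in records for kv in map_record(record))
    concepts = sorted(key[0] for key in counts if len(key) == 1)
    result = {}
    for a, b in combinations(concepts, 2):
        score = phi(n, counts[(a,)], counts[(b,)], counts[(a, b)])
        if score < threshold:
            result[(a, b)] = score
    return result


if __name__ == "__main__":
    # Toy annotations: each set lists the concepts labeled on one video shot.
    shots = [{"indoor", "person"}, {"outdoor", "sky"}, {"outdoor", "sky"}, {"indoor"}]
    for pair, score in sorted(negative_correlations(shots).items()):
        print(pair, round(score, 3))
```

In an actual MapReduce deployment the map and reduce functions would run in parallel across data partitions; here they are chained in one process only to show the data flow.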
