Evolutionary concept learning from cartoon videos by multimodal hypernetworks

Concepts have been widely used for categorizing and representing knowledge in artificial intelligence. Previous research on concept learning has focused on unimodal data, usually in linguistic domains and in static environments. Concept learning from multimodal stream data, such as videos, remains a challenge because the data change dynamically and are high-dimensional. Here we propose an evolutionary method that simulates the process of human concept learning from multimodal video streams. The method rests on two key ideas: representing concepts as a large collection (population) of hyperedges, that is, a hypergraph, and learning incrementally from video streams with an evolutionary approach. The hypergraph is learned "evolutionarily" by repeatedly generating and selecting hyperedge concepts from the video data. The advantage of this evolutionary learning process is that the population-based distributed coding allows flexible and robust tracing of changes in concept relations as the video story unfolds. We evaluate the proposed method on a suite of children's cartoon videos with a total playing time of 517 minutes. Experimental results show that the proposed method effectively represents visual-textual concept relations and models conceptual change as an evolutionary process. We also investigate the structural properties of the constructed concept networks.
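The generate-and-select loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the token format (`w:` for words, `v:` for visual patches), the function names, the weight update (decay, then reinforce hyperedges that match the current scene), and all parameter values are assumptions made only for illustration.

```python
import random
from collections import Counter

def sample_hyperedges(scene, order, count, rng):
    """Draw `count` random hyperedges (token subsets of size `order`) from one scene."""
    tokens = sorted(scene)  # sorted so that sampling is reproducible for a given seed
    if not tokens:
        return []
    k = min(order, len(tokens))
    return [frozenset(rng.sample(tokens, k)) for _ in range(count)]

def evolve(stream, order=3, per_scene=20, capacity=200, decay=0.9, seed=0):
    """Incrementally evolve a weighted population of hyperedges over a scene stream.

    Hypothetical update rule (an assumption, not taken from the paper):
    decay all weights, reinforce hyperedges contained in the current scene,
    add newly generated hyperedges, then keep the `capacity` heaviest ones.
    """
    rng = random.Random(seed)
    population = Counter()
    for scene in stream:
        scene = set(scene)
        # Decay and reinforcement: old hyperedges fade unless the scene supports them.
        for edge in population:
            population[edge] *= decay
            if edge <= scene:
                population[edge] += 1.0
        # Generation: sample new candidate hyperedges from the current scene.
        for edge in sample_hyperedges(scene, order, per_scene, rng):
            population[edge] += 1.0
        # Selection: keep only the heaviest hyperedges.
        population = Counter(dict(population.most_common(capacity)))
    return population

if __name__ == "__main__":
    # Each scene mixes hypothetical word tokens and visual-patch tokens.
    stream = [
        {"w:bear", "w:honey", "v:1", "v:2"},
        {"w:bear", "w:tree", "v:1", "v:3"},
    ] * 5
    for edge, weight in evolve(stream, order=2, per_scene=10, capacity=8).most_common(3):
        print(sorted(edge), round(weight, 2))
```

Because each hyperedge mixes word and visual tokens drawn from the same scene, hyperedges that keep recurring accumulate weight and survive selection, while those tied to scenes that no longer appear decay and are eventually displaced. That is one simple way a population could track changing concept relations over a stream.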
