Experiential Sampling in Multimedia Systems

Multimedia systems must deal with multiple data streams, each of which usually contains a significant volume of redundant and noisy data. In many real-time applications, it is essential to focus computing resources on a relevant subset of the data streams at any given time instant and to use that subset to build a model of the environment. We formulate this as an experiential sampling problem and propose an approach that spends computing resources efficiently on the most informative subset of data streams. In this paper, we first focus on the theoretical background and develop a theoretical framework for a single data stream. We generalize the notion of static visual attention to a dynamical systems setting and propose a dynamical attention-oriented analysis method. This is achieved by a sampling representation that uses the current context and past experience to evolve attention. Hence, the multimedia analysis task at hand can select its data of interest while immediately discarding irrelevant data, achieving both efficiency and adaptability.
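
As a rough illustration of how a sampling representation might let attention evolve from current context and past experience, the sketch below runs one update step over a one-dimensional frame. It is not the method of the paper: the function name `evolve_attention`, its parameters, the uniform probing of the new frame, and the weighted resampling step are all assumptions made here for illustration.

```python
import random

def evolve_attention(samples, saliency, width, n_samples=30,
                     n_fresh=20, jitter=2.0, rng=random):
    """One hypothetical attention-evolution step over positions 0..width-1."""
    # Past experience: carry earlier attention samples forward with a small drift.
    carried = [min(width - 1, max(0.0, s + rng.gauss(0.0, jitter)))
               for s in samples]
    # Current context: a few uniformly placed samples probe the new frame.
    fresh = [rng.uniform(0, width - 1) for _ in range(n_fresh)]
    candidates = carried + fresh
    # Weight every candidate by the saliency of the data at its position.
    weights = [saliency(int(round(c))) for c in candidates]
    if sum(weights) <= 0:
        return []  # nothing salient was found: no attention, frame can be skipped
    # Resample so that attention concentrates on the informative positions.
    return rng.choices(candidates, weights=weights, k=n_samples)

# Toy frame (an assumption): only positions 40..49 carry relevant data.
frame = [1.0 if 40 <= i < 50 else 0.0 for i in range(100)]
attention = [45.0] * 30
for _ in range(5):
    attention = evolve_attention(attention, lambda i: frame[i], len(frame))
```

In this toy setting, positions with zero saliency receive zero weight and are never resampled, so later processing would only need to look at the positions that attention samples occupy. An empty result signals that the frame can be discarded without further analysis.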
