MoVi: mobile phone based video highlights via collaborative sensing

Sensor networks have conventionally been defined as networks of sensor motes that collaboratively detect events and report them to a remote monitoring station. This paper attempts to extend this notion to the social context by using mobile phones in place of motes. We envision a social application in which mobile phones collaboratively sense their ambience and recognize socially "interesting" events. The phone with a good view of the event triggers a video recording, and later the video clips from different phones are "stitched" into video highlights of the occasion. We observe that such video highlights are akin to the notion of event coverage in conventional sensor networks; only the notion of "event" has changed from physical to social. We have built a Mobile Phone based Video Highlights system (MoVi) using Nokia phones and iPod Nanos, and have experimented with it in real-life social gatherings. Results show that MoVi-generated video highlights (created offline) are quite similar to those created manually (i.e., by painstakingly editing the entire video of the occasion). In that sense, MoVi can be viewed as a collaborative information distillation tool capable of filtering events of social relevance.
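
To make the "good view" selection and "stitching" steps concrete, the following is a minimal sketch in Python and not MoVi's actual implementation. It assumes that interesting events have already been detected as time intervals and that each recorded clip carries a numeric quality-of-view score. The names `Clip`, `Event`, `view_score`, `best_clip_for_event`, and `stitch_highlights` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Clip:
    phone_id: str      # which phone recorded the clip
    start: float       # seconds from the start of the occasion
    end: float
    view_score: float  # hypothetical quality-of-view score, higher is better


@dataclass
class Event:
    start: float       # seconds from the start of the occasion
    end: float


def best_clip_for_event(event: Event, clips: List[Clip]) -> Optional[Clip]:
    """Return the clip overlapping the event with the highest view score."""
    candidates = [c for c in clips if c.start < event.end and c.end > event.start]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.view_score)


def stitch_highlights(events: List[Event], clips: List[Clip]) -> List[Tuple[str, float, float]]:
    """Build a time-ordered list of (phone_id, start, end) highlight segments."""
    highlights = []
    for event in sorted(events, key=lambda e: e.start):
        clip = best_clip_for_event(event, clips)
        if clip is not None:
            # Keep only the part of the clip that overlaps the event.
            segment = (clip.phone_id, max(clip.start, event.start), min(clip.end, event.end))
            highlights.append(segment)
    return highlights


if __name__ == "__main__":
    events = [Event(10, 20), Event(40, 50)]
    clips = [Clip("A", 0, 30, 0.4), Clip("B", 5, 25, 0.9), Clip("C", 35, 45, 0.7)]
    for phone_id, start, end in stitch_highlights(events, clips):
        print(phone_id, start, end)
```

In this sketch an event with no overlapping clip is simply skipped; how a real system detects events or scores a phone's view is left open here.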
