LAVES: an instant mobile video search system based on layered audio-video indexing

This demonstration presents "LAVES," an instant mobile video search system based on layered audio-video indexing. With the system, users can discover videos simply by pointing their phones at a screen and capturing a few seconds of what they are watching. Unlike most existing mobile video search applications, which simply send the original video query to the cloud, the proposed system is one of the first attempts at instant and progressive video search that leverages the light-weight computing capacity of mobile devices. On the cloud, the system indexes large-scale video data using the layered audio-video indexing technique. On the device, it extracts light-weight joint audio-video signatures in real time and performs a bipartite-graph-based progressive search. On a 600-hour video dataset, the system outperforms state-of-the-art methods, achieving 90.79% precision when the query video is shorter than 10 seconds.
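The idea of progressive search, where the query grows second by second and the search stops as soon as the result is stable, can be illustrated with a minimal sketch. This is not the system's bipartite-graph-based method or its layered index: it swaps in plain per-second voting over binary signatures compared by Hamming distance. The function names, the 8-bit integer signatures, and the parameters `max_dist` and `stable_rounds` are all hypothetical choices made for illustration.

```python
from collections import Counter


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded signatures."""
    return bin(a ^ b).count("1")


def progressive_search(query_signatures, index, max_dist=2, stable_rounds=2):
    """Simplified progressive search by voting (an illustrative assumption,
    not the bipartite-graph-based method named in the abstract).

    query_signatures: iterable of per-second binary signatures (ints).
    index: dict mapping a video id to a list of its per-second signatures.
    Returns (best_video_id, seconds_used).
    """
    votes = Counter()
    best, streak = None, 0
    seconds_used = 0
    for seconds_used, sig in enumerate(query_signatures, start=1):
        # Each video whose index holds a near-duplicate signature gets a vote.
        for video_id, video_sigs in index.items():
            if any(hamming(sig, s) <= max_dist for s in video_sigs):
                votes[video_id] += 1
        if not votes:
            continue
        top = votes.most_common(1)[0][0]
        # Count how many consecutive seconds the same video stayed on top.
        streak = streak + 1 if top == best else 1
        best = top
        if streak >= stable_rounds:
            break  # result is stable, so stop consuming the query early
    return best, seconds_used
```

In this sketch the search returns as soon as the same video has led the vote for `stable_rounds` consecutive seconds, so a clean query can finish well before the full capture is consumed.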
