The Manna Plug-In Architecture for Content-based Search of VM Clouds

As cloud computing becomes more popular, collections of virtual machine (VM) images are growing in size. Managing these collections requires the ability to inspect and search the data stored within VM images. We present Manna, a plug-in-based architecture for efficiently searching state within VM images through both indexed and non-indexed search. The architecture offers a flexible framework for building a wide range of new applications that are valuable to both end users and administrators of VM images. We showcase this flexibility through three applications built on Manna's API: one for searching images, one for searching source code, and one for virus scanning. Efficient search across such diverse applications is achieved through two independent mechanisms: plug-ins that are specific to a data type but independent of the data source, and VM-specific metadata that shrinks the search space of non-indexed data. Trace-driven measurements on our prototype confirm that Manna searches incur low performance overhead.
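
The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not Manna's actual API, which the abstract does not describe: `FileRecord`, `SourceCodePlugin`, `accepts`, `matches`, and `search` are hypothetical names chosen for illustration. The sketch assumes that a plug-in sees only raw bytes (so it does not depend on where the data came from) and that per-file metadata is available cheaply enough to prune candidates before any content is read.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata record for one file inside a VM image."""
    image_id: str
    path: str
    size: int
    read: Callable[[], bytes]  # source-specific loader, opaque to plug-ins


class SourceCodePlugin:
    """Hypothetical data-type-specific plug-in: substring search in source files."""
    extensions = (".c", ".h", ".py")

    def __init__(self, needle: str) -> None:
        self.needle = needle.encode()

    def accepts(self, record: FileRecord) -> bool:
        # Metadata-only check: no file content is read here.
        return record.path.endswith(self.extensions) and record.size > 0

    def matches(self, data: bytes) -> bool:
        # Operates on raw bytes, regardless of which image or store they came from.
        return self.needle in data


def search(records: Iterable[FileRecord], plugin: SourceCodePlugin) -> Iterator[FileRecord]:
    """Non-indexed search: prune by metadata first, then scan the survivors."""
    for record in records:
        if not plugin.accepts(record):
            continue
        if plugin.matches(record.read()):
            yield record


def demo() -> List[str]:
    # Made-up in-memory files standing in for the contents of one VM image.
    contents = {
        "/src/main.c": b"int main(void) { return manna_init(); }",
        "/src/util.h": b"void helper(void);",
        "/var/log/app.log": b"manna_init called",
    }
    records = [
        FileRecord("vm-001", path, len(data), (lambda d=data: d))
        for path, data in contents.items()
    ]
    return [r.path for r in search(records, SourceCodePlugin("manna_init"))]
```

In this sketch the log file contains the search string but is never read, because the metadata check rejects it first; that is the sense in which metadata shrinks the search space of non-indexed data.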
