Agentless cloud-wide monitoring of virtual disk state

Abstract: This dissertation proposes a fundamentally different way of monitoring persistent storage. It introduces a monitoring platform based on the modern reality of software-defined storage, which enables the decoupling of policy from mechanism. The proposed platform is both agentless, meaning it operates external to and independent of the entities it monitors, and scalable, meaning it is designed to address many systems at once with a mixture of operating systems and applications. Concretely, this dissertation focuses on virtualized clouds, but the proposed monitoring platform generalizes to any form of persistent storage. The core mechanism this dissertation introduces is called Distributed Streaming Virtual Machine Introspection (DS-VMI), and it leverages two properties of modern clouds: virtualized servers managed by hypervisors, which enable efficient introspection, and file-level duplication of data within cloud instances. We explore a new class of agentless monitoring applications via three interfaces with two different consistency models: cloud-inotify (strong consistency), /cloud (eventual consistency), and /cloud-history (strong consistency). cloud-inotify is a publish-subscribe interface to cloud-wide file-level updates, and it supports event-based monitoring applications. /cloud is designed to support batch-based and legacy monitoring applications by providing a file system interface to cloud-wide file-level state. /cloud-history is designed to support efficient search and management of historical virtual disk state. It leverages new fast-to-access archival storage systems, and achieves tractable indexing of file-level history via whole-file deduplication using a novel application of an incremental hashing construction.
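The abstract does not say which incremental hashing construction is used, so the following is only a minimal sketch of the general idea, assuming an additive hash over fixed-size blocks. The names `block_term`, `file_hash`, and `update_hash`, the 4096-byte block size, and the 256-bit modulus are all hypothetical choices for illustration, not the dissertation's design. The point it shows is that when one block of a file changes, the whole-file hash can be updated from the old and new block alone, without rereading the rest of the file.

```python
import hashlib

# Assumption: toy modulus for illustration only. A secure additive hash
# needs a far larger modulus than 256 bits.
MODULUS = 2 ** 256


def block_term(index: int, block: bytes) -> int:
    """Hash one block together with its position in the file."""
    digest = hashlib.sha256(index.to_bytes(8, "big") + block).digest()
    return int.from_bytes(digest, "big")


def file_hash(blocks: list[bytes]) -> int:
    """Whole-file hash: the sum of all per-block terms, modulo MODULUS."""
    total = 0
    for index, block in enumerate(blocks):
        total = (total + block_term(index, block)) % MODULUS
    return total


def update_hash(current: int, index: int, old_block: bytes, new_block: bytes) -> int:
    """Update the whole-file hash after a single block is overwritten.

    Only the old and new contents of the changed block are needed.
    """
    return (current - block_term(index, old_block) + block_term(index, new_block)) % MODULUS


if __name__ == "__main__":
    blocks = [b"a" * 4096, b"b" * 4096, b"c" * 4096]
    before = file_hash(blocks)

    # Overwrite block 1 and update the hash incrementally.
    new_block = b"z" * 4096
    after = update_hash(before, 1, blocks[1], new_block)

    # The incremental result matches a full recomputation.
    assert after == file_hash([blocks[0], new_block, blocks[2]])
```

Two files with the same whole-file hash can then be treated as deduplication candidates. This sketch ignores changes in file length (appends and truncations), which a real system would need to handle by adding or removing block terms.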
