4CeeD: Real-Time Data Acquisition and Analysis Framework for Material-Related Cyber-Physical Environments

In this paper, we present 4CeeD, a data acquisition and analysis framework for materials-to-devices processes. 4CeeD focuses on the immense potential of capturing, accurately curating, correlating, and coordinating materials-to-devices digital data in a real-time and trusted manner before fully archiving and publishing them for wide access and sharing. In particular, 4CeeD consists of two novel services: a curation service that collects data from microscopes and fabrication instruments, curates them, and wraps them with extensive metadata in real time and in a trusted manner; and a cloud-based coordination service that stores data, extracts metadata, and analyzes and finds correlations among the data. Our evaluation results show that our novel cloud framework can help researchers significantly reduce the time and cost spent on experiments, and that it efficiently handles high-volume, fast-changing workloads of heterogeneous types of experimental data.
