Massivizing Computer Systems: A Vision to Understand, Design, and Engineer Computer Ecosystems Through and Beyond Modern Distributed Systems

Our society is digital: industry, science, governance, and individuals depend, often transparently, on the inter-operation of large numbers of distributed computer systems. Although society takes them almost for granted, these computer ecosystems are not available to all, may not be affordable for long, and raise numerous other research challenges. Inspired by these challenges and by our experience with distributed computer systems, we envision Massivizing Computer Systems, a domain of computer science focused on understanding, controlling, and successfully evolving such ecosystems. Beyond establishing and growing a body of knowledge about computer ecosystems and their constituent systems, the community in this domain should also aim to educate many people about design and engineering for this domain, and all people about its principles. This is a call to the entire community: there is much to discover and achieve.
