Online Fault and Anomaly Detection for Large-Scale Scientific Workflows
Ewa Deelman | Gideon Juve | Gaurang Mehta | Dan Gunter | Karan Vahi | Fabio Silva | Taghrid Samak | Monte Goode
[1] Ian J. Taylor, et al. Workflows and e-Science: An overview of workflow system features and capabilities, 2009, Future Gener. Comput. Syst.
[2] Daniel A. Reed, et al. Analysis of application heartbeats: Learning structural and temporal features in time series data for identification of performance problems, 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[3] J. MacQueen. Some methods for classification and analysis of multivariate observations, 1967.
[4] Akinori Yonezawa, et al. ParaTrac: a fine-grained profiler for data-intensive workflows, 2010, HPDC '10.
[5] Brian Tierney, et al. Scalable Analysis of Distributed Workflow Traces, 2005, PDPTA.
[6] Cheng-Zhong Xu, et al. Exploring event correlation for failure prediction in coalitions of clusters, 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[7] Yang Zhang, et al. Combined Fault Tolerance and Scheduling Techniques for Workflow Applications on Computational Grids, 2009, 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid.
[8] Junwei Cao, et al. A Case Study on the Use of Workflow Technologies for Scientific Analysis: Gravitational Wave Data Analysis, 2007, Workflows for e-Science, Scientific Workflows for Grids.
[9] Michael Wilde, et al. Kickstarting remote applications, 2006.
[10] Lavanya Ramakrishnan, et al. WORKEM: Representing and Emulating Distributed Scientific Workflow Execution State, 2010, 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing.
[11] Qian Zhu, et al. Power-Aware Consolidation of Scientific Workflows in Virtualized Environments, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[12] J. A. Hartigan, et al. A k-means clustering algorithm, 1979.
[13] Brian Tierney, et al. Log summarization and anomaly detection for troubleshooting distributed systems, 2007, 2007 8th IEEE/ACM International Conference on Grid Computing.
[14] David G. Stork, et al. Pattern Classification, 1973.
[15] Daniel S. Katz, et al. Montage: a grid-enabled engine for delivering custom science-grade mosaics on demand, 2004, SPIE Astronomical Telescopes + Instrumentation.
[16] Steve Vinoski, et al. Advanced Message Queuing Protocol, 2006, IEEE Internet Computing.
[17] Allan Snavely, et al. A simulation toolkit to investigate the effects of grid characteristics on workflow completion time, 2009, WORKS '09.
[18] Thomas Fahringer, et al. Predicting the execution time of grid workflow applications through local learning, 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[19] Ran Wolff, et al. Mining for misconfigured machines in grid systems, 2006, KDD '06.
[20] David A. Cieslak, et al. Troubleshooting thousands of jobs on production grids using data mining techniques, 2008, 2008 9th IEEE/ACM International Conference on Grid Computing.
[21] Dennis Gannon, et al. Workflows for e-Science, Scientific Workflows for Grids, 2014.
[22] Chuang Liu, et al. Anomaly detection and diagnosis in grid environments, 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[23] Bianca Schroeder, et al. A Large-Scale Study of Failures in High-Performance Computing Systems, 2010, IEEE Trans. Dependable Secur. Comput.