Paper Information - Stochastic-Model-Driven Adaptation and Recovery in Distributed Systems - 字舞流文

Stochastic-Model-Driven Adaptation and Recovery in Distributed Systems

Kaustubh Joshi | K. Joshi

[1] Karsten Schwan,et al. E2EProf: Automated End-to-End Performance Management for Enterprise Systems , 2007, 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'07).

[2] George E. Monahan,et al. A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms , 2007 .

[3] Matti A. Hiltunen,et al. Adaptive Distributed and Fault-Tolerant Systems , 2007 .

[4] Sang Hyuk Son,et al. Feedback Control Architecture and Design Methodology for Service Delay Guarantees in Web Servers , 2006, IEEE Transactions on Parallel and Distributed Systems.

[5] Marcos K. Aguilera,et al. WAP5: black-box performance debugging for wide-area systems , 2006, WWW '06.

[6] Architecture-based autonomous repair management: an application to J2EE clusters , 2005, 24th IEEE Symposium on Reliable Distributed Systems (SRDS'05).

[7] Yuanyuan Zhou,et al. Rx: treating bugs as allergies - a safe method to survive software failures , 2005, SOSP '05.

[8] Albert G. Greenberg,et al. IP fault localization via risk modeling , 2005, NSDI.

[9] Michael L. Littman,et al. An Instance-Based State Representation for Network Repair , 2004, AAAI.

[10] George Candea,et al. Microreboot - A Technique for Cheap Recovery , 2004, OSDI.

[11] Mike Chen,et al. Failure diagnosis using decision trees , 2004, International Conference on Autonomic Computing, 2004. Proceedings..

[12] Sui Ruan,et al. On multi-mode test sequencing problem , 2003, Proceedings AUTOTESTCON 2003. IEEE Systems Readiness Technology Conference..

[13] Marcos K. Aguilera,et al. Performance debugging for distributed systems of black boxes , 2003, SOSP '03.

[14] Joelle Pineau,et al. Point-based value iteration: An anytime algorithm for POMDPs , 2003, IJCAI.

[15] Rittwik Jana,et al. iMobile EE – An Enterprise Mobile Service Platform , 2003, Wirel. Networks.

[16] Anne Condon,et al. On the undecidability of probabilistic planning and related stochastic optimization problems , 2003, Artif. Intell..

[17] George Candea,et al. JAGR: an autonomous self-recovering application server , 2003, 2003 Autonomic Computing Workshop.

[18] George Candea,et al. Automatic failure-path inference: a generic introspection technique for Internet applications , 2003, Proceedings the Third IEEE Workshop on Internet Applications. WIAPP 2003.

[19] Archana Ganapathi,et al. Why Do Internet Services Fail, and What Can Be Done About It? , 2002, USENIX Symposium on Internet Technologies and Systems.

[20] Willy Zwaenepoel,et al. Performance and scalability of EJB applications , 2002, OOPSLA '02.

[21] Chenyang Lu,et al. An adaptive control framework for QoS guarantees and its application to differentiated caching , 2002, IEEE 2002 Tenth IEEE International Workshop on Quality of Service (Cat. No.02EX564).

[22] Chenyang Lu,et al. ControlWare: a middleware architecture for feedback control of software performance , 2002, Proceedings 22nd International Conference on Distributed Computing Systems.

[23] George Candea,et al. Reducing recovery time in a small recursively restartable system , 2002, Proceedings International Conference on Dependable Systems and Networks.

[24] Noah Treuhaft,et al. ROC-1: Hardware Support for Recovery-Oriented Computing , 2002, IEEE Trans. Computers.

[25] K. Shin,et al. Performance Guarantees for Web Server End-Systems: A Control-Theoretical Approach , 2002, IEEE Trans. Parallel Distributed Syst..

[26] Aaron B. Brown,et al. An active approach to characterizing dynamic dependencies for problem determination in a distributed environment , 2001, 2001 IEEE/IFIP International Symposium on Integrated Network Management Proceedings. Integrated Network Management VII. Integrated Management Strategies for the New Millennium (Cat. No.01EX470).

[27] Joseph L. Hellerstein,et al. Using Control Theory to Achieve Service Level Objectives In Performance Management , 2001, 2001 IEEE/IFIP International Symposium on Integrated Network Management Proceedings. Integrated Network Management VII. Integrated Management Strategies for the New Millennium (Cat. No.01EX470).

[28] Tarek F. Abdelzaher,et al. Differentiated caching services; a control-theoretical approach , 2001, Proceedings 21st International Conference on Distributed Computing Systems.

[29] William LeFebvre,et al. CNN.com: Facing a World Crisis , 2001, LiSA.

[30] Milos Hauskrecht,et al. Value-Function Approximations for Partially Observable Markov Decision Processes , 2000, J. Artif. Intell. Res..

[31] Klara Nahrstedt,et al. A control-based middleware framework for quality-of-service adaptations , 1999, IEEE J. Sel. Areas Commun..

[32] Miguel Oom Temudo de Castro,et al. Practical Byzantine fault tolerance , 1999, OSDI '99.

[33] Calton Pu,et al. A feedback-driven proportion allocator for real-rate scheduling , 1999, OSDI '99.

[34] Priya Narasimhan,et al. Transparent fault tolerance for CORBA , 1999 .

[35] William H. Sanders,et al. AQuA: an adaptive architecture that provides dependable distributed objects , 1998, Proceedings Seventeenth IEEE Symposium on Reliable Distributed Systems (Cat. No.98CB36281).

[36] Calton Pu,et al. SWiFT: a feedback control and dynamic reconfiguration toolkit , 1998 .

[37] Saurabh Bagchi,et al. Chameleon: a software infrastructure for adaptive fault tolerance , 1998, Proceedings. IEEE International Computer Performance and Dependability Symposium. IPDS'98 (Cat. No.98TB100248).

[38] Hermann de Meer,et al. Controlled Stochastic Petri Nets , 1997, SRDS.

[39] Richard Washington,et al. BI-POMDP: Bounded, Incremental, Partially-Observable Markov-Model Planning , 1997, ECP.

[40] Milos Hauskrecht,et al. Incremental Methods for Computing Bounds in Partially Observable Markov Decision Processes , 1997, AAAI/IAAI.

[41] Algirdas Avizienis,et al. Toward Systematic Design of Fault-Tolerant Systems , 1997, Computer.

[42] Jean-Claude Laprie,et al. Dependable computing: concepts, limits, challenges , 1995 .

[43] Yennun Huang,et al. A software fault tolerance platform , 1995 .

[44] Philip Heidelberger,et al. Fast simulation of rare events in queueing and reliability models , 1993, TOMC.

[45] D. Powell,et al. The Delta-4 Approach to Dependability in Open Distributed Computing Systems , 1995, Twenty-Fifth International Symposium on Fault-Tolerant Computing, 1995, 'Highlights from Twenty-Five Years'..

[46] Leslie Pack Kaelbling,et al. Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.

[47] Kishor S. Trivedi,et al. Guarded Repair of Dependable Systems , 1994, Theor. Comput. Sci..

[48] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[49] Andrzej Pelc,et al. Diagnosis and Repair in Multiprocessor Systems , 1993, IEEE Trans. Computers.

[50] Lonnie Chrisman,et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach , 1992, AAAI.

[51] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[52] Kang G. Shin,et al. Optimal Dynamic Control of Resources in a Distributed System , 1989, IEEE Transactions on Software Engineering.

[53] Jeffrey D. Case,et al. Simple Network Management Protocol (SNMP) , 1989, RFC.

[54] Sape Mullender,et al. Distributed systems , 1989 .

[55] Jim Gray,et al. Why Do Computers Stop and What Can Be Done About It? , 1986, Symposium on Reliability in Distributed Software and Database Systems.

[56] Nancy A. Lynch,et al. Impossibility of distributed consensus with one faulty process , 1985, JACM.

[57] Richard D. Schlichting,et al. Fail-stop processors: an approach to designing fault-tolerant computing systems , 1983, TOCS.

[58] Leslie Lamport,et al. The Byzantine Generals Problem , 1982, TOPL.

[59] Karl N. Levitt,et al. The design, analysis, and verification of the SIFT fault tolerant system , 1976, ICSE '76.

[60] Edward J. Sondik,et al. The optimal control of partially observable Markov processes , 1971 .

[61] W. C. Carter,et al. Reliability modeling techniques for self-repairing computer systems , 1969, ACM '69.

[62] Gernot Metze,et al. On the Connection Assignment Problem of Diagnosable Systems , 1967, IEEE Trans. Electron. Comput..

[63] Algirdas Avizienis,et al. Design of fault-tolerant computers , 1967, AFIPS '67 (Fall).