A Domain-Specific Language for Application-Level Checkpointing

Checkpointing is one of the key requirements for writing fault-tolerant and flexible applications for dynamic and distributed environments like the Grid. Certain patterns recur in the implementation of the application-level Checkpointing and Restart (CaR) mechanism across a myriad of applications. These patterns indicate that a higher level of abstraction can be used to isolate the commonalities and variations in the CaR mechanism. This paper describes an approach to the design and development of a Domain-Specific Language (DSL) that abstracts the application-level CaR mechanism. Specifications written in the DSL are used to semi-automatically generate the application-specific code for the CaR mechanism. The DSL not only provides a high level of abstraction but also promotes code reuse, code correctness, and non-invasive reengineering of legacy applications to embed the CaR mechanism in them.
