A language-independent static checking system for coding conventions

Despite decades of research aiming to ameliorate the difficulties of creating software, programming remains an error-prone task. Much work in Computer Science deals with the problem of specification, or writing the right program, rather than the complementary problem of implementation, or writing the program right. However, many desirable software properties (such as portability) are obtained via adherence to coding standards, and therefore fall outside the remit of formal specification and automatic verification. Moreover, code inspections and manual detection of standards violations are time-consuming. To address these issues, this thesis describes Exstatic, a novel framework for the static detection of coding standards violations. Unlike many other static checkers, Exstatic can be used to examine source text in a variety of languages, including program code, in-line documentation, markup languages and so on. This means that the checkable coding standards adhered to by a particular project or institution can be handled by a single tool. Consequently, a major challenge in the design of Exstatic has been to invent a way of representing code from a variety of source languages. To meet this challenge, this thesis describes ICODE, an intermediate language suitable for representing code from a number of different programming paradigms. To substantiate the claim that ICODE is a universal intermediate language, a proof strategy has been developed: for each of a number of different programming paradigms (imperative, declarative, etc.), a proof is constructed to show that a semantics-preserving translation exists from an exemplar language (such as IMP or PCF) to ICODE. The usefulness of Exstatic has been demonstrated by the implementation of a number of static analysers for different languages.
These include a checker for technical documentation written in Javadoc, which validates documents against the Sun Microsystems (now Oracle) Coding Conventions, and a checker that validates HTML pages against a site-specific standard. A third system targets python-csp, a variant of the Python language written by the author and based on Hoare’s Communicating Sequential Processes.
