Having Fun With 31.521 Shell Scripts

Statically parsing shell scripts is a challenge because of various peculiarities of the shell language. One of the difficulties is that the shell language is designed to be executed by interleaving the reading of chunks of syntax with semantic actions. We have analyzed a corpus of 31.521 POSIX shell scripts occurring as maintainer scripts in the Debian GNU/Linux distribution. Our parser, which makes use of recent developments in parser generation technology, succeeds on 99.9% of the corpus. The architecture of our tool allows us to easily plug in various statistical analyzers that operate on the syntax trees constructed from the shell scripts. The statistics obtained by our tool are the basis for the definition of a model that we plan to use in the future for the formal verification of scripts.
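The interleaving of reading and execution can be seen in a small experiment. The sketch below is illustrative only and is not taken from the analyzed corpus; the three-line script and the variable names `script`, `output`, and `status` are hypothetical. It assumes `sh` is a POSIX-style shell such as dash or bash, which reads one complete command, executes it, and only then reads the next.

```shell
# Hypothetical three-line script: the last line is deliberately malformed.
script='echo "ran before the parser saw the broken line"
exit 0
if then ( this line is not valid shell'

# Feed the script to sh on standard input and capture stdout and stderr.
# The shell executes each command as soon as it has read it, so "exit 0"
# ends the run before the malformed third line is ever parsed.
output=$(printf '%s\n' "$script" | sh 2>&1)
status=$?

printf 'status=%s output=%s\n' "$status" "$output"
```

A static parser has no such escape hatch: it must handle the whole file, including parts that the shell itself would never reach at run time.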
