An empirical investigation of command-line customization

The interactive command line, also known as the shell, is used extensively by a wide range of software professionals (engineers, system administrators, data scientists, etc.). Shell customizations can therefore provide insight into the tasks these users repeatedly perform, how well the standard environment supports those tasks, and ways in which the environment could be productively extended or modified. To characterize the patterns and complexities of command-line customization, we mined the collective knowledge of command-line users by analyzing more than 2.2 million shell alias definitions found on GitHub. Shell aliases allow command-line users to customize their environment by defining arbitrarily complex command substitutions. Using inductive coding methods, we found three types of aliases, each enabling a number of customization practices: Shortcuts (for nicknaming commands, abbreviating subcommands, and bookmarking locations), Modifications (for substituting commands, overriding defaults, colorizing output, and elevating privilege), and Scripts (for transforming data and chaining subcommands). We conjecture that identifying common customization practices can point to particular usability issues within command-line programs, and that a deeper understanding of these practices can support researchers and tool developers in designing better user experiences. In addition to our analysis, we provide an extensive reproducibility package consisting of a curated dataset and well-documented computational notebooks, which enable further knowledge discovery and provide a basis for learning-based approaches to improving command-line workflows.
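
To make the three alias types above concrete, here is a minimal Python sketch that extracts simple `alias name=value` lines from dotfile text and assigns each a rough category. It is an illustration only, not the method used in the study, which relied on inductive coding. The names `ALIAS_RE`, `parse_aliases`, `rough_category`, and `SAMPLE`, as well as the keyword heuristic itself, are assumptions introduced here.

```python
import re
import shlex

# Hypothetical pattern for simple one-line definitions such as: alias gs='git status'
ALIAS_RE = re.compile(r"^\s*alias\s+([^\s=]+)=(.+)$")


def parse_aliases(text):
    """Return {name: value} for each simple `alias name=value` line in text."""
    aliases = {}
    for line in text.splitlines():
        match = ALIAS_RE.match(line)
        if not match:
            continue
        name, raw = match.groups()
        try:
            # shlex strips the shell quoting and any trailing comment.
            parts = shlex.split(raw, comments=True)
        except ValueError:
            continue  # skip lines with unbalanced quotes
        if parts:
            aliases[name] = parts[0]
    return aliases


def rough_category(name, value):
    """Crude heuristic (an assumption, not the paper's coding scheme)."""
    # Pipes and command separators suggest data transformation or chaining.
    if any(op in value for op in ("|", "&&", ";")):
        return "script"
    words = value.split()
    # Re-defining a command under its own name, or prefixing sudo,
    # suggests overriding defaults or elevating privilege.
    if words and words[0] in (name, "sudo"):
        return "modification"
    # Everything else is treated as a nickname, abbreviation, or bookmark.
    return "shortcut"


# Made-up sample input, not taken from the study's dataset.
SAMPLE = """\
alias gs='git status'
alias ls='ls --color=auto'
alias proj='cd ~/projects'
alias count='ls | wc -l'
export EDITOR=vim
"""

for alias_name, alias_value in parse_aliases(SAMPLE).items():
    print(alias_name, "->", rough_category(alias_name, alias_value))
```

A heuristic this simple will misclassify many real aliases (for example, multi-line definitions or aliases that wrap shell functions); it is meant only to show what a Shortcut, a Modification, and a Script can look like in practice.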
