Leveraging Crowdsourced Technical Documentation: Building a Command Thesaurus

Since its inception, the Internet has enabled motivated members of an application’s user base to compose and self-publish technical documentation, manuals, and tutorials. These distributed acts of self-publishing can be thought of as implicit crowdsourcing of technical support. In this paper, we leverage user-generated documentation to construct what we call a “command thesaurus”. A command thesaurus groups together semantically related words, bridging the gap between the vocabulary of users and the (sometimes highly technical) terminology employed by software applications. We outline one potential approach for automatically generating a command thesaurus, and we present initial experiments suggesting that the approach is feasible. We conclude by describing applications of these newly generated resources. In particular, command thesauri may find use in search-driven interfaces and in tools that translate tutorials from one application to another.