Human, bot or both? A study on the capabilities of classification models on mixed accounts

As software bots that perform maintenance tasks have become more popular, several bot detection algorithms have been proposed in the literature. State-of-the-art techniques treat bot detection as a binary classification problem, where a GitHub account is classified as either a human or a bot. However, this conceptualisation of bot detection as an account-level binary classification problem fails to account for ‘mixed accounts’: accounts that are shared between a human and a bot, and that therefore exhibit both bot and human activity. By relying on binary classification models for bot detection, researchers might hence mischaracterise both human and bot behaviour in software maintenance. This calls for conceptualising bot detection as a comment-level classification problem. However, the only existing comment-level approach investigates just a small number of comments from mixed accounts. The nature of mixed accounts on GitHub thus remains largely unknown, and the absence of appropriate datasets makes this a difficult problem to study. In this paper, we investigate three comment-level classification models and evaluate the resulting classifiers on a manually labelled dataset of mixed accounts. We find that the best classifiers based on these classification models achieve a precision and recall between 88% and 96%. However, even the most accurate comment-level classifier cannot reliably detect mixed accounts; rather, we find that textual content alone, or textual content combined with the templates used by bots, are highly effective features for detecting both bot and mixed accounts. Our study calls for more accurate bot detection techniques capable of identifying mixed accounts, thereby supporting more refined insights into the software maintenance activities performed by humans and bots on social coding sites.