An Instant Messaging Intrusion Detection System Framework: Using character frequency analysis for authorship identification and validation

Instant messaging (IM) is a well-established means of fast and effective communication. However, a framework for the analysis of instant messaging has gone largely unexplored. This paper explores instant messaging authorship identification and validation in terms of an author profiling framework and an anomaly-based intrusion detection system (IDS). The framework includes author behavior categories, which are sets of characteristics that remain relatively constant across a large number of messages written by the same author. Specific topics include user pattern analysis, user profiling, categorization, computational linguistics, data mining, and anomaly detection. The first set of experiments applies character frequency analysis to IM messages for authorship identification and validation. These experiments address four questions: Can we identify the author of an IM conversation based strictly on user behavior? Do different conversations with a single user look similar? Do conversations with different users look different? What is the demarcation between similar and different? A further experiment applies an instance-based learning algorithm to the character frequencies of IM user messages for authorship identification and validation. It uses the nearest-neighbor classification method to classify messages, and it calculates a degree of confidence to validate the identity of the IM user.
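To make the approach concrete, the following is a minimal sketch of character frequency analysis combined with nearest-neighbor classification. It is an illustration under stated assumptions, not the implementation used in the experiments: the 26-letter alphabet, the Euclidean distance, the function names, and in particular the confidence measure (one minus the ratio of the nearest author's distance to the runner-up author's distance) are all hypothetical choices made for this example.

```python
import math
from collections import Counter

# Assumption: profile only lowercase Latin letters; the paper's actual
# character set is not stated in the abstract.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def char_frequency(text, alphabet=ALPHABET):
    """Return the relative frequency of each alphabet character in text."""
    counts = Counter(text.lower())
    total = sum(counts[c] for c in alphabet)
    if total == 0:
        return [0.0] * len(alphabet)
    return [counts[c] / total for c in alphabet]


def euclidean(a, b):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_author(message, training, alphabet=ALPHABET):
    """Classify a message by its nearest labeled training message.

    training is a list of (author, text) pairs. Returns (author, confidence).
    The confidence score is a hypothetical measure chosen for this sketch:
    1 - d_nearest / d_runner_up, where the runner-up is the closest message
    written by any other author.
    """
    query = char_frequency(message, alphabet)
    best = {}  # author -> smallest distance to any of that author's messages
    for author, text in training:
        d = euclidean(query, char_frequency(text, alphabet))
        if author not in best or d < best[author]:
            best[author] = d
    ranked = sorted(best.items(), key=lambda item: item[1])
    winner, d1 = ranked[0]
    if len(ranked) < 2 or ranked[1][1] == 0:
        return winner, 0.0  # no runner-up to compare against
    return winner, 1.0 - d1 / ranked[1][1]


training = [
    ("alice", "zzzz zzz zebra pizza"),
    ("bob", "hello there how are you"),
]
author, confidence = nearest_author("pizza zzz", training)
print(author, round(confidence, 3))
```

In an anomaly-based IDS setting, a low confidence value for the claimed author could be treated as a signal that the message deserves further scrutiny; where to place that threshold is exactly the "demarcation between similar and different" question raised above.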