Utilizing linguistically enhanced keystroke dynamics to predict typist cognition and demographics

Entering information on a computer keyboard is a ubiquitous mode of expression and communication. We investigate whether typing behavior is connected to two factors: the cognitive demands of a given task and the demographic features of the typist. We utilize features based on keystroke dynamics, stylometry, and "language production"; the last are novel hybrid features that capture the dynamics of a typist's linguistic choices. Our study takes advantage of a large data set (~350 subjects) made up of relatively short samples (~450 characters) of free text. Experiments show that these features can recognize the cognitive demands of the task that an unseen typist is engaged in, and can classify his or her demographics with better-than-chance accuracy. We correctly distinguish High vs. Low cognitively demanding tasks with accuracy up to 72.39%. Detection of non-native speakers of English is achieved with F1=0.462 over a baseline of 0.166, while detection of female typists reaches F1=0.524 over a baseline of 0.442. Recognition of left-handed typists achieves F1=0.223 over a baseline of 0.100. Further analyses reveal novel relationships between language production, as manifested through typing behavior, and both cognitive and demographic factors.

Highlights
Recognition of cognitive task from linguistic and keystroke features with accuracy of up to 72.39%.
Recognition of gender, handedness, and native language from short unconstrained text at F1=0.524, 0.223, and 0.462, respectively.
Developed novel Language Production features hybridizing keystroke dynamics and stylometry.
