Information security applications of natural language processing techniques

In this thesis we investigate applications of natural language processing (NLP) techniques to information security problems. We present results in two areas: password authentication and information hiding in natural language text. We limit the thesis to the realm of language engineering; that is, our emphasis is on adapting existing NLP techniques to our purposes rather than on developing new ones. Our password mnemonics system helps users remember random passwords, making it practical to enforce organizational policies that mandate strong password choices. Moreover, in our system a password change does not require a new mnemonic, which further eases the user's task of memorizing the mnemonic. Our robust natural language text watermarking system resists removal of the watermark by an automated adversary, in the same way that the authentication system resists an automated adversary's attempts to compromise the password string hidden within the password mnemonic. We have also laid the groundwork for follow-up research in this area.
