Natural language processing for information assurance and security: an overview and implementations

This research paper explores a promising interface between natural language processing (NLP) and information assurance and security (IAS). More specifically, it is devoted to possible applications of the considerable accumulated resources in NLP to IAS, and to the further dedicated development of those resources for IAS. The expected, and partially accomplished, result is to harness the weird, illogical ways natural languages encode meaning, the very ways that defy all the usual combinatorial approaches to mathematical--and computational--complexity and make NLP so hard, in order to enhance information security. The paper is of a mixed theoretical and empirical nature. Of the four possible venues of application, (i) memorizing randomly generated passwords with the help of automatically generated funny jingles, (ii) natural language watermarking, (iii) using the available machine translation (MT) systems for (additional) encryption of text messages, and (iv) downgrading, or sanitizing, classified information in networks, two venues, (i) and (iv), have been at least partially implemented, and the remaining two, (ii) and (iii), are being implemented to the proof-of-concept level. We must make it very clear, however, that we have done very little experimentation or evaluation at this point, though we are moving quickly in that direction. The merits of the paper, if any, are in its venture to make the considerable progress achieved recently in NLP, especially in knowledge representation and meaning analysis, useful for IAS needs. The NLP approach adopted here, ontological semantics, has been developed by two of the coauthors; watermarking is based on the pioneering research by another coauthor and his associates; most of the implementation of the password memorization software has been done by the fourth coauthor. All four of us have agonized over whether we should report this research now or wait until we have fully implemented all or at least some of the systems we are developing.
At the end of the day, we have reached a consensus that it is important, even at this early stage, to review for the information security community what NLP can do for it and to invite feedback and further efforts and ideas on what seems likely to become a new paradigm in information security. To the body of the paper, we have added two self-contained, deliberately reference-free appendices on NLP and ontological semantics, respectively, primarily for the benefit of those IAS readers who are interested in expanding their understanding of those fields and further exploring their possible fruitful interactions with IAS.

Mikhail J. Atallah, Craig J. McDonough, Victor Raskin
Center for Education and Research in Information Assurance and Security (CERIAS, www.cerias.purdue.edu), Purdue University, W. Lafayette, IN 47907
mja, raskin, mcdonoug@cerias.purdue.edu

Sergei Nirenburg
Computing Research Laboratory, New Mexico State University, Las Cruces, NM 88003
sergei@crl.nmsu.edu
