Sequence Package Analysis and Soft Computing: Introducing a New Hybrid Method to Adjust to the Fluid and Dynamic Nature of Human Speech

At Linguistic Technology Systems, we are using Sequence Package Analysis (SPA) to architect a new, pragmatically based part-of-speech (POS) tagging program that better conforms to the fluidity and dynamism of human speech. This would allow natural language-driven voice user interfaces and audio mining programs, for use in both commercial and government applications, to adapt to the in situ construction of dialog, which is marked by the imprecision, ambiguity, and vagueness of real-world communication. Conventional POS tagging programs consist of parsing structures derived from syntactic (and semantic) analysis, yet speech system developers and users are well aware that recognition difficulties still plague conventional spoken dialog systems. The reason is that such programs cannot adequately address the inexactitude, vagueness, and uncertainty inherent in the dynamic and fluid nature of real-world human dialog: a sudden surge of anger or frustration, for example, may transform a simple question into a rhetorical one, or a straightforward assessment into a gratuitous or sardonic remark. The human mind, for the most part, appears to have little difficulty following the unpredictable ebb and flow of dialog with remarkable accuracy and comprehension, so that business transactions and social acts are completed with a fair amount of regularity and predictability in everyday life. Why, then, can't we design spoken dialog systems to emulate the human mind?
To do this, we must first uncover the special formulae that humans regularly invoke to understand human-to-human dialog, which by virtue of its fluid and dynamic constitution is often punctuated by ambiguities, obscurities, repetitions, ellipses, and deixes (indirect referents). These are the same stubborn features of natural language that individually and collectively impede the performance of speech systems. SPA uses a unique set of parsing structures, consisting of context-free grammatical units with notations for related prosodic features, to capture the fluid and dynamic nature of human speech. In doing so, it meets the goal of soft computing: to exploit the tolerance for imprecision, uncertainty, obscurity, and approximation in order to achieve tractability, robustness, and low solution cost. As a hybrid method that uniquely combines conversation analysis with computational linguistics, SPA is complementary to artificial neural networks and fuzzy logic. In building a flexible and adaptable natural language speech interface, neural networks (connectionist models) may be viewed as the natural choice for investigating the patterns underlying the orderliness of talk, because they can handle the ambiguities of natural language: when confronted with incomplete or somewhat conflicting information, they are able to produce a fuzzy set.
