Characterizing the Response Space of Questions: a Corpus Study for English and Polish

The main aim of this paper is to characterize the response space for questions using a taxonomy grounded in a dialogical formal semantics. As a starting point we take the typology of responses that are themselves questions, provided in (Lupkowski and Ginzburg, 2016). That work develops a wide-coverage taxonomy for question/question sequences observable in corpora including the BNC, CHILDES, and BEE, together with formal modelling of all the postulated classes. Our aim is to extend this work to cover all responses to questions. We present an extended typology of responses to questions based on corpus studies of the BNC, BEE, and Maptask, which include 506, 262, and 467 question/response pairs, respectively. We compare the English data with Polish data from the Spokes corpus (205 question/response pairs). We discuss annotation reliability and analyse annotator disagreements. We sketch how each class can be formalized using a dialogical semantics appropriate for dialogue management.

[1] Jean Carletta et al. Assessing Agreement on Classification Tasks: The Kappa Statistic, 1996, CL.

[2] Klaus Krippendorff et al. Agreement and Information in the Reliability of Coding, 2011.

[3] Alex Lascarides et al. Logics of Conversation, 2005, Studies in natural language processing.

[4] Piotr Pęzik et al. Spokes - a search and exploration service for conversational corpus data, 2015.

[5] Jonathan Ginzburg et al. A corpus-based taxonomy of question responses, 2013, IWCS.

[6] Robin Cooper et al. Clarification, Ellipsis, and the Nature of Contextual Updates in Dialogue, 2004.

[7] Staffan Larsson et al. Towards KoS/TTR-based proof-theoretic dialogue management, 2018.

[8] Jonathan Ginzburg et al. Interrogative Investigations: The Form, Meaning, and Use of English Interrogatives, 2001.

[9] Brian MacWhinney et al. The CHILDES Project: Tools for Analyzing Talk (third edition): Volume I: Transcription format and programs, Volume II: The database, 2000, Computational Linguistics.

[10] T. Stivers et al. A coding scheme for question-response sequences in conversation, 2010.

[11] Jonathan Ginzburg et al. Query responses, 2016, J. Lang. Model.

[12] B. MacWhinney. The CHILDES project: tools for analyzing talk, 1992.

[13] Gaël Varoquaux et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.

[14] Pierre Lison et al. An Active Learning Approach to the Classification of Non-Sentential Utterances, 2015.

[15] T. Gonen et al. Questions, 1927, Journal of Family Planning and Reproductive Health Care.

[16] Kyung-Eun Yoon. Questions and responses in Korean conversation, 2010.

[17] Theo A. F. Kuipers et al. An erotetic approach to explanation by specification, 1994.

[18] Jonathan Ginzburg et al. The interactive stance: meaning for conversation, 2012.

[19] Edgar Onea et al. Potential Questions at the Semantics-Pragmatics Interface, 2016.

[20] Massimo Poesio et al. (Prolegomena to a theory of) Completions, Continuations, and Coordination in Dialogue, 2005.

[21] Jacqueline C. Kowtko et al. Data Collection and Analysis in the Air Travel Planning Domain, 1989, HLT.

[22] S. Levinson et al. Question-response sequences in conversation across ten languages: An introduction, 2010.

[23] T. Stivers et al. An overview of the question-response system in American English conversation, 2010.

[24] A. Graesser et al. Mechanisms that generate questions, 1992.

[25] R Core Team et al. R: A language and environment for statistical computing, 2014.

[26] A. M. Turing et al. Computing Machinery and Intelligence, 1950, The Philosophy of Artificial Intelligence.

[27] Andrzej Gajda et al. Erotetic Reasoning Corpus. A data set for research on natural question processing, 2018, J. Lang. Model.

[28] Jonathan Ginzburg et al. Classifying Ellipsis in Dialogue: A Machine Learning Approach, 2004, COLING.

[29] C. Garvey et al. Relevant replies to questions: Answers versus evasions, 1981.

[30] Bob Carpenter et al. The Benefits of a Model of Annotation, 2013, Transactions of the Association for Computational Linguistics.

[31] A. Lascarides et al. Questions in Dialogue, 1998.

[32] Matthew Purver et al. CLARIE: Handling Clarification Requests in a Dialogue System, 2006.