Have You Lost the Thread? Discovering Ongoing Conversations in Scattered Dialog Blocks

Finding threads in textual dialogs is emerging as a need for better organizing stored knowledge. We capture this need by introducing the novel task of discovering ongoing conversations in scattered dialog blocks. Our aim in this article is twofold. First, we propose a publicly available testbed for the task, sidestepping the seemingly insurmountable privacy problem of Big Personal Data: we show that theatrical plays can serve as surrogates for personal dialogs. Second, we propose a suite of computationally light learning models that can use syntactic and semantic features. With this suite, we show that models for this challenging task should include features capturing shifts in language use and, possibly, should model the underlying scripts.

