Using Natural Language Processing to Assess Text Usefulness to Readers: The Case of Conference Calls and Earnings Prediction

We examine whether support vector regressions (SVR), supervised LDA (sLDA), random forest regression trees (RF), and ‘tone’ extract narrative content from conference calls that correlates with useful information that a human reader would identify. We find that each narrative-content measure (along with a composite measure) explains a portion of analyst-forecast revisions for quarter q+1 issued after the conference call in quarter q. Correlation with analyst-forecast revisions improves when the composite measure adapts to context (positive/negative returns; high variance/low variance) and ignores sparse words. The correlation is comparable and incremental to that of financial signals (cash-flow changes, earnings surprises, and management forecasts), which suggests that the narrative content of conference calls as extracted by readers is economically significant. Our results suggest that models of narrative content have reasonable construct validity and that this validity is likely to be improved by further thought on the unique characteristics of text.
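To make two of the ideas above concrete, the following is a minimal sketch of a dictionary-based ‘tone’ score that can optionally ignore sparse words. It is an illustration under stated assumptions, not the authors' implementation: the word lists `POSITIVE` and `NEGATIVE`, the helper names, the `min_doc_freq` threshold, and the normalization (positive minus negative counts, divided by the number of counted words) are all hypothetical choices made for this example.

```python
from collections import Counter

# Hypothetical mini word lists, for illustration only; these are NOT
# the dictionaries used in the paper.
POSITIVE = {"growth", "strong", "improved", "record"}
NEGATIVE = {"decline", "weak", "loss", "impairment"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(".,;:!?()\"'") for w in text.lower().split())
    return [t for t in tokens if t]


def frequent_vocabulary(documents, min_doc_freq=2):
    """Return the words that appear in at least `min_doc_freq` documents."""
    doc_freq = Counter()
    for doc in documents:
        # Count each word once per document (document frequency).
        doc_freq.update(set(tokenize(doc)))
    return {w for w, n in doc_freq.items() if n >= min_doc_freq}


def tone(text, vocabulary=None):
    """(positive - negative) / counted words; 0.0 when nothing is counted.

    If `vocabulary` is given, words outside it (the sparse words) are
    dropped before scoring.
    """
    tokens = tokenize(text)
    if vocabulary is not None:
        tokens = [t for t in tokens if t in vocabulary]
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


# Toy "conference call" snippets (made up for this example).
calls = [
    "Strong growth this quarter.",
    "Weak demand and a loss this quarter.",
    "Growth improved despite a loss.",
]
vocab = frequent_vocabulary(calls, min_doc_freq=2)
scores = [tone(c, vocab) for c in calls]
```

In this toy setup, words that occur in only one snippet are excluded from both the numerator and the denominator, so the score depends only on vocabulary shared across documents. A real application would choose the threshold and word lists empirically.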
