Briefly Noted

Over the last decade or so there has been growing interest in research on computationally analyzing opinions, feelings, and subjective evaluation in text. This burgeoning body of work, variously called “sentiment analysis,” “opinion mining,” and “subjectivity analysis,” addresses such problems as distinguishing objective from subjective propositions, characterizing positive and negative evaluations, determining the sources of different opinions expressed in a document, and summarizing writers’ judgments over a large corpus of texts. Potential applications include Web mining for consumer and political opinion summarization, business and government intelligence analysis, and improving text analysis applications such as information retrieval, question answering, and text summarization. In this well-written book, Pang and Lee survey the current state of the art in opinion mining and sentiment analysis, broadly construed, with the goal of fitting this diverse research area into a unified framework. After a brief introduction to the area (Chapter 1) and a survey of application areas (Chapter 2), the authors present in Chapter 3 their view of the central challenges that unify this research area, largely by contrasting it with “traditional,” “fact-based” text analysis. The book then surveys the full range of extant approaches, dividing them into sentiment classification and extraction (Chapter 4) and opinion summarization (Chapter 5). This survey is quite thorough as regards computational work in the area, though it lacks detailed reference to relevant linguistics research, such as work on modality (Nuyts 2001; Kärkkäinen 2003), cognitive linguistics (Stein and Wright 1995; Langacker 2002), and appraisal theory (Martin and White 2005). This lacuna is justified, however, by the (perhaps unfortunate) fact that little computational work to date relates to this literature.
A distinctive and valuable feature of the book is the inclusion of material on the relationship between subjective language and its social and economic impact (Chapter 6). This discussion helps to place the technical work in its larger context, pointing towards opportunities and risks in its application in various situations. Also particularly valuable is Chapter 7, on publicly available resources, which includes much useful information about available data sets, relevant competitive evaluations, and tutorials/bibliographies in the area. Although much of this information is likely to become outdated, the authors also maintain a companion Web site which presumably will feature updates to this resource list. The book provides a useful resource for application developers as well as for researchers, though some readers might have benefited from a more extensive discussion of real-world applications and how various techniques can be used as components of larger systems. Overall, this slim and entertaining volume is an excellent and timely survey of an exciting and growing research area within computational linguistics.
Shlomo Argamon, Illinois Institute of Technology