Informing the Curious Negotiator: Automatic News Extraction from the Internet

Information acquisition and validation play an important role in decision making during negotiation. In this chapter we briefly present the framework of a smart data mining system that provides contextual information extracted from the Internet to a negotiation agent. We then present one of its components in more detail: an effective automated technique for extracting relevant articles from news web sites, so that they can be used further by the mining agents. Most current techniques have difficulty coping with changes in web site structure and format. The proposed extraction process is completely automatic and independent of web site formats. The proposed technique identifies regularities in both the format and the content of news web sites. The algorithms are applicable to both single- and multi-document web sites. Since invalid URLs can cause errors in data extraction, we also present a method for the negotiation agent to estimate the validity of the extracted data based on the frequency of the relevant words in the news title. Once the news articles are extracted, the next task is to construct sets of articles on given topics, and this chapter presents a new procedure for constructing such news data sets. The extracted news data set is then utilised by the parties involved in negotiation. The information retrieved from the data set can support both human and automated negotiators.
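The validity estimate based on title words can be illustrated with a minimal sketch. This is not the chapter's actual method, which the abstract does not specify; it assumes that "relevant words" are the title's words after stop-word removal, and that validity is the share of those words that occur in the extracted text. The names `STOP_WORDS`, `tokenize`, `title_validity` and the `min_count` parameter are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list (an assumption); a real system would use a fuller one.
STOP_WORDS = frozenset(
    {"a", "an", "and", "at", "for", "in", "of", "on", "the", "to", "with"}
)


def tokenize(text):
    """Lower-case the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def title_validity(title, body, min_count=1):
    """Return the share of relevant title words that occur at least
    min_count times in the extracted body text (a value in [0, 1])."""
    relevant = {w for w in tokenize(title) if w not in STOP_WORDS}
    if not relevant:
        # No usable title words: treat the extraction as unverifiable.
        return 0.0
    counts = Counter(tokenize(body))
    hits = sum(1 for w in relevant if counts[w] >= min_count)
    return hits / len(relevant)
```

Under these assumptions, an article whose body repeats its title words scores close to 1, while an error page returned for an invalid URL shares few or no words with the expected title and scores close to 0. A negotiation agent could discard extractions whose score falls below a chosen threshold.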
