Identifying and characterizing public science-related fears from RSS feeds

Public mistrust of scientists and the politicization of science policy, for example over stem cell research and genetically modified food, are features of modern democracies. While the extent of this mistrust is debatable, its political influence is tangible. Science policy researchers and policy makers therefore need early warning of issues that resonate with a wide public so that they can make timely and informed decisions. This article describes a semi-automatic method for identifying significant public science-related concerns from a corpus of Internet-based RSS (Really Simple Syndication) feeds and shows that it improves on a previous similar system because it introduces feed-based aggregation. In addition, both the RSS corpus and the concept of public science-related fears are deconstructed, revealing hidden complexity. The article also provides evidence that genetically modified organisms and stem cell research were the two major policy-relevant science concerns, although mobile phone radiation and software security also generated significant interest.
