Enlisting Supervised Machine Learning in Mapping Scientific Uncertainty Expressed in Food Risk Analysis

Scientific uncertainty has recently drawn increased interest in both the sociology of science and policy research. To contribute to these debates and create an empirical measure of scientific uncertainty, we inductively devised, with the help of machine learning (ML), two systems of classification, or ontologies, to describe scientific uncertainty in a large corpus of food safety risk assessments. We ask three questions: (1) Can ML assist with coding complex documents, such as food safety risk assessments, on a difficult topic like scientific uncertainty? (2) Can we use ML to assess the quality of the ontologies we devised? (3) Does the quality of our ontologies depend on social factors? We found that ML in its simplest form does surprisingly well at identifying complex meanings, and that it does not benefit from adding certain types of complexity to the analysis. Our ML experiments show that in one ontology, a simple typology, semantic opposites attract each other, against expectations, and that the experiments support the taxonomic structure of the other ontology. Finally, we found some evidence that institutional factors influence how well our taxonomy of uncertainty performs, but its ability to capture meaning does not vary greatly across the time periods, institutional contexts, and cultures we investigated.
