The FACTS of Technology-Assisted Sensitivity Review

At least ninety countries implement Freedom of Information laws that state that government documents must be made freely available, or opened, to the public. However, many government documents contain sensitive information, such as personal or confidential information. Therefore, all government documents that are opened to the public must first be reviewed to identify, and protect, any sensitive information. Historically, sensitivity review has been a completely manual process. However, with the adoption of born-digital documents, such as e-mail, human-only sensitivity review is not practical and there is a need for new technologies to assist human sensitivity reviewers. In this paper, we discuss how issues of fairness, accountability, confidentiality, transparency and safety (FACTS) impact technology-assisted sensitivity review. Moreover, we outline some important areas of future FACTS research that will need to be addressed within technology-assisted sensitivity review.

[1]  Craig MacDonald,et al.  Active Learning Strategies for Technology Assisted Sensitivity Review , 2018, ECIR.

[2]  L. Sweeney Replacing personally-identifying information in medical records, the Scrub system. , 1996, Proceedings : a conference of the American Medical Informatics Association. AMIA Fall Symposium.

[3]  Diyi Yang,et al.  Hierarchical Attention Networks for Document Classification , 2016, NAACL.

[4]  Li Xiong,et al.  HIDE: An Integrated System for Health Information DE-identification , 2008, 2008 21st IEEE International Symposium on Computer-Based Medical Systems.

[5]  Alistair Tough The Scope and Appetite for Technology-Assisted Sensitivity Reviewing of Born-Digital Records in a Resource Poor Environment: A Case Study From Malawi , 2018 .

[6]  Peter Szolovits,et al.  Automated de-identification of free-text medical records , 2008, BMC Medical Informatics Decis. Mak..

[7]  Jason Weston,et al.  Gene Selection for Cancer Classification using Support Vector Machines , 2002, Machine Learning.

[8]  J. Gilbertson,et al.  Evaluation of a deidentification (De-Id) software engine to share pathology reports and clinical documents for research. , 2004, American journal of clinical pathology.

[9]  Craig MacDonald,et al.  Towards a Classifier for Digital Sensitivity Review , 2014, ECIR.

[10]  Emine Yilmaz,et al.  Research Frontiers in Information Retrieval Report from the Third Strategic Workshop on Information Retrieval in Lorne (SWIRL 2018) , 2018 .

[11]  Peter Szolovits,et al.  A de-identifier for medical discharge summaries , 2008, Artif. Intell. Medicine.

[12]  Úlfar Erlingsson,et al.  The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets , 2018, ArXiv.

[13]  Maura R. Grossman,et al.  Evaluation of machine-learning protocols for technology-assisted review in electronic discovery , 2014, SIGIR.

[14]  Burr Settles,et al.  Active Learning , 2012, Synthesis Lectures on Artificial Intelligence and Machine Learning.

[15]  Craig MacDonald,et al.  How Sensitivity Classification Effectiveness Impacts Reviewers in Technology-Assisted Sensitivity Review , 2019, CHIIR.

[16]  Rayid Ghani,et al.  A Machine Learning Based System for Semi-Automatically Redacting Documents , 2011, IAAI.

[17]  Franck Dernoncourt,et al.  De-identification of patient notes with recurrent neural networks , 2016, J. Am. Medical Informatics Assoc..

[18]  Douglas W. Oard,et al.  Evaluation of information retrieval for E-discovery , 2010, Artificial Intelligence and Law.

[19]  M. Hepple,et al.  Identifying Personal Health Information Using Support Vector Machines , 2006 .

[20]  Xiang Zhang,et al.  Character-level Convolutional Networks for Text Classification , 2015, NIPS.

[21]  Fabrizio Sebastiani,et al.  Machine learning in automated text categorization , 2001, CSUR.

[22]  Carlos Guestrin,et al.  "Why Should I Trust You?": Explaining the Predictions of Any Classifier , 2016, ArXiv.

[23]  Guillermo Navarro-Arribas,et al.  On the Declassification of Confidential Documents , 2011, MDAI.

[24]  Roy Peled,et al.  The Constitutional Right to Information , 2010 .