PolicyCLOUD: A prototype of a cloud serverless ecosystem for policy analytics

We present PolicyCLOUD, a prototype of an extensible, serverless, cloud-based system that supports the evidence-based elaboration and analysis of policies. PolicyCLOUD allows flexible exploitation and management of policy-relevant dataflows: the practitioner registers datasets and specifies a sequence of transformations and/or information extraction steps, each carried out by a registered ingest function. Once a (possibly transformed) dataset has been ingested, further insights can be retrieved by applying registered analytic functions to it. PolicyCLOUD was built as an extensible framework, as a step toward the creation of an analytic ecosystem. So far, we have developed several essential ingest and analytic functions that are built into the framework, including generic functions for data cleaning, enhanced interoperability, and sentiment analysis. PolicyCLOUD can also tap into the analytic capabilities of external tools. We demonstrate this with a Social Analytics tool implemented in conjunction with PolicyCLOUD and show how to benefit from its policy modeling, design, and simulation capabilities. Furthermore, we developed, within PolicyCLOUD, a first-of-its-kind legal and ethical framework that covers the usage and dissemination of datasets and analytic functions throughout its policy-relevant dataflows. The article describes and evaluates the application of PolicyCLOUD to four families of pilots that cover a wide range of policy scenarios.
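The dataflow described above (register functions, ingest a dataset through a chain of ingest functions, then apply analytic functions) can be pictured with a minimal in-memory sketch. This is an illustration only, not PolicyCLOUD's actual API: every name here (`register_ingest`, `register_analytic`, `register_dataset`, `analyze`, and the two toy functions) is a hypothetical placeholder, and the serverless and cloud aspects are omitted entirely.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, str]
IngestFn = Callable[[List[Record]], List[Record]]
AnalyticFn = Callable[[List[Record]], Any]

# Hypothetical registries; illustrative only, not PolicyCLOUD's real interfaces.
ingest_functions: Dict[str, IngestFn] = {}
analytic_functions: Dict[str, AnalyticFn] = {}
datasets: Dict[str, List[Record]] = {}


def register_ingest(name: str, fn: IngestFn) -> None:
    """Make an ingest (transformation/extraction) function available by name."""
    ingest_functions[name] = fn


def register_analytic(name: str, fn: AnalyticFn) -> None:
    """Make an analytic function available by name."""
    analytic_functions[name] = fn


def register_dataset(name: str, records: List[Record], ingest_steps: List[str]) -> None:
    """Run the records through the named ingest functions, in order, then store them."""
    for step in ingest_steps:
        records = ingest_functions[step](records)
    datasets[name] = records


def analyze(dataset: str, function: str) -> Any:
    """Apply a registered analytic function to an already ingested dataset."""
    return analytic_functions[function](datasets[dataset])


# Toy stand-ins for a data-cleaning ingest function and an analytic function.
def drop_empty(records: List[Record]) -> List[Record]:
    return [r for r in records if r.get("text", "").strip()]


def count_records(records: List[Record]) -> int:
    return len(records)


register_ingest("drop_empty", drop_empty)
register_analytic("count", count_records)
register_dataset(
    "posts",
    [{"text": "good policy"}, {"text": "  "}, {"text": "bad roads"}],
    ingest_steps=["drop_empty"],
)
result = analyze("posts", "count")
```

Keeping ingest and analytic functions in separate registries mirrors the two-stage flow in the abstract: transformations run once at ingestion time, while analytics can be applied repeatedly to the stored result.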
