CITIESData: a smart city data management framework

Smart city data come from heterogeneous sources, including many kinds of Internet of Things devices such as traffic, weather, pollution, and noise sensors, as well as portable devices. The data are characterized by diverse quality issues and contain different types of sensitive information, which makes processing and publishing them challenging. In this paper, we propose a framework to streamline smart city data management, covering data collection, cleansing, anonymization, and publishing. We classify smart city data into three levels (sensitive, quasi-sensitive, and open/public) and suggest a different strategy for processing and publishing the data at each level. We evaluate the framework on a real-world smart city data set, and the results verify its effectiveness and efficiency. The framework can serve as a generic solution for managing smart city data.
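To make the three-level classification concrete, the following is a minimal sketch of how a record might be routed before publishing. It is not the framework's actual implementation: the column names, the `COLUMN_LEVELS` mapping, the `generalize` rules, and the sample record are all hypothetical, and the choice of suppression for sensitive fields and generalization for quasi-sensitive fields is an assumption made for illustration.

```python
from enum import Enum


class Level(Enum):
    SENSITIVE = "sensitive"
    QUASI_SENSITIVE = "quasi-sensitive"
    OPEN = "open"


# Hypothetical column-to-level mapping; a real deployment would define its own.
COLUMN_LEVELS = {
    "owner_name": Level.SENSITIVE,
    "zip_code": Level.QUASI_SENSITIVE,
    "age": Level.QUASI_SENSITIVE,
    "noise_db": Level.OPEN,
}


def generalize(column, value):
    """Coarsen a quasi-sensitive value (illustrative rules only)."""
    if column == "zip_code":
        # Keep the first two characters and mask the rest.
        return str(value)[:2] + "**"
    if column == "age":
        # Replace an exact age with its ten-year band.
        low = (int(value) // 10) * 10
        return f"{low}-{low + 9}"
    return value


def prepare_for_publishing(record):
    """Apply an assumed per-level strategy to one record."""
    published = {}
    for column, value in record.items():
        # Columns without a known level are treated as sensitive (assumption).
        level = COLUMN_LEVELS.get(column, Level.SENSITIVE)
        if level is Level.SENSITIVE:
            continue  # suppress: never published
        if level is Level.QUASI_SENSITIVE:
            published[column] = generalize(column, value)
        else:
            published[column] = value  # open data passes through unchanged
    return published


# Made-up sample record, for illustration only.
record = {"owner_name": "A. Jensen", "zip_code": "2800", "age": 37, "noise_db": 61.5}
published = prepare_for_publishing(record)
```

In this sketch, unknown columns default to the sensitive level so that a missing classification leads to suppression rather than accidental disclosure.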
