Learning-Based Cleansing for Indoor RFID Data

RFID is widely used for object tracking in indoor environments, e.g., airport baggage tracking. Analyzing RFID data offers insight into the underlying tracking systems as well as the associated business processes. However, the inherent uncertainty in RFID data, including noise (cross readings) and incompleteness (missing readings), poses challenges to high-level RFID data querying and analysis. In this paper, we address these challenges by proposing a learning-based data cleansing approach that, unlike existing approaches, requires no detailed prior knowledge about the spatio-temporal properties of the indoor space and the RFID reader deployment. Requiring only minimal information about the RFID deployment, the approach learns relevant knowledge from raw RFID data and uses it to cleanse the data. In particular, we model raw RFID readings as time series that are sparse because the indoor space is only partly covered by a limited number of RFID readers. We propose the Indoor RFID Multi-variate Hidden Markov Model (IR-MHMM) to capture the uncertainties of indoor RFID data as well as the correlation between moving object locations and object RFID readings. We propose three state space design methods for IR-MHMM that enable the learning of parameters while contending with raw RFID data time series. We use only raw, uncleansed RFID data to learn the model parameters, requiring no specially labeled data or ground truth. The resulting IR-MHMM-based RFID data cleansing approach is able to recover missing readings and reduce cross readings with high effectiveness and efficiency, as demonstrated by extensive experimental studies with both synthetic and real data. Given enough indoor RFID data for learning, the proposed approach achieves a data cleansing accuracy comparable to or even better than that of state-of-the-art techniques that require very detailed prior knowledge, making our solution superior in terms of both effectiveness and employability.
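To make the idea of HMM-based cleansing concrete, the following is a minimal sketch and not the IR-MHMM itself. It assumes a plain single-variate HMM with three hypothetical locations, three readers, and a "no reading" symbol, and it uses hand-picked transition and emission probabilities, whereas the approach described above learns its parameters from raw data. Viterbi decoding then yields the most likely location sequence, which fills in missing readings and overrides isolated cross readings. All names and numbers here are illustrative assumptions.

```python
import numpy as np

MISSING = 3  # assumed observation symbol for "no reader saw the tag"

# Hypothetical parameters: 3 locations, each next to reader 0, 1, or 2.
PI = np.array([1 / 3, 1 / 3, 1 / 3])            # initial location distribution
A = np.array([[0.80, 0.15, 0.05],               # A[i, j] = P(next = j | now = i)
              [0.05, 0.80, 0.15],
              [0.05, 0.15, 0.80]])
B = np.array([[0.60, 0.05, 0.05, 0.30],         # B[s, o] = P(observe o | location s)
              [0.05, 0.60, 0.05, 0.30],         # columns 0-2: reader ids (off-diagonal
              [0.05, 0.05, 0.60, 0.30]])        # entries are cross readings), 3: missing


def viterbi(obs, log_pi, log_a, log_b):
    """Return the most likely hidden state path for a list of observation symbols."""
    n_steps, n_states = len(obs), len(log_pi)
    delta = np.empty((n_steps, n_states))        # best log-probability ending in each state
    back = np.zeros((n_steps, n_states), dtype=int)
    delta[0] = log_pi + log_b[:, obs[0]]
    for t in range(1, n_steps):
        scores = delta[t - 1][:, None] + log_a   # scores[i, j]: come from i, go to j
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_b[:, obs[t]]
    path = [int(delta[-1].argmax())]
    for t in range(n_steps - 1, 0, -1):          # follow back-pointers to the start
        path.append(int(back[t, path[-1]]))
    return path[::-1]


def cleanse(obs):
    """Map a raw reading sequence to the inferred location at every time step."""
    return viterbi(obs, np.log(PI), np.log(A), np.log(B))


if __name__ == "__main__":
    raw = [0, MISSING, 0, 2, 0, 0]               # one missing reading, one cross reading
    print(cleanse(raw))
```

In this toy setting the decoded path replaces each raw symbol with an inferred location, so a gap between two readings from the same reader is attributed to that reader's location, and a single reading from a distant reader is treated as a cross reading when the transition probabilities make the jump unlikely.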
