Computing road signatures from cell sequences with minimum inconsistencies

Using Call Detail Records (CDRs) to track mobile locations is becoming increasingly popular. In this paper, we design algorithms to compute a road signature, which is a typical sequence of cell sites that a mobile connects to while traveling on the road. A good signature provides an easy way to determine whether other mobiles travel on the same road in the future. The input to our signature creation problem is a set of cell sequences. We capture the ordering of cells in a directed graph G, where a directed arc (u, v) in G corresponds to cell u appearing before cell v in some input sequence. The output signature is an ordering of all nodes in G, ideally respecting all input sequences. However, if the input contains inconsistencies, namely when G contains cycles, no signature can be consistent with every input sequence. We therefore aim to create a signature that minimizes the amount of inconsistency. If 1-position inconsistency is feasible, meaning that only neighboring cell sites in the signature can be out of order with respect to the input, we provide a polynomial-time combinatorial algorithm to compute such a signature. Otherwise, when the inconsistency must exceed 1 position, we present a more general dynamic programming algorithm that minimizes stretch, meaning that cells neighboring in the input are placed as close together as possible in the signature. This generalization includes b-position inconsistency. Finally, we apply our algorithms to a set of CDR traces to illustrate our results.
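The graph construction described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it only builds G, returns a fully consistent signature when G is acyclic (using a standard topological sort), and measures how far out of order a candidate signature is. Several points are assumptions made for illustration: arcs are drawn only between consecutive cells of a sequence, the function names `build_order_graph`, `consistent_signature`, and `max_inconsistency` are hypothetical, and the inconsistency measure is one plausible reading of "b-position inconsistency" (the largest distance in the signature between two cells that an arc of G orders the other way).

```python
from collections import defaultdict, deque


def build_order_graph(sequences):
    """Build G: arc (u, v) when cell u is directly followed by cell v.

    Assumption: only consecutive cells of a sequence create an arc.
    """
    graph = defaultdict(set)
    for seq in sequences:
        for cell in seq:
            graph[cell]  # touch the key so isolated cells become nodes
        for u, v in zip(seq, seq[1:]):
            if u != v:
                graph[u].add(v)
    return graph


def consistent_signature(graph):
    """Return an ordering respecting every arc, or None if G has a cycle.

    This is Kahn's topological sort, not the paper's algorithm.
    """
    indegree = {node: 0 for node in graph}
    for u in graph:
        for v in graph[u]:
            indegree[v] += 1
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in sorted(graph[u]):
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return order if len(order) == len(graph) else None


def max_inconsistency(signature, graph):
    """Largest position gap over arcs (u, v) that the signature reverses.

    Assumed reading of b-position inconsistency: a result of b means every
    out-of-order pair is at most b positions apart in the signature.
    """
    pos = {cell: i for i, cell in enumerate(signature)}
    gaps = [pos[u] - pos[v] for u in graph for v in graph[u] if pos[u] > pos[v]]
    return max(gaps, default=0)


# Two traces that agree: G is acyclic, so a consistent signature exists.
agree = build_order_graph([["A", "B", "C"], ["A", "C", "D"]])
print(consistent_signature(agree))

# Two traces that disagree on B and C: G has the cycle B -> C -> B.
clash = build_order_graph([["A", "B", "C"], ["A", "C", "B"]])
print(consistent_signature(clash))
print(max_inconsistency(["A", "B", "C"], clash))
```

In the second example no ordering respects both traces, but the candidate signature A, B, C reverses only the arc (C, B), whose endpoints are adjacent. Under the assumed reading this corresponds to the 1-position case; finding such a signature in general is what the algorithms in the paper address.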