Bi-directional long short-term memory architecture for person re-identification with modified triplet embedding

Matching a specific person across non-overlapping cameras, known as person re-identification, is an important yet challenging task owing to the intra-class variations in pose, illumination, and occlusion among images of the same person. Most existing body-part-based deep methods simply concatenate the features or scores obtained from the spatial parts and ignore the complex spatial correlations between them. In this paper, we present a bi-directional Long Short-Term Memory (Bi-LSTM) architecture that processes the spatial parts sequentially and enables messages from different parts to propagate in both directions. The spatial and contextual visual information can therefore be modeled efficiently by the bi-directional connections and the internal gating functions of the LSTM. Furthermore, we propose a modified triplet loss to learn more discriminative features that better separate positive pairs from negative pairs. Experiments on the CUHK01 and CUHK03 datasets demonstrate the effectiveness of the proposed method.
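
The following PyTorch sketch is a rough illustration of the pipeline the abstract describes, not the paper's implementation: per-part features are treated as a sequence and fed to a bi-directional LSTM, and the resulting embeddings are trained with a margin-based triplet loss. The part count, feature dimensions, margin value, and class/function names (`BiLSTMPartEmbedder`, `triplet_loss`) are illustrative assumptions, and the plain triplet loss below is only a stand-in for the modified variant, whose exact form is not given in the abstract.

```python
# Hypothetical sketch: Bi-LSTM over body-part features with a triplet loss.
# Dimensions, margin, and the plain triplet loss are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BiLSTMPartEmbedder(nn.Module):
    """Embed a sequence of body-part features with a bi-directional LSTM."""

    def __init__(self, part_dim=256, hidden_dim=128):
        super().__init__()
        self.bilstm = nn.LSTM(
            input_size=part_dim,
            hidden_size=hidden_dim,
            batch_first=True,
            bidirectional=True,  # messages flow across parts in both directions
        )

    def forward(self, part_feats):
        # part_feats: (batch, num_parts, part_dim), e.g. per-part pooled
        # features from horizontal stripes of a CNN feature map.
        out, _ = self.bilstm(part_feats)       # (batch, num_parts, 2*hidden_dim)
        embedding = out.mean(dim=1)            # aggregate the part outputs
        return F.normalize(embedding, dim=1)   # unit-length person embedding


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Plain margin-based triplet loss (placeholder for the modified version)."""
    d_pos = (anchor - positive).pow(2).sum(dim=1)
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()


if __name__ == "__main__":
    model = BiLSTMPartEmbedder()
    # Toy batch: 4 triplets, 6 parts per image, 256-d per-part features.
    anchor = model(torch.randn(4, 6, 256))
    positive = model(torch.randn(4, 6, 256))
    negative = model(torch.randn(4, 6, 256))
    print(triplet_loss(anchor, positive, negative).item())
```

In this reading, the bi-directional recurrence is what lets each part's representation depend on the parts above and below it, rather than being scored or concatenated in isolation.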