Multi-stream Deep Networks for Vehicle Make and Model Recognition

Vehicle recognition generally aims to classify vehicles by make, model, and year of manufacture. It is a particularly hard problem because of the large number of classes and the small inter-class variations. To handle this problem, recent state-of-the-art methods use Convolutional Neural Networks (CNNs). These methods, however, have several limitations because the vehicle features they extract for the recognition task are unstructured. In this paper, we propose a more structured feature extraction method that leverages a robust multi-stream deep network architecture. We employ a novel dynamic combination technique to aggregate features from different vehicle parts with features from the entire image, which combines a global representation with local features. Our system, which has been evaluated on publicly available datasets, is able to learn a highly discriminative representation and achieves state-of-the-art results.
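
As a rough illustration of combining a global representation with local part features, the sketch below fuses one whole-image feature vector with several part feature vectors. It is not the combination technique proposed in the paper, whose details are not given here. It assumes, purely for illustration, that each part stream comes with a scalar relevance score, that the scores are normalized with a softmax, and that the weighted part features are concatenated with the global feature. The names `fuse_streams`, `part_feats`, and `part_scores` are hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary real-valued scores into weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(global_feat, part_feats, part_scores):
    """Illustrative fusion of a global feature with per-part features.

    Assumption (not from the paper): each part stream has a scalar
    relevance score; parts are averaged with softmax weights and the
    result is concatenated with the whole-image feature.
    """
    if not part_feats or len(part_feats) != len(part_scores):
        raise ValueError("need one score per part feature vector")
    dim = len(part_feats[0])
    if any(len(f) != dim for f in part_feats):
        raise ValueError("all part feature vectors must have the same length")

    weights = softmax(part_scores)
    # Weighted sum over the part streams, one output value per dimension.
    pooled = [
        sum(w * f[i] for w, f in zip(weights, part_feats))
        for i in range(dim)
    ]
    # Global representation followed by the pooled local representation.
    return list(global_feat) + pooled


# Two part streams with equal scores contribute equally.
fused = fuse_streams([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
print(fused)  # [1.0, 2.0, 0.5, 0.5]
```

In a trained network the scores would presumably be predicted per image rather than fixed, so the contribution of each part could vary from one input to the next; that is one possible reading of "dynamic" and is an assumption of this sketch.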