Unsupervised Deep Learning for Scene Recognition

Object and scene recognition are usually studied separately. However, research [2]shows that context from scene recognition can greatly improve object recognition performance. The performance of object detectors can be improved by adding information about the type of scene the object is embeded in. This helps disambiguate the class of an object. Gist is an example of a global image feature descriptor for characterizing the global properties of a scene [3]. The aim of our project is to begin exploring the possibility of learning global scene features which can be later used to improve performance of standard object detectors. The main advantage of learning features automatically is that it is di cult to hand engineer features which capture the full statistical properties of natural scenes.