Deep learning framework for scene-based indoor location recognition

Scene recognition and object detection have made significant progress in recent years. While mobile robotics and drone-based analysis have matured considerably, robots that interact with humans and indoor environments benefit greatly from the ability to discern indoor scenes or the interior of a building. Although many approaches have been proposed to detect objects and locations, such as Indoor Positioning Systems (IPS) and feature-based and content-based recognition systems, indoor scene recognition has yet to gain ground. This is understandable since, unlike outdoor scenes, indoor scenes lack distinctive local or global visual patterns. This paper proposes a new technique for achieving this goal by considering both the RGB and depth images of a scene. With advances in machine learning methods such as neural networks, deep neural networks, and convolutional neural networks (CNNs), workable accuracy in scene recognition is no longer hypothetical. A deep CNN framework with a transfer learning approach is implemented in TensorFlow (Python) for indoor scene recognition, using RGB as well as point cloud data, i.e., RGB-D images. The proposed deep CNN model reaches an accuracy of up to 94.4% on the indoor dataset. The performance of the proposed model is further compared with that of the GoogLeNet and AlexNet frameworks in DIGITS. The algorithm is also trained on the benchmark NYUv2 dataset, achieving an accuracy of 75.9%, which exceeds the highest accuracy previously obtained on that dataset (64.5%).
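As a concrete illustration of the kind of transfer-learning pipeline described above, the minimal sketch below builds a two-input RGB-D scene classifier on top of an ImageNet-pretrained backbone in TensorFlow/Keras. The choice of MobileNetV2 as the backbone, the weight-shared backbone for the two modalities, the late-fusion classifier head, the input size, and the number of scene categories are all illustrative assumptions and do not reflect the exact architecture used in the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_CLASSES = 10             # assumed number of indoor scene categories
INPUT_SHAPE = (224, 224, 3)  # depth maps replicated to 3 channels to match the backbone

# ImageNet-pretrained backbone, frozen for transfer learning.
# A single backbone is shared by both modalities here for brevity;
# separate per-modality branches are an equally common design choice.
backbone = tf.keras.applications.MobileNetV2(
    input_shape=INPUT_SHAPE, include_top=False, weights="imagenet")
backbone.trainable = False

rgb_in = layers.Input(shape=INPUT_SHAPE, name="rgb")
depth_in = layers.Input(shape=INPUT_SHAPE, name="depth")

rgb_feat = layers.GlobalAveragePooling2D()(backbone(rgb_in, training=False))
depth_feat = layers.GlobalAveragePooling2D()(backbone(depth_in, training=False))

# Late fusion of the two modalities followed by a small trainable classifier head.
fused = layers.Concatenate()([rgb_feat, depth_feat])
x = layers.Dense(256, activation="relu")(fused)
x = layers.Dropout(0.5)(x)
out = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = Model(inputs=[rgb_in, depth_in], outputs=out)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()

# Training would then take paired RGB and depth batches, e.g.:
# model.fit({"rgb": rgb_batch, "depth": depth_batch}, labels, ...)
```

Only the fusion and classification layers are trained in this sketch; fine-tuning the upper layers of the pretrained backbone after an initial training phase is a common follow-up step in transfer learning.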