Deep learning-powered automated tool for generating image-based datasets

Over the last decade, the rapid growth of social media and of the Internet in general has drastically expanded the amount of publicly available data. This creates a great opportunity for research and analysis in many fields of machine learning and data science. Although this data is open to use, it is often distributed across various sources on the web, and before it can be used in data science experiments it must be gathered, properly labeled, and pre-processed. This step is essential to every study and is generally time-consuming. In this paper, an automated system for gathering and preparing datasets is proposed. The system comprises a number of features that adopt cutting-edge approaches to reduce the time needed to develop a dataset. It is primarily designed for image-based datasets and allows researchers to collect images from different sources: social media via official APIs, or online image libraries via web scraping techniques. Along with data gathering, the proposed system provides an intelligent image labeling feature that uses state-of-the-art deep learning methods: trained models recognize objects in the images, and the results are labeled in standardized formats. The system is modular and can be extended to detect additional objects simply by adding new trained models. Finally, it lets researchers work faster by focusing on the essential parts of their studies rather than on generating the dataset themselves.
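
The modular labeling idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the proposed system: the names `Detection`, `register_detector`, `label_images`, and `stub_detector` are hypothetical, the stub stands in for a trained model, and the output is only loosely modeled on a COCO-style annotation layout, since the abstract does not name a specific format.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Detection:
    label: str
    bbox: Tuple[int, int, int, int]  # (x, y, width, height) in pixels
    score: float


# A detector maps an image path to a list of detections (hypothetical interface).
Detector = Callable[[str], List[Detection]]

# Registry of available detectors; adding a new model means adding an entry here.
DETECTORS: Dict[str, Detector] = {}


def register_detector(name: str, detector: Detector) -> None:
    """Make a detector available to the labeling step."""
    DETECTORS[name] = detector


def label_images(image_paths: List[str], min_score: float = 0.5) -> dict:
    """Run every registered detector and collect results in a COCO-like dict."""
    categories: Dict[str, int] = {}
    images, annotations = [], []
    for image_id, path in enumerate(image_paths, start=1):
        images.append({"id": image_id, "file_name": path})
        for detector in DETECTORS.values():
            for det in detector(path):
                if det.score < min_score:
                    continue  # drop low-confidence detections
                # Assign category ids in order of first appearance.
                category_id = categories.setdefault(det.label, len(categories) + 1)
                annotations.append({
                    "id": len(annotations) + 1,
                    "image_id": image_id,
                    "category_id": category_id,
                    "bbox": list(det.bbox),
                    "score": det.score,
                })
    return {
        "images": images,
        "annotations": annotations,
        "categories": [{"id": i, "name": n} for n, i in categories.items()],
    }


def stub_detector(path: str) -> List[Detection]:
    """Stand-in for a trained model; returns fixed detections for any image."""
    return [
        Detection("cat", (10, 20, 100, 80), 0.92),
        Detection("dog", (0, 0, 5, 5), 0.30),  # below the default threshold
    ]


register_detector("stub", stub_detector)
dataset = label_images(["a.jpg", "b.jpg"])
print(json.dumps(dataset, indent=2))
```

In this sketch, extending the set of detectable objects amounts to one more `register_detector` call with a function that wraps another trained model; the labeling loop and the export format stay unchanged.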