CertDB: A Practical Data Analysis System on Big Data

With the rapid development of big data technologies, more and more companies are focusing on integrating and analyzing the large-scale data produced by their operations in order to discover and solve internal problems. A data analysis system is therefore urgently needed to process the big data generated by heterogeneous data sources (e.g., online and offline structured, semi-structured, and unstructured data). Unfortunately, there is no integrated tool that can universally collect data from heterogeneous sources and perform data ETL, storage, analysis, and knowledge output. This paper proposes CertDB, an integrated tool that provides one-stop services for data collection, analysis, and knowledge output. CertDB serves both expert and inexperienced data analysts by providing both a graphical BI interface and a programming interface.