Big data landscapes: improving the visualization of machine learning-based clustering algorithms

On the internet, massively heterogeneous data sources must be understood and classified in order to provide users with suitable services such as content observation, data exploration, e-commerce, or adaptive learning environments. The key to providing these services is applying machine learning (ML) to generate structure via clustering and classification. Because ML pipelines involve intricate processes, visual tools are needed to support their design and evaluation. In this contribution, we propose a comprehensive tool that facilitates the analysis and design of ML-based clustering algorithms through multiple visualization features, including semantic zoom, glyphs, and histograms.