ScatterScanner: Data Extraction and Chart Restyling of Scatterplots

Publications commonly do not expose their underlying data, only simple (and often poorly designed) visualizations of it. Many authors want to source data from other papers, yet if they reuse the original visualization they are bound to the design choices of the original author. This is a pain point for authors who want holistic control over the design language of their publication. To solve this problem, we present ScatterScanner, a web interface that processes an image, extracts the data, and allows the user to modify design choices such as chart form, color, and spatial relationships. In addition, we provide an option to download the extracted data as a .csv file so that the user can save and modify it in other applications. Due to the
