Making Legacy Open Data Machine Readable by Crowdsourcing
An approach is described for converting legacy statistical data in image format into a machine-readable and reusable format by using crowdsourcing. Requesting crowd workers not only to extract tables from chart images but also to reconstruct the charts in spreadsheets can produce data structures, including attribute names and values, as properties of the reconstructed chart objects. A quality control mechanism was developed that improves the accuracy of the extracted tables by aggregating tables created by different workers for the same chart image and by utilizing the data structures obtained from the reproduced chart objects. Experimental results using the White Paper on Tourism published by the Japan Tourism Agency demonstrated that the proposed approach is effective.
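As a rough illustration of what aggregating tables from different workers could look like, the sketch below takes a cell-wise majority vote over several transcriptions of the same chart. This is an assumption made for illustration only: the abstract does not specify the aggregation rule, and the function name `aggregate_tables`, the requirement that all tables share one shape, and the sample values are all hypothetical.

```python
from collections import Counter


def aggregate_tables(tables):
    """Cell-wise majority vote over same-shaped tables (hypothetical helper).

    `tables` is a list of tables, one per worker; each table is a list of
    rows, and each row is a list of cell strings.
    """
    if not tables:
        raise ValueError("need at least one table")
    n_rows, n_cols = len(tables[0]), len(tables[0][0])
    for table in tables:
        if len(table) != n_rows or any(len(row) != n_cols for row in table):
            raise ValueError("all tables must have the same shape")

    result = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Count how many workers entered each value for this cell.
            votes = Counter(table[r][c] for table in tables)
            # Ties go to the value seen first, i.e. the earliest worker.
            row.append(votes.most_common(1)[0][0])
        result.append(row)
    return result


# Invented sample data: three workers transcribe the same chart,
# and two of them each mistype a different cell.
worker_a = [["Year", "Visitors"], ["2010", "861"], ["2011", "622"]]
worker_b = [["Year", "Visitors"], ["2010", "861"], ["2011", "632"]]
worker_c = [["Year", "Visitors"], ["2010", "881"], ["2011", "622"]]

aggregated = aggregate_tables([worker_a, worker_b, worker_c])
```

With this invented input, each isolated typing error is outvoted by the other two workers. A simple vote like this needs the tables to be aligned first, which is one place where the structure recovered from the reproduced chart objects could plausibly help.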