Improving Data Management Applications Using Microtask Platforms

Many data management problems are inherently vague and hard for algorithms to process. Take, for example, entity resolution, also known as record linkage: the process of identifying records from heterogeneous sources that refer to the same entity. Properly resolving such records requires understanding not only the syntactic structure of the data but also contextual semantics that are hard for machines to grasp. Performing such data management tasks properly requires human input: to provide information that is missing from the machine-readable structured data, to perform computationally difficult functions, and to match, rank, or aggregate results based on fuzzy criteria.