Support Vector Machines to Weight Voters in a Voting System of Entity Extractors

Support vector machines (SVMs) are used to combine the outputs of multiple entity extractors, creating a composite entity extraction system. The composite system has a significantly higher F-measure than any of its component systems. Compared to a standard voting technique for combining the results of multiple entity extractors, the SVM approach yields comparable precision and recall but tends to rely on fewer of the component extractors, which makes it computationally more efficient, a critical property in practical applications. In this paper, we present experimental results comparing a standard voting technique with an SVM-based approach, each aggregating four entity extractors. We also describe our plans to integrate agent-based technology into our experimental testbed, where we examine the evolution of composite techniques as part of the analysis stream. Because much of the improvement comes from tuning the algorithms to the data stream with a human in the loop, we are considering the merits of cognitive agents strategically embedded in the data-processing workflow. As we tune the algorithms for better performance on the data streams, we envision agents learning the patterns of those streams and applying the appropriate tuning to maintain near-optimal performance.
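
The contrast between plain voting and SVM-weighted voting can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: each candidate entity is represented as a vector of four 0/1 votes (one per extractor), the data are a hand-made toy set in which only the first extractor is reliable, and the linear SVM is trained with a simple batch subgradient method on the hinge loss rather than whatever solver the authors used. The names `majority_vote`, `train_linear_svm`, `svm_vote`, `TOY_X`, and `TOY_Y` are hypothetical.

```python
from typing import List, Sequence, Tuple

Votes = Sequence[int]  # one 0/1 vote per extractor for a candidate entity


def majority_vote(votes: Votes, threshold: int = 2) -> int:
    """Baseline: accept (+1) if at least `threshold` extractors voted yes."""
    return 1 if sum(votes) >= threshold else -1


def score(votes: Votes, w: Sequence[float], b: float) -> float:
    """Weighted sum of the votes plus a bias term."""
    return sum(wj * vj for wj, vj in zip(w, votes)) + b


def train_linear_svm(
    X: Sequence[Votes],
    y: Sequence[int],
    lam: float = 0.001,
    lr: float = 0.05,
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Minimize lam/2 * ||w||^2 + mean hinge loss by batch subgradient descent.

    Returns the iterate with the lowest objective value seen during training.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    best = (float("inf"), list(w), b)
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        hinge = 0.0
        for xi, yi in zip(X, y):
            margin = yi * score(xi, w, b)
            if margin < 1.0:  # example violates the margin
                hinge += 1.0 - margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        objective = 0.5 * lam * sum(wj * wj for wj in w) + hinge / n
        if objective < best[0]:
            best = (objective, list(w), b)
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return best[1], best[2]


def svm_vote(votes: Votes, w: Sequence[float], b: float) -> int:
    """Accept (+1) if the learned weighted vote is positive."""
    return 1 if score(votes, w, b) > 0 else -1


# Toy data (assumed): the label follows extractor 0; extractors 1-3 are noise.
TOY_X = [
    (1, 1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1), (1, 1, 1, 1), (1, 0, 0, 0),
    (0, 1, 1, 0), (0, 1, 0, 1), (0, 0, 1, 1), (0, 1, 1, 1), (0, 0, 0, 0),
    (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1),
]
TOY_Y = [1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1]

if __name__ == "__main__":
    w, b = train_linear_svm(TOY_X, TOY_Y)
    vote_errors = sum(majority_vote(x) != t for x, t in zip(TOY_X, TOY_Y))
    svm_errors = sum(svm_vote(x, w, b) != t for x, t in zip(TOY_X, TOY_Y))
    print("learned weights:", [round(wj, 2) for wj in w], "bias:", round(b, 2))
    print("majority-vote errors:", vote_errors, "SVM errors:", svm_errors)
```

On this toy set the unweighted vote is misled by the unreliable extractors, while the learned weights concentrate on the reliable one. In a setting like this, extractors whose learned weight stays near zero would be candidates for removal, which is one plausible reading of how an SVM combiner could end up using fewer component extractors; the paper itself should be consulted for the actual feature representation and selection procedure.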