Knowledge Graphs for Social Good: An Entity-Centric Search Engine for the Human Trafficking Domain

Web advertising related to Human Trafficking (HT) activity has been on the rise in recent years. Answering entity-centric questions over crawled HT Web corpora to assist investigators in the real world is an important social problem that involves many technical challenges. This paper describes a recent entity-centric knowledge graph effort that resulted in a semantic search engine to assist analysts and investigative experts in the HT domain. The overall approach takes as input a large corpus of advertisements crawled from the Web, structures it into an indexed knowledge graph, and enables investigators to satisfy their information needs by posing investigative search queries to a special-purpose semantic execution engine. We evaluated the search engine on real-world data collected from over 90,000 webpages, a significant fraction of which correlates with HT activity. Performance on four relevant categories of questions, measured by mean average precision, was found to be promising, outperforming a learning-to-rank approach on three of the four categories. The prototype uses open-source components and scales to terabyte-scale corpora. Principles of the prototype have also been independently replicated, with similarly successful results.
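
For readers unfamiliar with the evaluation metric named above, the following is a minimal sketch of mean average precision under its standard textbook definition. It is not the evaluation code used in this work: the function names, the input format (one ranked list and one set of relevant items per query), and the toy identifiers are all assumptions made for illustration, and each ranked list is assumed to contain no duplicates.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision of one ranked result list (assumed duplicate-free)."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant item appears.
            precision_sum += hits / rank
    # Normalize by the number of relevant items, so missed items lower the score.
    return precision_sum / len(relevant)


def mean_average_precision(
    queries: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean of per-query average precision over (ranked, relevant) pairs."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in queries]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two queries with made-up advertisement identifiers.
toy_queries = [
    (["ad1", "ad2", "ad3", "ad4"], {"ad1", "ad3"}),
    (["ad7", "ad8"], {"ad8"}),
]
toy_map = mean_average_precision(toy_queries)
```

In this toy case the first query scores (1/1 + 2/3) / 2 and the second scores 1/2, so the mean is 2/3. A per-category comparison like the one reported above would apply the same computation separately to the queries in each question category.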