Using First-Order Logic to Query Heterogeneous Internet Data Sources

This paper describes an approach for formulating queries in first-order logic over data from disparate sources distributed across a network. The sources are treated as if they all belonged to a common database, even though they may provide data through different stored or computed means: web services and REST APIs, XML/JSON repositories, web pages, full-featured databases, flat files, etc. We describe the technical foundations and benefits of this approach and compare it with other existing solutions to the problem. The approach enables end users to formulate ad hoc queries in logic that correlate the data sources. For programmers creating applications, specifying correlation criteria in a logic-based language shortens the development life cycle and reduces maintenance cost compared with manually coding the equivalent correlation algorithms in procedural languages.