Further Topics and Research Directions

Data matching is an active research area, with work under way in several directions. As the need grows in many application areas to integrate, share and match data from disparate sources, the basic approach of matching two static databases with well-defined attributes in batch mode is becoming increasingly inadequate for today’s real-world data matching challenges. This chapter covers several areas of interest to data matching practitioners, as well as to researchers who aim to tackle some of the challenges data matching poses. Section 9.1 is concerned with the process of matching geographical data, such as postcodes or addresses, to locations on a map. The topic of Sect. 9.2 is how to match data that are more complex and less structured than well-defined database records. The third topic, discussed in Sect. 9.3, is how data matching can be accomplished in real time, a challenge of increasing importance as the requirements of applications in many organisations move towards a data stream environment where query records need to be processed, matched, classified and analysed in real time. Section 9.4 covers the related issue of dynamic databases, where records are constantly being updated, added and removed, which makes accurate matching more challenging than for static databases. Section 9.5 discusses how the performance of data matching and deduplication can be increased through the use of parallel and distributed computing platforms. The last section of this chapter, Sect. 9.6, provides a list of data matching research topics and challenges compiled from responses received from some of the world’s leading data matching researchers and practitioners.