A RESTful interface to pseudonymization services in modern web applications

Background: Medical research networks rely on record linkage and pseudonymization to determine which records from different sources relate to the same patient. To establish informational separation of powers, the required identifying data are redirected to a trusted third party that has, in turn, no access to medical data. This pseudonymization service receives identifying data, compares them with a list of already reported patient records and replies with a (new or existing) pseudonym. We found existing solutions to be technically outdated, complex to implement or not suitable for internet-based research infrastructures. In this article, we propose a new RESTful pseudonymization interface tailored for use in web applications accessed by modern web browsers.

Methods: The interface is modelled as a resource-oriented architecture, which is based on the representational state transfer (REST) architectural style. We translated typical use cases into resources to be manipulated with well-known HTTP verbs. Patients can be re-identified in real time by authorized users' web browsers using temporary identifiers. We encourage the use of PID strings for pseudonyms and the EpiLink algorithm for record linkage. As a proof of concept, we developed a Java Servlet as a reference implementation.

Results: The following resources have been identified: Sessions allow data associated with a client to be stored beyond a single request while still maintaining statelessness. Tokens authorize a specified action and thus allow the delegation of authentication. Patients are identified by one or more pseudonyms and carry identifying fields. Relying on HTTP calls alone, the interface is firewall-friendly. The reference implementation has proven to be stable in production.

Conclusion: The RESTful pseudonymization interface fits the requirements of web-based scenarios and allows building applications that make pseudonymization transparent to the user using ordinary web technology.
The open-source reference implementation provides the web interface as well as a scientifically grounded algorithm for generating non-speaking pseudonyms.
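
To make the interplay of the three resources more concrete, the following is a minimal in-memory sketch in Python. It is not the reference implementation and does not reproduce its API: the class name, method names, the "addPatient" action label, the single-use behaviour of tokens and the URL patterns mentioned in comments are all assumptions made for illustration. Record linkage is reduced to an exact match on normalized fields as a stand-in for EpiLink, and the pseudonym is a random hex string rather than a PID string.

```python
import secrets
import uuid


class PseudonymizationService:
    """In-memory stand-in for the trusted third party (all names hypothetical)."""

    def __init__(self):
        self.sessions = {}  # session id -> set of open token ids
        self.tokens = {}    # token id -> (session id, permitted action)
        self.patients = {}  # normalized identifying fields -> pseudonym

    def create_session(self):
        # Would roughly correspond to a POST on a sessions resource.
        session_id = str(uuid.uuid4())
        self.sessions[session_id] = set()
        return session_id

    def create_token(self, session_id, action):
        # Would roughly correspond to a POST on a session's tokens resource.
        if session_id not in self.sessions:
            raise KeyError("unknown session")
        token_id = str(uuid.uuid4())
        self.tokens[token_id] = (session_id, action)
        self.sessions[session_id].add(token_id)
        return token_id

    def add_patient(self, token_id, fields):
        # Would roughly correspond to a POST on a patients resource,
        # authorized by the token instead of user credentials.
        entry = self.tokens.get(token_id)
        if entry is None or entry[1] != "addPatient":
            raise PermissionError("token unknown, already used, or wrong action")
        # Assumption: a token is invalidated after one successful use.
        del self.tokens[token_id]
        self.sessions[entry[0]].discard(token_id)
        # Simplified linkage: exact match on trimmed, lower-cased fields.
        key = tuple(sorted((k, v.strip().lower()) for k, v in fields.items()))
        if key not in self.patients:
            self.patients[key] = secrets.token_hex(4)  # new random pseudonym
        return self.patients[key]


service = PseudonymizationService()
session = service.create_session()

# A trusted application server requests a token and hands it to the browser,
# which then submits the identifying data directly to the service.
t1 = service.create_token(session, "addPatient")
p1 = service.add_patient(t1, {"first": "Ada", "last": "Lovelace", "born": "1815-12-10"})

# Reporting the same person again yields the existing pseudonym.
t2 = service.create_token(session, "addPatient")
p2 = service.add_patient(t2, {"first": " ada", "last": "LOVELACE", "born": "1815-12-10"})
print(p1 == p2)
```

In this sketch the application that stores medical data only ever sees the token and the returned pseudonym, while the identifying fields go to the service alone, which mirrors the informational separation of powers described in the abstract.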
