Entity resolution using inferred relationships and behavior

We present a method for entity resolution that infers relationships between observed identities and uses those relationships to help map identities to underlying entities. We also introduce the use of graphlets for entity resolution. Graphlets are small connected subgraphs whose occurrences around a node can be used to characterize the “role” of that node in a graph, which gives a richer set of features for characterizing identities. We validate our method on standard author datasets and further evaluate it on data collected from Twitter. We find that both inferred relationships and graphlets are useful for entity resolution.
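
To make the notion of a graphlet-based "role" concrete, the following is a minimal sketch, not the method evaluated here. It counts how often a node occupies each position (orbit) in the graphlets with two or three nodes, in the style of a graphlet degree signature. The function name `graphlet_features`, the adjacency-set representation, and the toy graph are all assumptions made for illustration.

```python
from itertools import combinations


def graphlet_features(adj, node):
    """Return (orbit0, orbit1, orbit2, orbit3) for `node`.

    adj is a hypothetical undirected graph stored as {node: set(neighbours)}.
    orbit0: number of edges touching the node (its degree)
    orbit1: number of open 2-edge paths with the node at an end
    orbit2: number of open 2-edge paths with the node in the middle
    orbit3: number of triangles containing the node
    """
    nbrs = adj[node]
    degree = len(nbrs)
    # A triangle exists for every pair of neighbours that are themselves adjacent.
    triangles = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    # Every other pair of neighbours forms an open path centred on the node.
    path_center = degree * (degree - 1) // 2 - triangles
    # Paths node - u - w where w is not the node and not adjacent to it.
    path_end = sum(
        1 for u in nbrs for w in adj[u] if w != node and w not in nbrs
    )
    return (degree, path_end, path_center, triangles)


# Toy graph: a triangle a-b-c with a pendant node d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
features = {n: graphlet_features(adj, n) for n in adj}
```

In this toy graph the nodes `a` and `b` receive identical feature vectors while `c` and `d` differ from them, which is the sense in which such counts describe a node's structural role. Feature vectors of this kind could then be compared across identities alongside other attributes.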
