Mining the Best Observational Window to Model Social Phenomena

The structure and behavior of organizations can be learned by mining the event logs of the information systems they manage. This supports numerous applications, such as inferring the structure of social relations, uncovering implicit workflows, and detecting illicit behavior. To date, however, no clear guidelines have been articulated for selecting an appropriate time period over which to perform organizational modeling. This is a significant concern because a poorly chosen period can lead to incorrect models and poor performance in data-driven applications. In this paper, we introduce a data-driven approach to infer the optimal time period for organizational modeling. Our approach 1) represents the system as a social network, 2) decomposes it into its principal components, and 3) optimizes the signal-to-noise ratio over varying temporal observation windows. In doing so, we minimize the variance in the organizational structure while maximizing the strength of its patterns. We assess the capability of this approach in an anomaly detection scenario based on the patterns learned from the interactions documented in audit logs. We investigate the classification performance of two known algorithms over a range of time periods in two representative datasets. First, we use the electronic health record access logs from Northwestern Memorial Hospital to demonstrate that our framework detects a period that coincides with the optimal performance of the anomaly detection algorithms. Second, we assess the generalizability of the framework by analyzing a less clearly defined organization: the social network inferred from the DBLP co-authorship dataset. The results on this dataset further illustrate that our framework can discover the optimal time period for a more loosely organized group.
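The three steps of the approach can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the network construction, the signal-to-noise ratio, or the window search, so everything below is an assumption made for illustration. In particular, the function names (`cooccurrence_network`, `signal_to_noise`, `best_window`), the event format `(time, user, subject)`, the use of windows of the form `[0, t_end)`, and the definition of the ratio as the variance captured by the top `k` principal components divided by the remaining variance are all hypothetical choices.

```python
import numpy as np


def cooccurrence_network(events, n_users, t_end):
    """Step 1 (assumed form): link two users once per subject they both touched before t_end."""
    users_by_subject = {}
    for t, user, subject in events:
        if t < t_end:
            users_by_subject.setdefault(subject, set()).add(user)
    adj = np.zeros((n_users, n_users))
    for users in users_by_subject.values():
        for u in users:
            for v in users:
                if u != v:
                    adj[u, v] += 1.0
    return adj


def signal_to_noise(adj, k=1):
    """Steps 2-3 (assumed form): variance in the top-k principal components over the rest."""
    centered = adj - adj.mean(axis=0)
    cov = centered.T @ centered / max(adj.shape[0] - 1, 1)
    # eigvalsh returns ascending eigenvalues of a symmetric matrix; reverse to descending.
    eigenvalues = np.clip(np.linalg.eigvalsh(cov)[::-1], 0.0, None)
    signal = eigenvalues[:k].sum()
    noise = eigenvalues[k:].sum()
    return signal / (noise + 1e-12)  # small constant avoids division by zero


def best_window(events, n_users, candidate_ends, k=1):
    """Score every candidate window end and return the one with the highest ratio."""
    scores = {
        t_end: signal_to_noise(cooccurrence_network(events, n_users, t_end), k)
        for t_end in candidate_ends
    }
    return max(scores, key=scores.get), scores


# Toy audit log: (time, user id, subject id), e.g. a user accessing a patient record.
events = [
    (0, 0, "p1"), (0, 1, "p1"),
    (1, 2, "p2"), (1, 3, "p2"),
    (2, 0, "p3"), (2, 3, "p3"),
]
best, scores = best_window(events, n_users=4, candidate_ends=[1, 2, 3], k=1)
print(best, scores)
```

On real logs the candidate windows would span days to months, and the choice of `k` would itself need justification; the sketch only shows how a window-selection loop of this kind could be organized.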
