Bayes Linear Methods for Large-Scale Network Search

Consider the problem of searching a large set of items, such as emails, for a small subset that is relevant to a given query. This search can be carried out sequentially, using knowledge from items already screened to choose future items in an informed way. Often the items being searched have an underlying network structure: for example, emails can be related to a network of participants, where an edge in the network indicates a communication between two participants. Recent work by Dimitrov, Kress and Nevo has shown that using information about the network structure, together with a modelling assumption that relevant items and participants are likely to cluster together, can greatly increase the rate at which relevant items are screened. However, their approach is computationally expensive and thus limited in applicability to small networks. Here we show that Bayes Linear methods provide a natural approach to modelling such data; that they output posterior summaries that are most relevant to heuristic policies for choosing future items; and that they can easily be applied to large-scale networks. On both simulated data and data from the Enron Corpus, Bayes Linear approaches are shown to be applicable in situations where the method of Dimitrov et al. is infeasible, and to give substantially better performance than methods that ignore the network structure.
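To make the idea of a Bayes Linear update concrete, the following is a minimal sketch of the standard adjusted expectation and adjusted variance, E_D[X] = E[X] + Cov(X, D) Var(D)^{-1} (D - E[D]) and Var_D[X] = Var(X) - Cov(X, D) Var(D)^{-1} Cov(D, X). It is not the model used in the paper: the function name `bayes_linear_adjust`, the three-participant path network, and the prior mean and covariance values are all assumptions chosen for illustration.

```python
import numpy as np

def bayes_linear_adjust(mean, cov, observed_idx, observed_values):
    """Bayes Linear adjustment of beliefs about X given observed components D.

    Hypothetical helper for illustration; only prior means and covariances
    are required, not a full joint distribution.
    """
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    d = np.asarray(observed_idx)
    z = np.asarray(observed_values, dtype=float)

    cov_xd = cov[:, d]                 # Cov(X, D)
    var_d = cov[np.ix_(d, d)]          # Var(D)
    # gain = Cov(X, D) Var(D)^{-1}, computed via a linear solve
    # (valid because Var(D) is symmetric).
    gain = np.linalg.solve(var_d, cov_xd.T).T

    adj_mean = mean + gain @ (z - mean[d])
    adj_cov = cov - gain @ cov_xd.T
    return adj_mean, adj_cov

# Assumed toy prior: three participants on a path 0 - 1 - 2, each with prior
# relevance 0.1, and covariance that decays with network distance so that
# relevance clusters on the network.
prior_mean = np.array([0.1, 0.1, 0.1])
prior_cov = 0.09 * np.array([[1.00, 0.50, 0.25],
                             [0.50, 1.00, 0.50],
                             [0.25, 0.50, 1.00]])

# Screen an item from participant 0 and find it relevant (value 1).
adj_mean, adj_cov = bayes_linear_adjust(prior_mean, prior_cov, [0], [1.0])
print(adj_mean)          # adjusted expectations for all three participants
print(np.diag(adj_cov))  # adjusted variances
```

In this toy setting, observing a relevant item raises the adjusted expectation of the neighbouring participant more than that of the distant one. The adjusted means and variances are the kind of summary a heuristic selection policy could use, and the cost is dominated by a linear solve in the number of observed quantities.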

[1]  Po-Ling Loh, et al. Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses, 2012, NIPS.

[2]  Aric Hagberg, et al. Exploring Network Structure, Dynamics, and Function using NetworkX, 2008, Proceedings of the Python in Science Conference.

[3]  Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.

[4]  Aurélien Garivier, et al. The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond, 2011, COLT.

[5]  David J. Spiegelhalter, et al. Local computations with probabilities on graphical structures and their application to expert systems, 1990.

[6]  Aurélien Garivier, et al. On Bayesian Upper Confidence Bounds for Bandit Problems, 2012, AISTATS.

[7]  Nir Friedman, et al. Probabilistic Graphical Models - Principles and Techniques, 2009.

[8]  T. Lai. Adaptive treatment allocation and the multi-armed bandit problem, 1987.

[9]  B. M. Golam Kibria. Bayes Linear Statistics: Theory and Methods, 2008.

[10]  Duncan R Ellis. Algorithms for Efficient Intelligence Collection, 2013.

[11]  Isabelle Duyvesteyn, et al. The future of intelligence: challenges in the 21st century, 2014.

[12]  Richard J. Hughbank, et al. Intelligence and Its Role in Protecting Against Terrorism, 2010.

[13]  Yuval Nevo. Information Selection in Intelligence Processing, 2011.

[14]  Leslie Pack Kaelbling, et al. Learning in embedded systems, 1993.

[15]  David S. Leslie, et al. Optimistic Bayesian Sampling in Contextual-Bandit Problems, 2012, J. Mach. Learn. Res.

[16]  Moshe Kress, et al. Finding the needles in the haystack: efficient intelligence processing, 2016, J. Oper. Res. Soc.

[17]  Nevin L. Zhang, et al. A simple approach to Bayesian network computations, 1994.

[18]  Ross D. Shachter, et al. Global Conditioning for Probabilistic Inference in Belief Networks, 1994, UAI.