Non-Uniform Sampling of Fixed Margin Binary Matrices

Data sets in the form of binary matrices are ubiquitous across scientific domains, and researchers are often interested in identifying and quantifying noteworthy structure. One approach is to compare the observed data to that which might be obtained under a null model. Here we consider sampling from the space of binary matrices which satisfy a set of marginal row and column sums. Whereas existing sampling methods have focused on uniform sampling from this space, we introduce modified versions of two elementwise swapping algorithms which sample according to a non-uniform probability distribution defined by a weight matrix, which gives the relative probability of a one for each entry. We demonstrate that values of zero in the weight matrix, i.e. structural zeros, are generally problematic for swapping algorithms, except when they have special monotonic structure. We explore the properties of our algorithms through simulation studies, and illustrate the potential impact of employing a non-uniform null model using a classic bird habitation dataset.
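To make the idea of a weighted elementwise swap concrete, the following is a minimal sketch of one such step, not the algorithms introduced in the paper. It assumes the target distribution places probability proportional to the product of the weights over all entries equal to one, and it uses a Metropolis-style acceptance rule, which is a stand-in technique chosen for illustration. The names `weighted_swap_step`, `A`, and `W` are hypothetical.

```python
import numpy as np


def weighted_swap_step(A, W, rng):
    """Attempt one weighted checkerboard swap on binary matrix A, in place.

    Assumption (not from the source): the target probability of A is
    proportional to the product of W[i, j] over entries where A[i, j] == 1.
    Returns True if a swap was performed.
    """
    n_rows, n_cols = A.shape
    # Propose two distinct rows and two distinct columns uniformly at random.
    i1, i2 = rng.choice(n_rows, size=2, replace=False)
    j1, j2 = rng.choice(n_cols, size=2, replace=False)
    sub = A[np.ix_([i1, i2], [j1, j2])]

    # Only a 2x2 "checkerboard" ([[1,0],[0,1]] or [[0,1],[1,0]]) can be
    # swapped without changing any row or column sum.
    is_checkerboard = (
        sub[0, 0] == sub[1, 1]
        and sub[0, 1] == sub[1, 0]
        and sub[0, 0] != sub[0, 1]
    )
    if not is_checkerboard:
        return False

    diag = W[i1, j1] * W[i2, j2]   # weight of ones on the diagonal
    anti = W[i1, j2] * W[i2, j1]   # weight of ones on the anti-diagonal
    old, new = (diag, anti) if sub[0, 0] == 1 else (anti, diag)

    # Metropolis acceptance with probability min(1, new / old), written
    # without division so that a zero weight simply blocks the move.
    if rng.random() * old < new:
        A[np.ix_([i1, i2], [j1, j2])] = 1 - sub
        return True
    return False
```

With all weights equal, every proposed checkerboard swap is accepted and the step reduces to an ordinary uniform swap. A weight of zero on an entry prevents any swap that would place a one there, which is one way to see why structural zeros can disconnect the state space for swapping algorithms.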
