The Complexity of Three-Way Statistical Tables

Multiway tables with specified marginals arise in a variety of applications in statistics and operations research. We provide a comprehensive complexity classification of three fundamental computational problems on tables: existence, counting, and entry-security. One outcome of our work is that each of the following problems is intractable already for "slim" 3-tables, which have a constant number, 3, of rows: (1) deciding whether a 3-table with specified 2-marginals exists; (2) counting all 3-tables with specified 2-marginals; (3) deciding whether a specified value is attained in a specified entry by at least one of the 3-tables having the same 2-marginals as a given table. This implies that a characterization of feasible marginals for such slim tables, sought by much recent research, is unlikely to exist. Another consequence of our study is a systematic, efficient way of embedding the set of 3-tables satisfying any given 1-marginals and entry upper bounds in a set of slim 3-tables satisfying suitable 2-marginals with no entry bounds. This provides a valuable tool for studying multi-index transportation problems and multi-index transportation polytopes. Remarkably, it enables us to automatically recover a famous example due to Vlach of a "real-feasible integer-infeasible" collection of 2-marginals for 3-tables of smallest possible size (3,4,6).
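To make the three problems concrete, the following is a minimal brute-force sketch in Python. It is not the method of the paper: it enumerates all candidate tables, so its running time is exponential and it is usable only on tiny instances. The function names (`two_marginals`, `tables_with_marginals`, `exists_table`, `count_tables`, `value_attained`) and the nested-list representation of a 3-table are assumptions made for illustration.

```python
from itertools import product


def two_marginals(table):
    """Return the three 2-marginals (line sums) of a 3-table given as nested lists."""
    r, c, h = len(table), len(table[0]), len(table[0][0])
    m_ij = [[sum(table[i][j][k] for k in range(h)) for j in range(c)] for i in range(r)]
    m_ik = [[sum(table[i][j][k] for j in range(c)) for k in range(h)] for i in range(r)]
    m_jk = [[sum(table[i][j][k] for i in range(r)) for k in range(h)] for j in range(c)]
    return m_ij, m_ik, m_jk


def tables_with_marginals(m_ij, m_ik, m_jk):
    """Yield every nonnegative integer 3-table with the given 2-marginals."""
    r, c, h = len(m_ij), len(m_ij[0]), len(m_ik[0])
    cells = list(product(range(r), range(c), range(h)))
    # An entry can never exceed any of the three line sums that contain it.
    bounds = [min(m_ij[i][j], m_ik[i][k], m_jk[j][k]) for i, j, k in cells]
    for values in product(*(range(b + 1) for b in bounds)):
        table = [[[0] * h for _ in range(c)] for _ in range(r)]
        for (i, j, k), v in zip(cells, values):
            table[i][j][k] = v
        if two_marginals(table) == (m_ij, m_ik, m_jk):
            yield table


def exists_table(marginals):
    """Problem (1): does any 3-table have these 2-marginals?"""
    return next(tables_with_marginals(*marginals), None) is not None


def count_tables(marginals):
    """Problem (2): how many 3-tables have these 2-marginals?"""
    return sum(1 for _ in tables_with_marginals(*marginals))


def value_attained(marginals, cell, value):
    """Problem (3): does some table with these 2-marginals hold `value` at `cell`?"""
    i, j, k = cell
    return any(t[i][j][k] == value for t in tables_with_marginals(*marginals))


# Tiny 2 x 2 x 2 instance: the 2-marginals of the all-ones table.
ones = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
margins = two_marginals(ones)
print(exists_table(margins))                  # True
print(count_tables(margins))                  # 3
print(value_attained(margins, (0, 0, 0), 2))  # True
```

For this 2 x 2 x 2 instance the tables form a one-parameter family, so exactly three tables share the marginals and the entry at (0, 0, 0) can be 0, 1, or 2. The hardness results above indicate that no comparably simple approach is expected to scale, even when one side of the table is fixed at 3.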

[1] Milan Vlach, et al. Conditions for the existence of solutions of the three-dimensional planar transportation problem, 1986, Discret. Appl. Math.

[2] Jesús A. De Loera, et al. Algebraic unimodular counting, 2001, Math. Program.

[3] David S. Johnson, et al. Computers and Intractability: A Guide to the Theory of NP-Completeness, 1978.

[4] Mark Jerrum, et al. Three-Dimensional Statistical Data Security Problems, 1994, SIAM J. Comput.

[5] Alexander I. Barvinok, et al. A polynomial time algorithm for counting integral points in polyhedra when the dimension is fixed, 1993, Proceedings of 1993 IEEE 34th Annual Foundations of Computer Science.

[6] Seth Sullivant, et al. Gröbner Bases and Polyhedral Geometry of Reducible and Cyclic Models, 2002, J. Comb. Theory, Ser. A.

[7] Ramayya Krishnan, et al. Disclosure Limitation Methods and Information Loss for Tabular Data, 2001.

[8] S. E. Fienberg, et al. Bounds for cell entries in contingency tables given marginal totals and decomposable graphs, 2000.

[9] P. Doyle, et al. Confidentiality, Disclosure and Data Access: Theory and Practical Applications for Statistical Agencies, 2001.

[10] Leonard J. Schulman, et al. The Vector Partition Problem for Convex Objective Functions, 2001, Math. Oper. Res.

[11] Martin E. Dyer, et al. Sampling contingency tables, 1997.

[12] P. Diaconis, et al. Testing for independence in a two-way table, 1985.

[13] Dan Gusfield, et al. A little knowledge goes a long way: faster detection of compromised data in 2-D tables, 1990, Proceedings. 1990 IEEE Computer Society Symposium on Research in Security and Privacy.

[14] L. Cox. On properties of multi-dimensional statistical tables, 2003.

[15] Dan Gusfield, et al. A Graph Theoretic Approach to Statistical Data Security, 1988, SIAM J. Comput.

[16] V. A. Yemelichev, et al. Polytopes, Graphs and Optimisation, 1984.

[17] R. W. Burgess, et al. BUREAU OF THE CENSUS, 1992.

[18] Josep Domingo-Ferrer, et al. Inference Control in Statistical Databases, From Theory to Practice, 2002.

[19] Noga Alon, et al. Separable Partitions, 1999, Discret. Appl. Math.

[20] J. Humphreys. Polytopes, Graphs and Optimisation, 2022.

[21] P. Diaconis, et al. Rectangular Arrays with Fixed Margins, 1995.

[22] Nitin R. Patel, et al. A Network Algorithm for Performing Fisher's Exact Test in r × c Contingency Tables, 1983.

[23] Uriel G. Rothblum, et al. Convex Combinatorial Optimization, 2003, Discret. Comput. Geom.

[24] Uriel G. Rothblum, et al. A Polynomial Time Algorithm for Shaped Partition Problems, 1999, SIAM J. Optim.

[25] Leslie G. Valiant, et al. The Complexity of Computing the Permanent, 1979, Theor. Comput. Sci.

[26] Hendrik W. Lenstra, et al. Integer Programming with a Fixed Number of Variables, 1983, Math. Oper. Res.

[27] S. Aviran, et al. Momentopes, the Complexity of Vector Partitioning, and Davenport-Schinzel Sequences, 2002, Discret. Comput. Geom.