Mismatched Guesswork and One-to-One Codes

We study the problem of mismatched guesswork, where we evaluate the number of symbols $y \in \mathcal{Y}$ which have higher likelihood than $X \sim \mu$ according to a mismatched distribution ν. We discuss the role of the tilted/exponential families of the source distribution μ and of the mismatched distribution ν. We show that the value of guesswork can be characterized using the tilted family of the mismatched distribution ν, while the probability of guessing is characterized by an exponential family which passes through μ. Using this characterization, we demonstrate that the mismatched guesswork follows a large deviation principle (LDP), where the rate function is described implicitly using information-theoretic quantities. We apply these results to one-to-one source coding (without the prefix-free constraint) to obtain the cost of mismatch in terms of average codeword length. We show that the cost of mismatch in one-to-one codes is no larger than that of prefix-free codes, i.e., $D(\mu\Vert\nu)$. Further, the cost of mismatch vanishes if and only if ν lies on the tilted family of the true distribution μ, which is in stark contrast to prefix-free codes. These results imply that one-to-one codes are inherently more robust to mismatch.
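To make the quantities in the abstract concrete, the following is a minimal single-symbol sketch, not the paper's method. It assumes a finite alphabet, ties broken by symbol index, and a one-to-one code that gives the i-th guess the i-th shortest binary string (length $\lfloor \log_2 i \rfloor$, empty string included). The function names, the example distributions, and the tilt parameter `beta` are all illustrative assumptions; the paper's results presumably concern sequences and asymptotics, which this toy example does not capture.

```python
import math

def guesswork_ranks(nu):
    """1-based rank of each symbol when guessing in decreasing order of nu.
    Ties are broken by symbol index (an assumption of this sketch)."""
    order = sorted(range(len(nu)), key=lambda i: (-nu[i], i))
    ranks = [0] * len(nu)
    for position, symbol in enumerate(order, start=1):
        ranks[symbol] = position
    return ranks

def expected_guesswork(mu, nu):
    """Average number of guesses when X ~ mu but the guesser orders by nu."""
    return sum(p * r for p, r in zip(mu, guesswork_ranks(nu)))

def one_to_one_length(mu, nu):
    """Average codeword length (bits) under mu when the i-th most likely
    symbol under nu receives a string of length floor(log2(i))."""
    # For r >= 1, r.bit_length() - 1 equals floor(log2(r)).
    return sum(p * (r.bit_length() - 1) for p, r in zip(mu, guesswork_ranks(nu)))

def kl_divergence_bits(mu, nu):
    """D(mu || nu) in bits; assumes nu > 0 wherever mu > 0."""
    return sum(p * math.log2(p / q) for p, q in zip(mu, nu) if p > 0)

def tilt(mu, beta):
    """Member of the tilted family of mu: proportional to mu ** beta."""
    weights = [p ** beta for p in mu]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example distributions, chosen only for illustration.
mu = [0.5, 0.25, 0.15, 0.10]   # true source
nu = [0.1, 0.2, 0.3, 0.4]      # mismatched model (reverses the ordering)

matched_length = one_to_one_length(mu, mu)
mismatch_cost = one_to_one_length(mu, nu) - matched_length
tilted_cost = one_to_one_length(mu, tilt(mu, 0.5)) - matched_length

print("expected guesswork (matched):   ", expected_guesswork(mu, mu))
print("expected guesswork (mismatched):", expected_guesswork(mu, nu))
print("one-to-one mismatch cost (bits):", mismatch_cost)
print("D(mu || nu) (bits):             ", kl_divergence_bits(mu, nu))
print("cost when nu is a tilt of mu:   ", tilted_cost)
```

In this sketch a tilt with `beta > 0` preserves the likelihood ordering of μ, so the guessing order and hence the one-to-one codeword assignment are unchanged. This is the single-symbol intuition behind the statement that the cost of mismatch vanishes on the tilted family, whereas a prefix-free code designed for ν pays $D(\mu\Vert\nu)$ whenever ν differs from μ.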
