Improving the Identifiability of Neural Networks for Bayesian Inference

Accurate inference of the parameters in the highly complex and multi-modal likelihoods of neural networks (NNs) is difficult for any inference algorithm. In part, this challenge is caused by the significant over-parameterization of the model, which yields many equivalent solutions and thus a model unidentifiability problem. In this paper, we explore the unidentifiability problem for NNs as it manifests in two ways: arbitrary permutations of the hidden nodes, which we denote as weight-space symmetry, and arbitrary scaling under rectified linear-unit (ReLU) nonlinearities, which we denote as scaling symmetry. We show how these unidentifiabilities pose issues for both Markov Chain Monte Carlo (MCMC) and Variational Inference (VI). Finally, we introduce two reparameterizations of the model in the form of parameter constraints and prove that they resolve the aforementioned unidentifiability issues; we demonstrate their effect in experiments and offer implementations in the form of coordinate transforms.
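
The two symmetries can be verified numerically. The following is a minimal sketch, not taken from the paper, assuming a one-hidden-layer ReLU network with illustrative shapes and variable names: permuting the hidden units and rescaling a unit's incoming/outgoing weights both leave the network function unchanged.

```python
# Illustrative sketch (assumed setup, not the paper's code): a one-hidden-layer
# ReLU network f(x) = W2 @ relu(W1 @ x + b1) + b2 with 3 inputs, 4 hidden units.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)   # hidden layer
W2, b2 = rng.normal(size=(1, 4)), rng.normal(size=1)   # output layer
x = rng.normal(size=3)

relu = lambda z: np.maximum(z, 0.0)
f = lambda W1, b1, W2, b2: W2 @ relu(W1 @ x + b1) + b2

# Weight-space symmetry: permuting the hidden units (rows of W1 and b1,
# columns of W2) leaves the output unchanged.
perm = rng.permutation(4)
out_permuted = f(W1[perm], b1[perm], W2[:, perm], b2)

# Scaling symmetry: for c > 0, relu(c*z) = c*relu(z), so scaling one unit's
# incoming weights by c and its outgoing weights by 1/c is invisible.
c = 2.5
W1s, b1s, W2s = W1.copy(), b1.copy(), W2.copy()
W1s[0] *= c; b1s[0] *= c; W2s[:, 0] /= c

assert np.allclose(f(W1, b1, W2, b2), out_permuted)
assert np.allclose(f(W1, b1, W2, b2), f(W1s, b1s, W2s, b2))
```

Because distinct parameter settings map to the identical function, the likelihood is constant along these permutation and scaling orbits, which is precisely the unidentifiability that complicates MCMC and VI.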