B-Spline PDF: A Generalization of Histograms to Continuous Density Models for Generative Audio Networks

Many modern neural networks use histograms to efficiently model continuous random variables. Their widespread use suggests that the parameter space of the multinomial distribution is well suited to training large neural networks. In applications like generative audio networks, however, this approach introduces audible quantization noise into the generated signal. This work presents a novel probability density function (PDF), referred to as B-Spline PDF, that is a direct generalization of histograms to continuous densities while retaining the multinomial parameter space. The model uses $k$-th order B-Splines to ensure continuity up to the $(k-1)$-th derivative. B-Spline PDF is amenable to neural network training via closed-form gradients that are easy and fast to compute. For other applications, one may use a novel algorithm, referred to as the Expectation algorithm, to efficiently estimate the model parameters. Further, a novel sample generation algorithm is derived that is fast and simple. The theoretical results, coupled with illustrative examples, suggest that B-Spline PDF may directly replace histograms in many related applications.
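
To make the idea concrete, the following is a minimal sketch of one way a histogram can be generalized with B-splines. It is not the paper's implementation. It assumes uniform knots with spacing `h` starting at `lo`, and a density formed as a mixture of shifted, unit-area cardinal B-splines whose mixture weights are the multinomial probabilities. The function names (`cardinal_bspline`, `bspline_pdf`, `sample_bspline_pdf`) and the order convention (`k = 1` is piecewise constant) are hypothetical choices for this illustration. The sampler relies on the standard fact that a unit-area cardinal B-spline with `k` pieces is the density of a sum of `k` independent uniform variables; the paper's own sampling algorithm may differ.

```python
import numpy as np

def cardinal_bspline(u, k):
    """Unit-area cardinal B-spline with k polynomial pieces, supported on [0, k)."""
    u = np.asarray(u, dtype=float)
    if k == 1:
        # Piecewise-constant case: the indicator of one histogram bin.
        return ((u >= 0.0) & (u < 1.0)).astype(float)
    # Cox-de Boor recursion specialised to uniform integer knots.
    return (u * cardinal_bspline(u, k - 1)
            + (k - u) * cardinal_bspline(u - 1.0, k - 1)) / (k - 1)

def bspline_pdf(x, weights, k, lo=0.0, h=1.0):
    """Density sum_i w_i * B_k((x - lo) / h - i) / h (assumed parameterization)."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Column i holds the argument of the i-th shifted basis function.
    u = (x[..., None] - lo) / h - np.arange(len(weights))
    return (cardinal_bspline(u, k) * weights).sum(axis=-1) / h

def sample_bspline_pdf(weights, k, n, lo=0.0, h=1.0, rng=None):
    """Draw n samples: pick a component, then add a sum of k uniforms."""
    rng = np.random.default_rng() if rng is None else rng
    weights = np.asarray(weights, dtype=float)
    idx = rng.choice(len(weights), size=n, p=weights)  # multinomial draw
    offset = rng.random((n, k)).sum(axis=1)            # Irwin-Hall variate
    return lo + h * (idx + offset)

if __name__ == "__main__":
    w = np.array([0.2, 0.5, 0.3])
    # With k = 1 the model reduces to an ordinary histogram.
    print(bspline_pdf(np.array([0.5, 1.5, 2.5]), w, k=1))
    # With k = 3 the same weights give a smooth density.
    print(bspline_pdf(np.array([1.0, 2.0, 3.0]), w, k=3))
    print(sample_bspline_pdf(w, k=3, n=5, rng=np.random.default_rng(0)))
```

Under these assumptions the density is linear in the weights, so its gradient with respect to each weight is simply the corresponding basis function evaluated at the data point. This is consistent with the closed-form gradients mentioned above, although the exact form used in the paper is not given here.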
