Invariance and stability of Gabor scattering for music signals

A feature extractor based on Gabor frames and Mallat's scattering transform is introduced. The resulting Gabor scattering is applied to a simple model for audio signals in order to study its invariance properties and deformation stability. In particular, it is shown that different layers create invariance to certain signal features. The decoupling technique previously used to study deformation stability of scattering transforms for cartoon functions is then applied to determine to what extent the feature extractor is robust to changes in spectral shape and frequency modulation. The results are illustrated by numerical examples.
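
As a rough illustration of the construction described above, the following minimal sketch stacks two Gabor-modulus layers. It is not the authors' implementation. The function names (`gabor_modulus`, `gabor_scattering`), the truncated Gaussian window, the window and hop sizes, and the use of a plain time average in place of a low-pass output filter are all assumptions made for this example.

```python
import numpy as np

def gabor_modulus(x, win_len, hop):
    """Modulus of a windowed Fourier (Gabor) transform, shape (freq, time).

    Assumption: a truncated Gaussian window with sigma = win_len / 6.
    """
    n = np.arange(win_len)
    sigma = win_len / 6.0
    window = np.exp(-0.5 * ((n - (win_len - 1) / 2.0) / sigma) ** 2)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T

def gabor_scattering(x, win1=256, hop1=64, win2=16, hop2=4):
    """Two-layer sketch: time-averaged outputs of layer 1 and layer 2."""
    # Layer 1: spectrogram-like magnitudes of the input signal.
    layer1 = gabor_modulus(x, win1, hop1)
    # Layer 2: Gabor modulus of each layer-1 frequency channel over time.
    layer2 = np.stack([gabor_modulus(ch, win2, hop2) for ch in layer1])
    # Assumption: a global time average stands in for the low-pass output filter.
    out1 = layer1.mean(axis=1)   # shape (freq1,)
    out2 = layer2.mean(axis=2)   # shape (freq1, freq2)
    return out1, out2

# Usage: a stationary 1 kHz tone sampled at 8 kHz (hypothetical test signal).
sr = 8000
t = np.arange(4096)
x = np.sin(2 * np.pi * 1000 * t / sr)
out1, out2 = gabor_scattering(x)
print(out1.shape, out2.shape)
print("strongest layer-1 bin:", int(np.argmax(out1)))
```

For a stationary tone the layer-1 channel at the tone frequency is nearly constant in time, so its layer-2 energy concentrates at the lowest modulation frequency. Signals with amplitude or frequency modulation would spread energy to higher layer-2 bins.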
