High clarity speech separation using synchro extracting transform

In an era of ever-improving communication technologies, many algorithms have been developed to recover intelligible, listenable speech signals from their mixtures without any prior information about the signals being separated. The Degenerate Unmixing Estimation Technique (DUET) is a Blind Source Separation (BSS) method that is well suited to underdetermined conditions, in which the number of sources exceeds the number of mixtures. The estimation of mixing parameters, which is the crucial part of the DUET algorithm, is built on the sparseness of speech signals in the Time Frequency (TF) domain. DUET therefore depends heavily on the clarity of the Time Frequency Representation (TFR), and any interference terms in the TF plane degrade its performance. The conventional DUET algorithm uses the Short Time Fourier Transform (STFT) to convert speech signals to the TF domain. The STFT is inherently limited in the sharpness of the TFR it can provide, and this limitation worsens with noise contamination. This paper presents a post-processing method based on the Synchro Squeezed Transform (SST) and Synchro Extracting Transform (SET) to improve the TF resolution of the DUET method. The efficiency of these methods is evaluated both qualitatively and quantitatively, by visual inspection, the Rényi entropy of the TFR, and objective measures of the speech signals. The results indicate that the sharper TFR provided by these transforms improves signal reconstruction and robustness to noise, which in turn improves the clarity of the reconstructed signal.
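
The Rényi entropy mentioned above as a quantitative measure of TFR sharpness can be illustrated with a minimal sketch. The function name `renyi_entropy`, the order `alpha = 3`, and the normalisation of the squared magnitude to unit sum are assumptions made for illustration; the abstract does not state which order or normalisation the paper uses. The two toy arrays are hypothetical stand-ins for a concentrated and a smeared TFR, not data from the paper.

```python
import numpy as np


def renyi_entropy(tfr, alpha=3.0):
    """Renyi entropy (in bits) of a time-frequency representation.

    The squared magnitude of `tfr` is normalised to unit sum and treated
    as a distribution over TF cells. A lower value indicates that the
    energy is concentrated in fewer cells, i.e. a sharper TFR.
    """
    if alpha == 1.0:
        raise ValueError("alpha must differ from 1 (that limit is Shannon entropy)")
    energy = np.abs(np.asarray(tfr)) ** 2
    total = energy.sum()
    if total == 0:
        raise ValueError("TFR contains no energy")
    p = energy / total
    return float(np.log2(np.sum(p ** alpha)) / (1.0 - alpha))


# Toy TFRs with 64 frequency bins and 32 time frames (hypothetical data).
sharp = np.zeros((64, 32))
sharp[10, :] = 1.0        # a single tone occupying one bin per frame

blurred = np.zeros((64, 32))
blurred[8:13, :] = 1.0    # the same tone smeared over five bins per frame

print("sharp TFR  :", renyi_entropy(sharp))
print("blurred TFR:", renyi_entropy(blurred))
```

Under these assumptions the smeared representation yields the larger entropy, which is the sense in which a lower Rényi entropy is read as a sharper TFR.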
