Paper Information - Audio Signal Processing Using Time-Frequency Approaches: Coding, Classification, Fingerprinting, and Watermarking

Audio signals are information-rich nonstationary signals that play an important role in our day-to-day communication, perception of the environment, and entertainment. Because of their nonstationary nature, time-only or frequency-only approaches are inadequate for analyzing these signals; a joint time-frequency (TF) approach is a better choice for processing them efficiently. In this digital era, compression, intelligent indexing for content-based retrieval, classification, and protection of digital audio content are a few of the areas that encompass the majority of audio signal processing applications. In this paper, we present a comprehensive array of TF methodologies that successfully address applications in all of the above-mentioned areas. A TF-based audio coding scheme with a novel psychoacoustic model, music classification, audio classification of environmental sounds, audio fingerprinting, and audio watermarking are presented to demonstrate the advantages of using time-frequency approaches in analyzing and extracting information from audio signals.
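The claim that frequency-only analysis is inadequate for nonstationary signals can be illustrated with a small sketch. It is not taken from the paper: it uses a plain short-time Fourier transform as the simplest stand-in for a TF representation, and the function names, sampling rate, chirp range, and frame sizes are all assumptions chosen for illustration. A rising chirp and its time reversal have the same global magnitude spectrum, yet a frame-by-frame view separates them immediately.

```python
import numpy as np

def stft_magnitude(x, frame_len=256, hop=128):
    """Magnitude of a short-time Fourier transform with a Hann window."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, frame_len // 2 + 1)

def peak_frequency_track(x, fs, frame_len=256, hop=128):
    """Dominant frequency (Hz) in each frame: a crude TF feature."""
    mag = stft_magnitude(x, frame_len, hop)
    return mag.argmax(axis=1) * fs / frame_len

# Illustrative test signals (assumed values, not from the paper).
fs = 8000                                  # sampling rate in Hz
t = np.arange(fs) / fs                     # one second of samples
# Linear chirp sweeping from 500 Hz to 3000 Hz (sweep rate 2500 Hz/s).
up_chirp = np.sin(2 * np.pi * (500 * t + 0.5 * 2500 * t ** 2))
down_chirp = up_chirp[::-1]                # same samples, reversed in time

# Frequency-only view: the magnitude spectra are indistinguishable.
spectra_match = np.allclose(np.abs(np.fft.rfft(up_chirp)),
                            np.abs(np.fft.rfft(down_chirp)))

# Joint time-frequency view: the two signals move in opposite directions.
up_track = peak_frequency_track(up_chirp, fs)
down_track = peak_frequency_track(down_chirp, fs)

print("global magnitude spectra identical:", spectra_match)
print("up-chirp   first/last frame peak (Hz):", up_track[0], up_track[-1])
print("down-chirp first/last frame peak (Hz):", down_track[0], down_track[-1])
```

The fixed-window STFT is used here only because it is easy to follow; adaptive TF methods trade this simplicity for better localization.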
