Building a personalized audio equalizer interface with transfer learning and active learning

Potential users of audio production tools, such as equalizers, may be discouraged by the complexity of typical interfaces and their lack of clear affordances. In this work, we create a personalized on-screen slider that lets the user control an equalizer in terms of a descriptive term (e.g., "warm"). The system learns the mapping by presenting a sequence of sounds to the user and correlating the gain in each frequency band with the user's preference ratings. We extend and improve this method by incorporating knowledge from a database of concepts previously taught to the system by earlier users, using a combination of active learning and simple transfer learning. Results from a study of 35 participants show that a personalized audio manipulation tool can be built with 10 times fewer interactions than is possible with the baseline approach.
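The baseline idea of correlating per-band gain with user ratings can be illustrated with a minimal sketch. It is not the authors' implementation: the use of Pearson correlation, the function names (`pearson`, `learn_descriptor_curve`, `slider_to_gains`), the four-band equalizer, the 12 dB scaling, and the simulated listener are all assumptions made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def learn_descriptor_curve(eq_settings, ratings):
    """Return one weight per band: the correlation between that band's gain
    across the rated examples and the user's ratings."""
    n_bands = len(eq_settings[0])
    return [pearson([s[b] for s in eq_settings], ratings) for b in range(n_bands)]


def slider_to_gains(curve, slider, max_gain_db=12.0):
    """Map a slider position in [-1, 1] to per-band gains (dB) by scaling the curve.
    The linear scaling and the 12 dB range are assumptions of this sketch."""
    return [slider * max_gain_db * w for w in curve]


# Hypothetical session: 40 random EQ settings over 4 bands, rated by a simulated
# listener who likes a boost in band 0 and a cut in band 3.
rng = random.Random(0)
eq_settings = [[rng.uniform(-12.0, 12.0) for _ in range(4)] for _ in range(40)]
ratings = [s[0] - s[3] for s in eq_settings]

curve = learn_descriptor_curve(eq_settings, ratings)
gains = slider_to_gains(curve, slider=0.5)
```

In this sketch, moving the slider toward +1 applies more of the learned curve and moving it toward -1 applies its inverse. Every example requires a rating from the user, which is the cost that the active learning and transfer learning steps aim to reduce.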
