MVSS: Michigan Visual Sonification System

Visual sonification is the process of converting the visual properties of objects into sound signals. This paper describes the Michigan Visual Sonification System (MVSS), which uses this process to help visually impaired users distinguish between objects in their surroundings. MVSS first uses depth information to segment and localize salient objects, then represents each object's appearance as a histogram of visual features. A dictionary of invariant visual features (or words) is created in an a priori offline learning phase using Bag-of-Words modeling. The histogram of a segmented object is then converted to a sound signal whose volume and 3D placement are determined by the position of the object relative to the user. The system then relies on the considerable discriminating power of the human brain to localize and “classify” the sound, thus enabling the user to distinguish between visually distinct object classes. The paper details each component of MVSS and presents promising initial experimental results.
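The histogram-to-sound step can be illustrated with a minimal Python sketch. It is not the MVSS implementation: the abstract does not specify how histogram bins map to sound, so the one-tone-per-word mapping, the linear distance-to-volume falloff, the constant-power stereo pan (a simple stand-in for full 3D placement), and every name and parameter below (`bow_histogram`, `sonify`, `base_hz`, `step_hz`, `max_range_m`) are assumptions made for illustration only.

```python
import math

def bow_histogram(descriptors, dictionary):
    """Assign each feature descriptor to its nearest visual word
    and return the normalized word-count histogram."""
    counts = [0] * len(dictionary)
    for d in descriptors:
        nearest = min(range(len(dictionary)),
                      key=lambda k: math.dist(d, dictionary[k]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)

def sonify(hist, distance_m, azimuth_rad, sample_rate=8000,
           duration_s=0.25, base_hz=220.0, step_hz=110.0, max_range_m=5.0):
    """Hypothetical mapping (an assumption, not the paper's method):
    word k becomes a sine tone at base_hz + k * step_hz, weighted by hist[k].
    Volume falls off linearly with distance; azimuth in [-pi/2, pi/2]
    (left to right) sets a constant-power stereo pan."""
    gain = max(0.0, 1.0 - distance_m / max_range_m)
    pan = min(max((azimuth_rad + math.pi / 2) / 2, 0.0), math.pi / 2)
    left_gain, right_gain = gain * math.cos(pan), gain * math.sin(pan)
    samples = []
    for i in range(int(sample_rate * duration_s)):
        t = i / sample_rate
        s = sum(w * math.sin(2 * math.pi * (base_hz + k * step_hz) * t)
                for k, w in enumerate(hist))
        samples.append((left_gain * s, right_gain * s))
    return samples

# Toy example: a two-word dictionary and three 2-D descriptors.
dictionary = [(0.0, 0.0), (1.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.0), (1.0, 1.2)]
hist = bow_histogram(descriptors, dictionary)
# Object 2.5 m away, directly to the user's right.
stereo = sonify(hist, distance_m=2.5, azimuth_rad=math.pi / 2)
```

In this sketch, two objects with different feature histograms produce different tone mixtures, while distance and direction change only the loudness and the left/right balance. That separation mirrors the division of roles the abstract describes between an object's appearance and its position.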
