Spatial audio reproduction system for VR using 360 cameras

To maximise immersion in VR environments, plausible spatial audio reproduction that is synchronised with the visual information is essential. In this work, we propose a pipeline to create plausible interactive audio from a pair of 360° cameras.

1 PROPOSED SYSTEM

Fig. 1 shows the proposed pipeline to reconstruct an acoustic VR room from a pair of 360° camera images. The full surrounding scene is captured by a pair of vertically aligned 360° cameras at two different heights. The images are aligned to the room coordinate axes by Manhattan world alignment using façade alignment techniques, and the depth of the scene is estimated by dense correspondence matching between the aligned top and bottom images [3]. In addition, the top image is used for semantic scene segmentation and object recognition using SegNet [2]. An object-labelled cuboid structure is reconstructed from the depth and semantic segmentation results. Acoustic properties are assigned to the objects from the acoustic material list in the Google Resonance Audio package for Unity. Finally, the acoustic VR scene is rendered by setting sound source and player models on the VR platform.

*This work was supported by the EPSRC Programme Grant S3A: Future Spatial Audio for an Immersive Listener Experience at Home (EP/L000539/1) and the BBC as part of the BBC Audio Research Partnership. Details about the data underlying this work, along with the terms for data access, are available from: http://dx.doi.org/10.15126/surreydata.00812228.

Figure 2: Interactive VR rendering with spatial audio

[Figure panels: MR, UL, LR and S1, each shown as measured (Meas.) and (Syn.); axes: Time (s) and Frequency (kHz)]
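The depth estimation step in Section 1 relies on the two cameras being vertically aligned, so that a matched scene point appears at the same azimuth in both images and differs only in elevation. The following is a minimal sketch of how one such match could be triangulated. It is not the method of [3]. It assumes equirectangular images in which row 0 is the zenith and the last row is the nadir, and the function names (`row_to_elevation`, `triangulate`, `depth_from_vertical_pair`) and the `baseline` parameter (vertical distance between the two camera centres) are hypothetical, introduced only for illustration.

```python
import math


def row_to_elevation(row, height):
    """Elevation angle in radians (positive = up) of a pixel-row centre.

    Assumes an equirectangular image: row 0 at the zenith, row height-1 at the nadir.
    """
    return math.pi / 2 - math.pi * (row + 0.5) / height


def triangulate(theta_top, theta_bottom, baseline):
    """Triangulate a point seen at elevation theta_top from the upper camera
    and theta_bottom from the lower camera (same azimuth in both views).

    Returns (horizontal_distance, range_from_top_camera), in the units of baseline.
    """
    disparity = theta_bottom - theta_top  # angular disparity, > 0 for finite points
    if disparity <= 1e-9:
        # No measurable disparity: point at infinity or on the vertical axis.
        return math.inf, math.inf
    # Law of sines in the triangle (top camera, bottom camera, scene point).
    range_top = baseline * math.cos(theta_bottom) / math.sin(disparity)
    horizontal = range_top * math.cos(theta_top)
    return horizontal, range_top


def depth_from_vertical_pair(row_top, row_bottom, height, baseline):
    """Depth for one matched pixel pair given its row in each image."""
    theta_top = row_to_elevation(row_top, height)
    theta_bottom = row_to_elevation(row_bottom, height)
    return triangulate(theta_top, theta_bottom, baseline)


if __name__ == "__main__":
    # Synthetic check: point 2.0 m away, 0.5 m above the lower camera, 0.3 m baseline.
    b = 0.3
    theta_b = math.atan2(0.5, 2.0)
    theta_t = math.atan2(0.5 - b, 2.0)
    print(triangulate(theta_t, theta_b, b))
```

The angular disparity shrinks towards the zenith and nadir, so depth estimates directly above and below the cameras are poorly conditioned in this formulation; a dense matcher would typically need to treat those regions separately.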

[1] Jont B. Allen, et al. Image method for efficiently simulating small-room acoustics, 1976.

[2] Roberto Cipolla, et al. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] Adrian Hilton, et al. Room Layout Estimation with Object and Material Attributes Information Using a Spherical Camera, 2016, 2016 Fourth International Conference on 3D Vision (3DV).