Paper information - Erratum to: New wireless connection between user and VE using speech processing

Erratum to: New wireless connection between user and VE using speech processing

This paper presents a novel speak-to-VR virtual-reality peripheral network (VRPN) server based on speech processing. The server uses a microphone array as its speech source and streams the processing results over a Wi-Fi network. The proposed VRPN server provides a handy, portable and wireless human-machine interface that can facilitate interaction in a variety of interfaces and application domains, including HMD- and CAVE-based virtual reality systems, flight and driving simulators, and many others. The VRPN server is built on a speech processing software development kit and the VRPN library in C++. Speak-to-VR VRPN works well even in the presence of background noise or the voices of other users in the vicinity. The speech processing algorithm is not sensitive to the user’s accent because it is trained while it is operating: the speech recognition parameters are trained by a hidden Markov model in real time. The advantages and disadvantages of the speak-to-VR server are studied under different configurations. The efficiency and precision of the speak-to-VR server in a real application are then validated through a formal user study with ten participants. Two experimental test setups are implemented on a CAVE system, using either a Kinect Xbox or a microphone array as the input device. Each participant is asked to navigate in a virtual environment and manipulate an object. The analysis of the experimental data shows promising results and motivates additional research opportunities.

James H. Oliver | M. Ali Mirzaei | Frédéric Mérienne | Jean-Rémy Chardonnet
