Random forests-based 2D-to-3D video conversion

An efficient 2D-to-3D video conversion method based on the Random Forests (RF) machine learning algorithm is proposed. Our approach incorporates multiple monocular depth cues that reflect the characteristics of human visual depth perception (such as texture variation, motion parallax, haze, perspective, occlusion, sharpness, and the vertical coordinate of image pixels) in order to estimate the depth map of the recorded scene. Performance evaluations show that our RF-based approach outperforms a state-of-the-art motion parallax-based technique by providing more realistic depth information for the scene. Moreover, a subjective comparison of the results (obtained by viewers watching the generated stereo video sequences on a 3D display system) confirms the higher 3D picture quality achieved by our RF-based method.
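The general idea of regressing depth from per-pixel monocular cues with a random forest can be illustrated with a minimal sketch. It is not the implementation evaluated here: the training data are fabricated, only three of the listed cues are used (vertical coordinate, texture variation, sharpness), and the function names (`make_toy_data`, `build_tree`, `train_forest`, `predict_forest`) as well as the toy depth relation are assumptions introduced purely for illustration.

```python
import random


def make_toy_data(n, rng):
    """Fabricated samples: [vertical_coordinate, texture_variation, sharpness] -> depth."""
    rows, depths = [], []
    for _ in range(n):
        y = rng.random()        # 0 = top of the frame, 1 = bottom
        texture = rng.random()  # placeholder texture-variation cue
        sharp = rng.random()    # placeholder sharpness cue
        # Toy assumption: higher and blurrier pixels are farther away.
        depth = 0.7 * (1.0 - y) + 0.3 * (1.0 - sharp) + rng.gauss(0.0, 0.02)
        rows.append([y, texture, sharp])
        depths.append(depth)
    return rows, depths


def _mean(values):
    return sum(values) / len(values)


def build_tree(rows, depths, rng, level=0, max_level=4, n_feats=2):
    """Grow one regression tree; each node tests a random subset of the cues."""
    if level >= max_level or len(rows) < 4:
        return _mean(depths)  # leaf: average depth of the samples that reach it
    best = None
    for f in rng.sample(range(len(rows[0])), n_feats):
        # Skipping the smallest value keeps both sides of every split non-empty.
        for thr in sorted({r[f] for r in rows})[1:]:
            left = [d for r, d in zip(rows, depths) if r[f] < thr]
            right = [d for r, d in zip(rows, depths) if r[f] >= thr]
            ml, mr = _mean(left), _mean(right)
            sse = sum((d - ml) ** 2 for d in left) + sum((d - mr) ** 2 for d in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr)
    if best is None:  # no usable split among the sampled cues
        return _mean(depths)
    _, f, thr = best
    lo = [(r, d) for r, d in zip(rows, depths) if r[f] < thr]
    hi = [(r, d) for r, d in zip(rows, depths) if r[f] >= thr]
    return (f, thr,
            build_tree([r for r, _ in lo], [d for _, d in lo], rng, level + 1, max_level, n_feats),
            build_tree([r for r, _ in hi], [d for _, d in hi], rng, level + 1, max_level, n_feats))


def train_forest(rows, depths, n_trees=15, seed=0):
    """Train each tree on a bootstrap resample of the training pixels."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([rows[i] for i in idx], [depths[i] for i in idx], rng))
    return forest


def predict_tree(node, x):
    while isinstance(node, tuple):  # internal nodes are (feature, threshold, left, right)
        f, thr, left, right = node
        node = left if x[f] < thr else right
    return node


def predict_forest(forest, x):
    """Forest estimate: the average of the individual tree predictions."""
    return _mean([predict_tree(tree, x) for tree in forest])


if __name__ == "__main__":
    rows, depths = make_toy_data(150, random.Random(1))
    forest = train_forest(rows, depths)
    print("top-of-frame pixel:   ", predict_forest(forest, [0.0, 0.5, 0.5]))
    print("bottom-of-frame pixel:", predict_forest(forest, [1.0, 0.5, 0.5]))
```

In a full pipeline, the feature vector of each pixel would hold all of the cues named above, and the predicted depth map would then be used to render the second view of the stereo pair; both steps are outside the scope of this sketch.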