A Speech Recognition System Based on a New Endpoint Estimation Method Jointly Using Audio/Video Information