System and method for integrating gesture and voice

A system and a method for integrating recognition of gesture and voice are provided. A gesture command section is detected on the basis of an EPD (End Point Detection) value that marks the start point of a voice command section, which enhances the recognition rate even for gestures with unclear separability. The system includes a voice feature extracting unit (210), a gesture feature extracting unit (220), a synchronizing module (230) and an integration recognizing unit (240). The voice feature extracting unit detects a start point and an end point of a command in voice input through a microphone and extracts voice feature information. The gesture feature extracting unit uses the start point and the end point detected by the voice feature extracting unit to extract a command section from the gesture in an image captured by a camera, and extracts gesture feature information from that section. The synchronizing module detects the start point of the gesture in the captured image by using the start point detected by the voice feature extracting unit, and calculates optimal image frames by applying a preset optimal frame number to the detected start point. The integration recognizing unit combines the extracted voice feature information and gesture feature information by using preset learning parameters and outputs integrated recognition data.
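
The role of the synchronizing module can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes that the voice start point is available as a time in seconds, that the camera runs at a constant frame rate, and that the "preset optimal frame number" is a simple count of consecutive frames. The names VoiceSection and select_gesture_frames are hypothetical and do not appear in the abstract.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VoiceSection:
    """Hypothetical container for the EPD result of the voice feature extracting unit."""
    start_s: float  # start point of the voice command, in seconds
    end_s: float    # end point of the voice command, in seconds


def select_gesture_frames(voice: VoiceSection, fps: float,
                          total_frames: int, optimal_frame_count: int) -> List[int]:
    """Map the voice start point to a frame index and take a preset number of frames.

    Assumption: audio and video share a common clock and the video has a
    constant frame rate, so time * fps gives the frame index.
    """
    # Convert the voice start time into the nearest video frame index.
    start_frame = round(voice.start_s * fps)
    # Keep the start frame inside the captured sequence.
    start_frame = max(0, min(start_frame, total_frames - 1))
    # Apply the preset frame count, truncating at the end of the sequence.
    stop_frame = min(start_frame + optimal_frame_count, total_frames)
    return list(range(start_frame, stop_frame))


if __name__ == "__main__":
    section = VoiceSection(start_s=2.0, end_s=3.5)
    frames = select_gesture_frames(section, fps=30.0, total_frames=300,
                                   optimal_frame_count=10)
    print(frames)
```

In this sketch the gesture start point is taken directly from the voice start point. A fuller system could refine it by searching nearby frames for the onset of motion, which the abstract leaves unspecified.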