Talking gestures are a fundamental part of body language and are therefore also important for social robots. Generative approaches to gesture generation are expected to produce more appropriate behavior than rule-based approaches. The evaluation of generated gestures is usually carried out by subjective visual assessment, which can be culturally dependent and influenced by external factors. In this work we extend previous research on quantitative evaluation methods by comparing two generative methods and showing that the quantitative results correlate with subjective evaluations by a sizable group of people. The final goal is to offer a quantitative tool that helps researchers automate the evaluation of their gesture generation systems, as a complement to subjective methods.