Automatic detection of disturbing robot voice- and ping pong-effects in GSM transmitted speech

This contribution presents a method for automatically detecting the disturbing Robot Voice and Ping Pong effects that occur in GSM-transmitted speech. Both effects are caused by the frame substitution technique recommended by the GSM standard: in these cases the transmitted speech may be modulated by a disturbing 50 Hz component. These modulations can be detected easily in the frequency domain. By a framewise comparison of the modulation amplitude of an undisturbed, clean speech signal with that of a test signal, the occurrence of Robot Voice and Ping Pong can be located very precisely. A comparison of human perception with the output of the proposed algorithm shows a high degree of correspondence.
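
The framewise comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the rectified-signal envelope, the 200 ms analysis window, the 100 ms hop, the threshold of 0.05, and the function names `modulation_amplitude` and `detect_disturbed_windows` are all assumptions chosen for illustration. The sketch measures the amplitude of the 50 Hz component of the envelope in each window and flags windows where the test signal exceeds the clean reference.

```python
import numpy as np

def modulation_amplitude(signal, fs, mod_freq=50.0, win_s=0.2, hop_s=0.1):
    """Amplitude of the mod_freq component of the envelope, per analysis window.

    Assumption: the envelope is approximated by the rectified signal, and the
    window/hop lengths are illustrative values, not taken from the paper.
    """
    envelope = np.abs(signal)
    win = int(round(win_s * fs))
    hop = int(round(hop_s * fs))
    t = np.arange(win) / fs
    probe = np.exp(-2j * np.pi * mod_freq * t)  # single-bin DFT at mod_freq
    amps = []
    for start in range(0, len(envelope) - win + 1, hop):
        seg = envelope[start:start + win]
        seg = seg - seg.mean()  # remove the DC part of the envelope
        amps.append(2.0 * np.abs(np.dot(seg, probe)) / win)
    return np.array(amps)

def detect_disturbed_windows(clean, degraded, fs, threshold=0.05):
    """Flag windows whose 50 Hz modulation exceeds that of the clean reference.

    Assumption: both signals are time-aligned; the threshold is hypothetical.
    """
    n = min(len(clean), len(degraded))
    excess = (modulation_amplitude(degraded[:n], fs)
              - modulation_amplitude(clean[:n], fs))
    return excess > threshold

# Synthetic example: a tone stands in for speech, and a 50 Hz amplitude
# modulation is imposed on the second half of the degraded signal.
fs = 8000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t)
modulation = np.where(t >= 0.5, 1.0 + 0.5 * np.cos(2 * np.pi * 50 * t), 1.0)
degraded = clean * modulation
flags = detect_disturbed_windows(clean, degraded, fs)
```

In this synthetic example only the windows that overlap the modulated second half are flagged. Real speech would need time alignment of the two signals and a threshold tuned against listening results.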