Method and device for processing voice streams

The invention discloses a voice stream processing method comprising: acquiring at least two voice streams; adjusting the time sequence of the voice data packets in each voice stream and synchronously outputting the compressed voice frames corresponding to the voice data packets of the adjusted voice streams; restoring the synchronously output compressed voice frames into original voice frames; and merging the original voice frames of the voice streams that correspond to the same time. The invention also discloses a voice stream processing device. With the invention, at least two voice streams can be merged, so that a dialogue between two parties, or among several parties, can be monitored, thereby improving service quality.
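
The four steps of the method (acquire, align in time, restore, merge) can be illustrated with a minimal Python sketch. It is not the claimed implementation. Everything named here is an assumption made for illustration: the `(timestamp, payload)` packet layout, the `FRAME_SAMPLES` frame size, the use of silence for a missing frame, the summing-with-clipping mixer, and the stand-in `decode_frame`, which treats each payload as raw 16-bit little-endian PCM in place of a real voice codec.

```python
import struct
from typing import List, Tuple

# Hypothetical packet layout: (timestamp, compressed voice frame).
Packet = Tuple[int, bytes]

FRAME_SAMPLES = 4  # unrealistically small frame size, for illustration only


def decode_frame(payload: bytes) -> List[int]:
    """Stand-in for a real codec: read the payload as 16-bit little-endian PCM."""
    count = len(payload) // 2
    return list(struct.unpack("<%dh" % count, payload[: count * 2]))


def align_streams(streams: List[List[Packet]]) -> List[Tuple[int, List[bytes]]]:
    """Put packets in timestamp order and emit one frame per stream per timestamp.

    A stream with no packet for a given timestamp contributes a silent frame
    (an assumption; the source does not say how gaps are handled).
    """
    by_time = [{ts: payload for ts, payload in stream} for stream in streams]
    timeline = sorted(set().union(*(frames.keys() for frames in by_time)))
    silence = bytes(2 * FRAME_SAMPLES)
    return [(ts, [frames.get(ts, silence) for frames in by_time]) for ts in timeline]


def mix_frames(frames: List[List[int]]) -> List[int]:
    """Merge decoded frames by summing samples and clipping to the 16-bit range."""
    length = max(len(frame) for frame in frames)
    mixed = []
    for i in range(length):
        total = sum(frame[i] for frame in frames if i < len(frame))
        mixed.append(max(-32768, min(32767, total)))
    return mixed


def merge_voice_streams(streams: List[List[Packet]]) -> List[Tuple[int, List[int]]]:
    """Align at least two streams, decode each frame, and merge frames per timestamp."""
    merged = []
    for ts, payloads in align_streams(streams):
        decoded = [decode_frame(payload) for payload in payloads]
        merged.append((ts, mix_frames(decoded)))
    return merged
```

Calling `merge_voice_streams([caller_packets, agent_packets])` returns one merged frame per timestamp, in time order, even when the input packets arrive out of order. A production system would replace `decode_frame` with the actual codec and would typically use a jitter buffer rather than sorting a complete list.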