Online pairing of VoIP conversations

This paper answers the following question: given a multiplicity of evolving one-way conversations, can a machine or an algorithm discern the conversational pairs in an online fashion, without understanding the content of the communications? Our analysis indicates that this is possible, and that it can be achieved solely by exploiting the temporal dynamics inherent in a conversation. We also show that our findings apply to anonymous and encrypted conversations over VoIP networks. We achieve this by exploiting the aperiodic inter-departure time of VoIP packets, which reduces each VoIP stream to a binary time series indicating the voice activity of that stream. We propose effective techniques that progressively pair conversing parties with high accuracy and in a limited amount of time. Our findings are verified empirically on a dataset of 1,000 conversations. We obtain pairing accuracy that reaches 97% after 5 minutes of voice conversation. Using a modeling approach, we also demonstrate analytically that our result can be extended to an unlimited number of conversations.
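The core idea can be illustrated with a minimal sketch. It is not the paper's algorithm: the slot length, the disagreement score, and the greedy matching are all assumptions made for illustration, as are the function names. The sketch assumes that silence suppression is in effect, so that packets leave a sender only while that party is speaking; each stream is then reduced to a binary activity series, and streams whose activity is most complementary (one talks while the other listens) are paired.

```python
from itertools import combinations


def activity_series(departure_times, n_slots, slot=0.1):
    """Reduce packet departure times (seconds) to a binary series.

    Assumption: a slot is marked active (1) if at least one packet
    departed during it, and silent (0) otherwise.
    """
    series = [0] * n_slots
    for t in departure_times:
        i = int(t / slot)
        if 0 <= i < n_slots:
            series[i] = 1
    return series


def complementarity(a, b):
    """Fraction of slots in which exactly one of the two streams is active.

    A simple stand-in score (assumption), not the measure used in the paper.
    """
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pair_streams(series_by_id):
    """Greedily pair streams, most complementary candidates first."""
    scored = sorted(
        (
            (complementarity(series_by_id[p], series_by_id[q]), p, q)
            for p, q in combinations(sorted(series_by_id), 2)
        ),
        reverse=True,
    )
    paired, pairs = set(), []
    for _, p, q in scored:
        if p not in paired and q not in paired:
            pairs.append((p, q))
            paired.update((p, q))
    return pairs


# Hypothetical toy data: "a"/"b" and "c"/"d" are the true conversations.
streams = {
    "a": [1, 1, 0, 0, 1, 0, 1, 1],
    "b": [0, 0, 1, 1, 0, 1, 0, 0],
    "c": [1, 0, 1, 0, 0, 0, 1, 1],
    "d": [0, 1, 0, 1, 1, 1, 0, 0],
}
print(pair_streams(streams))
```

In an online setting, the scores would be updated as each new slot arrives rather than recomputed from scratch, and a pair would be committed only once its score is sufficiently separated from the alternatives. That progressive aspect is omitted here for brevity.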
