On Computing Canonical Subsets of Graph-Based Behavioral Representations

The collection of behavior protocols is a common practice in human factors research, but the analysis of these large data sets has always been a tedious and time-consuming process. We are interested in automatically finding canonical behaviors: a small subset of behavioral protocols that is most representative of the full data set, providing a view of the data with as few protocols as possible. Behavior protocols often have a natural graph-based representation, yet there has been little work applying graph theory to their study. In this paper we extend our recent algorithm by taking into account the graph topology induced by the paths taken through the space of possible behaviors. We applied this technique to find canonical web-browsing behaviors for computer users. By comparing the identified canonical sets to a ground truth determined by expert human coders, we found that this graph-based metric outperforms our previous metric based on edit distance.
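To make the idea of a representative subset under an edit-distance metric concrete, the following is a minimal sketch, not the algorithm of this paper. It assumes each protocol is a sequence of action labels, uses the standard Levenshtein distance between sequences, and picks representatives with a simple greedy heuristic. The function names `edit_distance` and `greedy_representatives`, the example page labels, and the greedy selection rule are all illustrative assumptions.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two action sequences (unit costs)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # deleting the first i items of a
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def greedy_representatives(protocols: List[Sequence[str]], k: int) -> List[int]:
    """Greedily pick up to k protocol indices so that the summed distance
    from every protocol to its nearest chosen one is small.
    Illustrative heuristic only, not the method described in the paper."""
    n = len(protocols)
    dist = [[edit_distance(p, q) for q in protocols] for p in protocols]
    chosen: List[int] = []
    nearest = [float("inf")] * n  # distance to the closest chosen protocol
    for _ in range(min(k, n)):
        def total_if_added(c: int) -> float:
            return sum(min(nearest[i], dist[i][c]) for i in range(n))
        candidate = min((c for c in range(n) if c not in chosen),
                        key=total_if_added)
        chosen.append(candidate)
        nearest = [min(nearest[i], dist[i][candidate]) for i in range(n)]
    return chosen


# Hypothetical web-browsing protocols, written as sequences of page labels.
protocols = [
    ["home", "search", "result"],
    ["home", "search", "result", "back"],
    ["home", "cart", "checkout"],
    ["home", "cart", "checkout", "pay"],
]
representatives = greedy_representatives(protocols, 2)
```

In this toy data set the two selected protocols come from the two visibly different browsing patterns (searching versus purchasing). A graph-based metric of the kind the abstract describes would replace `edit_distance` with a distance that accounts for the topology of the paths taken; that substitution is not shown here.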
