We present ETCHA Sketches (an Experimental Test Corpus of Hand Annotated Sketches), with the goal of facilitating the development of a standard test corpus for sketch understanding research. To date, we have collected sketches from four domains: circuit diagrams, family trees, floor plans, and geometric configurations. We have also labeled many of the strokes in these data sets with geometric primitive labels (e.g., line, arc, polyline, polygon, and ellipse). We found accurate labeling of data to be a more complex task than might be anticipated. The complexity arises because labeled data can be used for different purposes with different requirements, and because some strokes are ambiguous and can legitimately be placed in multiple categories. We discuss several labeling methods and some properties of the sketches that became apparent while collecting and labeling the data. The data sets are available online at http://rationale.csail.mit.edu/ETCHASketches.