Using topes to validate and reformat data in end-user programming tools

End-user programming tools offer no data types except "string" for many categories of data, such as person names and street addresses. Consequently, these tools cannot automatically validate or reformat these data. To address this problem, we have developed a user-extensible model for string-like data. Each "tope" in this model is a user-defined abstraction that guides the interpretation of strings as a particular kind of data. Specifically, each tope implementation contains software functions for recognizing and reformatting instances of that tope's kind of data. This makes it possible at runtime to distinguish between invalid data, valid data, and questionable data that could be valid or invalid. Once identified, questionable and/or invalid data can be double-checked and possibly corrected, thereby increasing the overall reliability of the data. Valid data can be automatically reformatted to any of the formats appropriate for that kind of data. To show the general applicability of topes, we describe new features that topes have enabled us to provide in four tools.
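As an illustration only, the following minimal Python sketch shows how one such abstraction might pair recognition with reformatting for phone numbers. It is not the authors' implementation: the names `Format`, `PHONE_FORMATS`, `recognize`, `classify`, and `reformat`, the three formats, and the 0.5 score assigned to bare digit strings are all assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Format:
    """One way of writing a phone number (hypothetical structure)."""
    name: str
    pattern: re.Pattern  # captures area code, exchange, line number
    score: float         # 1.0 = valid; strictly between 0 and 1 = questionable
    template: str        # used when reformatting into this format


# Assumed formats for this sketch; a real tope would be user-defined.
PHONE_FORMATS = [
    Format("dashed", re.compile(r"(\d{3})-(\d{3})-(\d{4})"), 1.0, "{0}-{1}-{2}"),
    Format("parenthesized", re.compile(r"\((\d{3})\) (\d{3})-(\d{4})"), 1.0, "({0}) {1}-{2}"),
    # Ten bare digits might be a phone number or something else entirely.
    Format("bare", re.compile(r"(\d{3})(\d{3})(\d{4})"), 0.5, "{0}{1}{2}"),
]


def recognize(text: str) -> Tuple[float, Optional[Tuple[str, ...]]]:
    """Return the best score over all formats and the captured parts."""
    best_score, best_parts = 0.0, None
    for fmt in PHONE_FORMATS:
        match = fmt.pattern.fullmatch(text.strip())
        if match and fmt.score > best_score:
            best_score, best_parts = fmt.score, match.groups()
    return best_score, best_parts


def classify(text: str) -> str:
    """Map the recognition score onto the three categories."""
    score, _ = recognize(text)
    if score == 1.0:
        return "valid"
    if score == 0.0:
        return "invalid"
    return "questionable"


def reformat(text: str, target_name: str) -> str:
    """Rewrite a recognized string in the named target format."""
    _, parts = recognize(text)
    if parts is None:
        raise ValueError(f"not recognized as a phone number: {text!r}")
    for fmt in PHONE_FORMATS:
        if fmt.name == target_name:
            return fmt.template.format(*parts)
    raise ValueError(f"unknown format: {target_name!r}")


if __name__ == "__main__":
    for sample in ["412-555-0123", "4125550123", "hello"]:
        print(sample, "->", classify(sample))
    print(reformat("(412) 555-0123", "dashed"))
```

In this sketch a tool could accept "valid" inputs silently, ask the user to double-check "questionable" ones, and reject "invalid" ones. For simplicity it reformats any recognized string, including questionable ones; a stricter tool might require confirmation first.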
