Usability of XML Query Languages

The eXtensible Markup Language (XML) is a markup language that enables the re-use of information. Dedicated query languages for XML have been developed to facilitate this. These languages differ considerably in history, design goals, and syntax, yet in practice they are used for similar purposes. The motivation for a particular design decision is often unclear, and it is equally unclear whether a particular design choice influences the user's effectiveness with a query language. This study investigates the usability of the most important XML query language candidates. Usability is defined here as the performance and satisfaction of users, measured in relation to a set of tasks. Not much is known about the usability of formal languages. Chapter 2 gives an overview of earlier research on the usability of database query languages and discusses the methodology and models used. The most important assumption in this study is that differences in syntax between query languages lead to differences in their usability. The background of this assumption is also discussed in chapter 2. In the XML community there have been fierce debates about the assumed influence of syntax on the usability of a language. A common argument in these discussions is, for instance, that XSLT is difficult to use because of its verbosity. XQuery is argued to be more suitable for users with a database background because its compact syntax is partly based on popular database query languages. Performance differences between users of XML query languages must be related to how users formulate queries with these languages. If a user needs more time or has more difficulty formulating a query in one language than in another, the languages must differ in how queries are solved. Chapter 3 presents a model of how users formulate queries with a query language.
This model is used to interpret the behavior of subjects participating in a think-aloud experiment. This experiment provides a first impression of the usability of three XML query languages (XSLT, XQuery, and SQL/XML) and of the possible reasons for usability differences between them. The results indicate that the model is a suitable representation of the query-solving process. This experiment is discussed in chapter 4. The differences in usability between the query languages are investigated further in a second experiment. With a large group of subjects and a set of five query tasks, it shows that users perform better with XQuery than with XSLT and are more satisfied with it. This experiment is discussed in chapter 5. A third experiment was conducted to explain the causes of the difference in usability between XSLT and XQuery. The following possible causes of performance differences were identified in advance: the complexity of query tasks, the structure of XML documents, the verbosity of language expressions, the different methods of embedding XPath in XSLT and XQuery, user experience, and the number of expressions necessary for a particular query task. These factors are quantified for a large set of query tasks, and their influence on performance with a query language is determined. This experiment is discussed in chapter 6. This study shows that the usability of XQuery is higher than that of XSLT. XQuery is easier to use because its expressions are more compact and because it embeds XPath in a simpler way than XSLT does. Furthermore, query complexity is a good predictor of user performance on a query task. Finally, user experience has a large influence on performance with a query language. An influence of different XML structures on performance with a query language was not shown in this study. The query tasks and the subjects in the experiments are representative of the start of the learning curve for these languages.
The conclusions of this study are presented in chapter 7.
