Writing and written language play an increasingly important part in many people’s lives today. Written language has become more or less a prerequisite for daily communication. This development leads to an increased need for tools that can help people deal with text. One technology with the potential to aid people with writing and written language is language technology. This thesis focuses on tools based on language technology that can aid writers and learners of Swedish. One such tool, developed and evaluated in the thesis, is the grammar checker Granska. The thesis work on Granska includes the design of its rule language and the development of grammar checking rules for common error types in Swedish. In addition, rules for phrase analysis and clause boundary detection have been developed; together they constitute a partial, shallow parser called GTA. Language tools for writing can mainly be evaluated in two ways: with a focus on the text or with a focus on the writer. In this thesis, both types of evaluation have been carried out, with native writers as well as second language writers. The first textual evaluation of Granska showed that genre has a strong influence on the result. In a second evaluation, Granska was compared with a commercial grammar checker on second language writers’ texts. Granska found more errors, but with lower precision. A third evaluation focused on the general text analyzers that Granska relies on, in this case a statistical word class analyzer and the parser GTA. These programs were evaluated on texts into which spelling errors had been introduced, in order to test their robustness. The results showed that as long as the word class analyzer is robust, the parser GTA is also robust.
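The robustness evaluation described above can be illustrated with a minimal sketch. It is not the procedure used in the thesis: the choice of single-character edits (insertion, deletion, substitution, transposition), the function names `introduce_error`, `corrupt` and `agreement`, and the alphabet are all assumptions made for illustration. The idea is to corrupt a controlled share of the tokens, run the same analyzer on the clean and the corrupted text, and measure how often the two outputs agree.

```python
import random

# Assumed alphabet for illustration: lowercase Latin letters plus Swedish å, ä, ö.
ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"


def introduce_error(word, rng):
    """Apply one random single-character edit to `word` (hypothetical error model)."""
    if len(word) < 2:
        return word  # too short to edit meaningfully; left unchanged
    op = rng.choice(["insert", "delete", "substitute", "transpose"])
    i = rng.randrange(len(word) - 1)
    if op == "insert":
        return word[:i] + rng.choice(ALPHABET) + word[i:]
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "substitute":
        # Pick a letter that differs from the one being replaced.
        candidates = [c for c in ALPHABET if c != word[i].lower()]
        return word[:i] + rng.choice(candidates) + word[i + 1:]
    # Transpose two adjacent characters (a no-op if they happen to be identical).
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def corrupt(tokens, error_rate, seed=0):
    """Return a copy of `tokens` where roughly `error_rate` of them are misspelled."""
    rng = random.Random(seed)  # seeded so that an experiment can be repeated
    return [introduce_error(t, rng) if rng.random() < error_rate else t
            for t in tokens]


def agreement(reference, observed):
    """Fraction of positions where two equally long label sequences agree."""
    if len(reference) != len(observed):
        raise ValueError("sequences must have the same length")
    if not reference:
        return 1.0
    return sum(r == o for r, o in zip(reference, observed)) / len(reference)
```

In use, one would tag or parse both `tokens` and `corrupt(tokens, error_rate)` with the analyzer under test and pass the two label sequences to `agreement`. Because each edit changes a single token in place, the token count is preserved and the labels can be compared position by position.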
In a first formative user study with Granska and five participants, the results suggested that multiple, competing error diagnoses and correction proposals are not a problem for users as long as at least one accurate correction proposal exists. Moreover, false alarms from the spelling checker seemed to pose only a limited problem for the users, but false alarms on more complicated error types might disturb their revision process. In order to improve the design of language tools for second language writers, a field study was carried out at a Swedish university. Sixteen students with different linguistic and cultural backgrounds participated. The objective was to study the use of Granska in students’ free writing. The results indicated that although most alarms from Granska are accurate, lack of feedback and misleading feedback are problems for second language writers. The results also suggested that providing students with feedback on different aspects of their interlanguage, not only errors, and facilitating language exploration and reflection are important processes to support in second language learning environments. These insights were used as design principles in the development of an interactive language environment called Grim. This program includes a basic word processor in which the user can get feedback on linguistic code features from different language tools such as Granska and GTA. In addition, other tools are available for the user to explore language use in authentic texts and to achieve lexical comprehension through bilingual dictionaries.