Electronic databases are increasingly popular tools in typological research. Despite the advantages of such tools, there are problems connected both with their construction and with their standardization. For instance, there is generally a considerable gap between the information stored in typological databases and primary data: primary morphosyntactic data are much more difficult to handle computationally than typological generalizations. Moreover, the need for standardization has led typologists to develop highly refined glossing practices and guidelines for collecting data, but there are still too few initiatives to increase standardization in typological databases. The aim of this paper is to suggest a radically new approach to the storage of data for typological analysis. The Med-Typ Database, which is currently being developed at the University of Pavia, has been providing us with concrete experience of the problems that need to be addressed when creating typological databases. This database uses XML annotation and aims to be both a collection of data for future analyses of areal distribution of features within the Mediterranean area and a tool for systematic analysis of the range of variation found in various typological domains.

Introduction: the MED-TYP project

The aim of this paper is twofold, and can be summarized as follows:

a) firstly, we aim to describe the on-going experience of the Med(iterranean)-Typ(ology) Database, which is currently being developed at the University of Pavia. The data included in the Med-Typ Database have been providing us with concrete experience of the problems that need to be addressed when creating typological databases;

b) secondly, we aim to suggest a radically new approach to the storage of data for typological analysis, and to discuss the advantages of such an approach.

The Med-Typ project was launched in 1997 and concluded in 2000.
Its basic assumption was that some structural features of Mediterranean languages have been significantly influenced by the fact that these languages have been in contact for several centuries. The major aim of the project was to outline a typology of Mediterranean languages, and to describe the distribution of various structural traits within this area so as to uncover possible phenomena of areal convergence. To do so, it was necessary to plot the features of Mediterranean languages against the universal tendencies ascertained in the world's languages with respect to the phenomena taken into account: this made it possible to distinguish true areal features of the Mediterranean area, derived from language contact, from the results of universal typological tendencies. The research was based on a language sample including both languages in the Mediterranean area (Catalan, Spanish, French, Provençal, Italian, Sardinian, Friulan, Slovene, Serbo-Croatian, Albanian, Modern Greek, Turkish, Maltese, Modern Hebrew, Modern Standard Arabic, Arabic dialects, Berber) and languages that do not belong, strictly speaking, to the Mediterranean area, but are historically connected to it (Portuguese, Basque, Macedonian, Bulgarian, Romanian). The analysis was based mainly on synchronic data, but diachronic investigation was not excluded.

1 The project (extended title: "Languages in the Mediterranean area: typology and convergence") was financed by the Italian National Research Council (CNR). Researchers from the following universities took part in it: Università di Pavia, Università di Pisa, Università per Stranieri di Perugia, Università per Stranieri di Siena, Università di Trieste, Università della Tuscia (Viterbo). The reader is referred to Cristofaro & Putzu (2000), Ramat & Stolz (2002), Ramat (2003), and Stolz & Sanso (forthcoming).
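The logic of plotting area-internal distributions against universal tendencies can be illustrated with a minimal sketch. It is not the procedure actually used in the project: the two feature labels, their assignment to languages, the worldwide rates, and the 0.25 margin are all hypothetical placeholders, and only the language lists are taken from the sample described above.

```python
from dataclasses import dataclass

# Language sample exactly as listed in the text.
MEDITERRANEAN = [
    "Catalan", "Spanish", "French", "Provençal", "Italian", "Sardinian",
    "Friulan", "Slovene", "Serbo-Croatian", "Albanian", "Modern Greek",
    "Turkish", "Maltese", "Modern Hebrew", "Modern Standard Arabic",
    "Arabic dialects", "Berber",
]
HISTORICALLY_CONNECTED = [
    "Portuguese", "Basque", "Macedonian", "Bulgarian", "Romanian",
]


@dataclass(frozen=True)
class FeatureObservation:
    feature: str               # hypothetical feature label
    languages_with: frozenset  # sample languages showing the feature
    worldwide_rate: float      # hypothetical share of the world's languages


def area_rate(obs, area):
    """Share of the languages in `area` that show the feature."""
    return sum(lang in obs.languages_with for lang in area) / len(area)


def is_areal_candidate(obs, area, margin=0.25):
    """Flag a feature only if it is clearly more frequent in the area
    than the worldwide baseline (the margin is an arbitrary assumption)."""
    return area_rate(obs, area) - obs.worldwide_rate >= margin


# Placeholder data: the language subsets are arbitrary slices of the
# sample, not real observations about these languages.
feature_a = FeatureObservation("feature_A", frozenset(MEDITERRANEAN[:12]), 0.30)
feature_b = FeatureObservation("feature_B", frozenset(MEDITERRANEAN[:15]), 0.85)

for obs in (feature_a, feature_b):
    print(obs.feature,
          round(area_rate(obs, MEDITERRANEAN), 2),
          is_areal_candidate(obs, MEDITERRANEAN))
```

In this toy setting the second feature is more widespread in the area than the first, yet it is not flagged, because it is almost equally common worldwide. A feature that is frequent in the Mediterranean languages only counts as areal when it departs from the universal tendency.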
The selected research topics were:

(i) the expression of possession
(ii) relative clause formation
(iii) subordination strategies
(iv) noun phrases and pronominal clitics
(v) volitional constructions
(vi) spatial deixis
(vii) indefinite and negative quantification
(viii) evaluative morphology
(ix) intensifiers and reflexives
(x) yes-no questions
(xi) converbs

The results of this project can be summarized as follows:

a) if a linguistic area is understood as a group of languages sharing a significant number of features by virtue of contiguity, there is no Mediterranean area as such;

b) much in the spirit of Dahl (2001), however, the areal dimension in the study of Mediterranean languages has revealed a number of unexpected contact phenomena which are significant irrespective of whether they can be described in terms of linguistic areas in the traditional sense.

Thus, "area" has turned out to be a significant notion when examining the distribution of typological features in Mediterranean languages, in comparison to those of neighboring European languages and, more generally, to universal typological tendencies concerning the phenomena taken into account. The creation of an electronic database of linguistic phenomena in the Mediterranean area was not among the aims of the Med-Typ project. A new three-year project on
[1] James Clark et al. XSL Transformations (XSLT) Version 1.0. 1999.
[2] Martin Haspelmath et al. Principles of areal typology. 2001.
[3] Martin Haspelmath et al. Language typology and language universals: an international handbook. 2001.
[4] C. Lehmann. Language documentation: a program. 2001.
[5] Stavros Skopeteas et al. Interlinear morphemic glossing. 2004.
[6] Rob Goedemans et al. A unified system for accessing typological databases. LREC, 2002.
[7] Dunstan Brown et al. A typological database of agreement. LREC, 2002.
[8] C. Lehmann. Data in linguistics. 2004.
[9] Joachim Griese et al. Berlin (Freie Universität). 1981.
[10] Alexis Dimitriadis et al. Integrating different data types in a Typological Database System. 2002.