Automated analysis of oral corpora is still in its infancy: interest is growing, but tools remain scarce. This article presents processing tools that we have developed to analyze corpora of spontaneous speech in Acadian French. This variety of French, spoken in the Maritime Provinces of Canada, combines three layers of features: traits of spoken language, regional traits, and traits of language mixing. The challenge was to adapt an existing processing tool, INTEX/NooJ, to solve the problems posed by our corpora. We present three solutions developed with NooJ: (1) a configuration of dictionary entries that lets users relate the orthographic and lexical representations of a word, whether it comes from standard French, traditional Acadian, English, or the vernacular; (2) grammars that process the morphological characteristics of nominal and verbal inflection; and (3) a disambiguation graph for the form "a", which is both the 3SG pronoun in Acadian French and the 3SG.PRES form of the auxiliary "avoir".