For a number of language processing tasks, such as information retrieval and information extraction, pertinent information can be extracted from text without doing a full parse of the individual sentences. The most common restriction of the parser is to adopt a non-recursive model of the language treated, which allows an implementation of the parser using efficient finite state tools at the cost of missing some coverage. These light parsers allow the successive introduction of symbols into the input string wherever specified regular expressions of words and/or part-of-speech tags match. Recent advances in finite state expression compilation make writing mark-up transducers simpler, leading to quicker implementations of layered finite state parsers. The resulting parsers are easier to create and maintain. In this article, we describe a light parsing method using recently created finite state operators. Two applications of this parser are described: grouping adjacent syntactically related units, and extracting non-adjacent n-ary grammatical relations. A system for evaluating the parser over a large corpus is described.
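To make the idea of layered mark-up concrete, the following is a minimal sketch in Python, not the finite state operators used in the article. It assumes input in a hypothetical `word/TAG` format with an illustrative tag set (`DET`, `ADJ`, `NOUN`, `ADV`, `VERB`), and it substitutes Python's backtracking `re` module for compiled finite state transducers. The function names `tok`, `mark`, `light_parse`, and `subject_verb_pairs` are invented for this illustration. Each layer inserts bracket symbols wherever its regular expression over tags matches, and a final pattern reads a relation off the marked string.

```python
import re


def tok(tag):
    """Regex for one 'word/TAG' token (hypothetical input format)."""
    return rf"\S+/{tag}\b"


# Layer 1: noun groups = optional determiner, any adjectives, one or more nouns.
NP_PATTERN = re.compile(
    rf"(?:{tok('DET')}\s+)?(?:{tok('ADJ')}\s+)*{tok('NOUN')}(?:\s+{tok('NOUN')})*"
)
# Layer 2: verb groups = any adverbs followed by a verb.
VG_PATTERN = re.compile(rf"(?:{tok('ADV')}\s+)*{tok('VERB')}")
# Relation layer: head (last) noun of a noun group immediately followed by a verb group.
SUBJ_PATTERN = re.compile(r"\[NP [^\]]*?(\S+)/NOUN\] \[VG [^\]]*?(\S+)/VERB\]")


def mark(pattern, label, text):
    """Insert '[LABEL' and ']' symbols around every match of pattern."""
    return pattern.sub(lambda m: f"[{label} {m.group(0)}]", text)


def light_parse(tagged):
    """Apply the mark-up layers in sequence; each layer sees the previous output."""
    layered = mark(NP_PATTERN, "NP", tagged)
    layered = mark(VG_PATTERN, "VG", layered)
    return layered


def subject_verb_pairs(marked):
    """Extract (head noun, verb) pairs from the marked-up string."""
    return SUBJ_PATTERN.findall(marked)


if __name__ == "__main__":
    sentence = "the/DET old/ADJ parser/NOUN quickly/ADV marks/VERB noun/NOUN groups/NOUN"
    marked = light_parse(sentence)
    print(marked)
    # [NP the/DET old/ADJ parser/NOUN] [VG quickly/ADV marks/VERB] [NP noun/NOUN groups/NOUN]
    print(subject_verb_pairs(marked))
    # [('parser', 'marks')]
```

The patterns are non-recursive by construction: a noun group can never contain another noun group, which is the restriction that trades some coverage for the ability to use plain regular expressions. In the example, the words related by the last pattern ("parser" and "marks") are not adjacent in the input; the relation becomes visible only after the grouping layers have been applied.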