Clause Boundary Identification using Classifier and Clause Markers in Urdu Language

This paper presents a method for identifying clause boundaries in Urdu. We use a Conditional Random Field (CRF) as the classifier, together with clause markers. The clause markers are used to detect the type of a subordinate clause, which may occur alongside or within the main clause. Where testing on different sentences reveals misclassifications, additional rules are identified to improve recall and precision. The results show that this approach efficiently determines the type of a subordinate clause and its boundary.
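
As a rough illustration of how clause markers could feed a CRF-style sequence classifier, the sketch below builds one feature dictionary per token, including a flag for whether the token is a clause marker. It is a minimal sketch under stated assumptions, not the paper's feature set: the marker list (given in Latin transliteration), the feature names, and the functions `token_features` and `sentence_features` are all hypothetical and chosen only for illustration.

```python
# Hypothetical clause-marker list in Latin transliteration (an assumption,
# not taken from the paper): ke "that", jo "who/which", agar "if",
# jab "when", kyunke "because".
CLAUSE_MARKERS = {"ke", "jo", "agar", "jab", "kyunke"}


def token_features(tokens, i):
    """Build an illustrative feature dict for the token at position i."""
    word = tokens[i]
    feats = {
        "word": word,
        "is_marker": word in CLAUSE_MARKERS,
        "is_first": i == 0,
        "is_last": i == len(tokens) - 1,
    }
    # Context features: a marker just before a token suggests that a
    # subordinate clause may start here.
    if i > 0:
        feats["prev_word"] = tokens[i - 1]
        feats["prev_is_marker"] = tokens[i - 1] in CLAUSE_MARKERS
    if i < len(tokens) - 1:
        feats["next_word"] = tokens[i + 1]
    return feats


def sentence_features(tokens):
    """Return one feature dict per token, in sentence order."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Example: "us ne kaha ke wo aaye ga" ("He said that he will come").
tokens = ["us", "ne", "kaha", "ke", "wo", "aaye", "ga"]
features = sentence_features(tokens)
```

A list of such dictionaries per sentence is the kind of input that common CRF toolkits accept. The rule-based correction step mentioned in the abstract would then run on the predicted labels, and is not shown here.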
