Spell Checker for Punjabi Language Using Deep Neural Network

A spell checker is an important part of language-specific text processing applications. In this paper, we propose a novel hybrid approach to spell checking for the Punjabi language. Available spell checkers rely on traditional methods: dictionary lookup for error detection and minimum edit distance for error correction. The proposed spell checker aims to improve both performance and accuracy. We store the Punjabi word dictionary in a trie data structure and use a tree-based algorithm together with n-gram analysis to detect misspelled words. To correct misspelled words, the best possible suggestions are listed using a long short-term memory (LSTM) recurrent neural network together with a rule-based approach and minimum edit distance. Error detection using the trie-based dictionary improves performance, and error correction using the LSTM improves the accuracy of the proposed spell checker. In addition to these error detection and error correction techniques, the proposed spell checker uses handcrafted rules, language syntax rules, and tokenization rules.
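
The detection and correction pipeline described above can be illustrated with a minimal sketch. It covers only two of the named components: a trie-backed dictionary lookup for detection and minimum edit distance for candidate generation. The n-gram analysis, the LSTM ranking, and the handcrafted rules are omitted. All names here (`Trie`, `edit_distance`, `suggest`, `max_distance`) and the three-word dictionary are illustrative assumptions, not the paper's implementation, and distances are computed over Unicode code points, which may differ from how the authors treat Gurmukhi vowel signs and nasal marks.

```python
class TrieNode:
    """One node of the dictionary trie (hypothetical helper)."""
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}    # maps a code point to the child node
        self.is_word = False  # True if a dictionary word ends here


class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def contains(self, word):
        """Detection step: a word missing from the trie is flagged."""
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word

    def words(self):
        """Yield every stored word (iterative depth-first traversal)."""
        stack = [(self.root, "")]
        while stack:
            node, prefix = stack.pop()
            if node.is_word:
                yield prefix
            for ch, child in node.children.items():
                stack.append((child, prefix + ch))


def edit_distance(a, b):
    """Levenshtein distance using two rows of the dynamic-programming table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def suggest(trie, word, max_distance=1):
    """Return dictionary words within max_distance, closest first.

    Assumption: a brute-force scan over the whole dictionary. A real
    system would prune the trie traversal and rerank the candidates.
    """
    if trie.contains(word):
        return []  # correctly spelled, nothing to suggest
    scored = sorted((edit_distance(word, c), c) for c in trie.words())
    return [c for d, c in scored if d <= max_distance]


# Toy dictionary (assumed for illustration only).
trie = Trie(["ਪੰਜਾਬ", "ਪੰਜਾਬੀ", "ਕਿਤਾਬ"])
print(trie.contains("ਪੰਜਾਬ"))  # dictionary word
print(suggest(trie, "ਪਜਾਬ"))   # misspelling with the tippi mark dropped
```

In this sketch the trie makes the detection lookup proportional to the word length rather than the dictionary size, which is the performance benefit the abstract attributes to the trie-based dictionary. The edit-distance candidates would then be passed to a ranking model such as the LSTM.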
