NLPRL@INLI-2018: Hybrid gated LSTM-CNN model for Indian Native Language Identification

Native language identification (NLI) focuses on determining the native language of the author based on the writing style in English. Indian native language identification is a challenging task based on users comments and posts on social media. To solve this problem, we present a hybrid gated LSTM-CNN model to solve this problem. The final vector of a sentence is generated at hybrid gate by joining the two distinct vector of a sentence. Gate seeks the optimum mixture of the LSTM and CNN level outputs. The input word for LSTM and CNN are projected into high-dimensional space by embedding technique. We obtained 88.50% accuracy during training on the provided social media dataset, and 17.10% is reported in the final testing done by Indian native language identification (INLI) workshop organizers.