Stylometry detection using deep learning

Author profiling is an active research area in data mining. Rather than concentrating only on syntactic and stylometric features, this paper describes additional features that profile authors more accurately: readability metrics, vocabulary richness, and emotional status. Age and gender are the author attributes predicted. The stylometric profile is learned with a deep learning algorithm. The approach attains an accuracy of 97.7% for gender prediction and 90.1% for age prediction.
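To make two of the feature families named above concrete, the following is a minimal sketch of how a vocabulary-richness score and a readability score might be computed from raw text. The abstract does not say which specific metrics the paper uses, so the choices here are assumptions: type-token ratio stands in for vocabulary richness and the Automated Readability Index (ARI) stands in for readability. The function names and the simple regex tokenizer are hypothetical and are not taken from the paper.

```python
import re


def tokenize(text):
    """Split text into lowercased word tokens (assumed tokenizer, not the paper's)."""
    return re.findall(r"[A-Za-z0-9']+", text.lower())


def type_token_ratio(text):
    """Vocabulary richness as distinct words divided by total words."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def automated_readability_index(text):
    """ARI = 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43."""
    words = tokenize(text)
    # Sentences are approximated by splitting on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    # Count only letters and digits, ignoring apostrophes inside tokens.
    chars = sum(1 for w in words for c in w if c.isalnum())
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


def stylometric_features(text):
    """Bundle both scores into one feature dictionary (hypothetical helper)."""
    return {
        "type_token_ratio": type_token_ratio(text),
        "ari": automated_readability_index(text),
    }
```

Features like these would typically be combined with the other feature groups into a single vector per author before being passed to a classifier. How the paper itself combines them is not stated in the abstract.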
