Gender and Age Identification Through Romanized Urdu Dataset

Urdu is among the most widely used languages for communication in South Asia. Despite its large following, it lacks computational support, which is why it is often written in Romanized Urdu script. Much research has been done on identifying an author's gender and age from written text, but little of it has used a Romanized Urdu dataset. In this research, we propose a model for this purpose that identifies key parameters (defined attributes) of an author. These parameters were measured for both genders and for three age categories. A weight assignment technique was used to plot graphs that help compute the desired results.
