User Age Profile Assessment Using SMS Network Neighbors' Age Profiles

Customer profile data used in information systems such as recommender systems and collaborative customer relationship management systems should be reliable. However, it is hard to maintain high-quality customer profile data because profile information is usually self-reported by users, who do not always want to disclose their profiles to the company. This paper presents a study of user profile reliability assessment using homophily in large-scale mobile SMS (short message service) data. Our research provides a simple statistical method for inferring users' true profiles from the profile information of their neighbors in a social network. Our dataset contains data on 117,333 randomly selected users from a large Korean mobile company, including the users' demographic profiles and their text communication histories. Using the text communication data, we construct a social network. Results show that our method efficiently identifies users with a large discrepancy between reported age and actual age. In particular, our method predicts a user's actual age with 94.4% accuracy, compared with 86.5% for the second-best approach, simple relational inference. The results imply that our age profile assessment model can verify whether a user's age profile is reliable and can be applied in practice.
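The idea of checking a reported age against the ages of SMS neighbors can be illustrated with a minimal sketch. This is not the statistical method of the paper, whose details are not given in the abstract. It assumes a median-of-neighbors estimate, and the function names, the `max_gap` threshold, and the `min_neighbors` cutoff are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from statistics import median


def build_neighbors(messages):
    """Build an undirected neighbor map from (sender, receiver) SMS pairs."""
    neighbors = defaultdict(set)
    for sender, receiver in messages:
        if sender != receiver:
            neighbors[sender].add(receiver)
            neighbors[receiver].add(sender)
    return neighbors


def flag_suspect_ages(messages, reported_age, max_gap=10, min_neighbors=3):
    """Return {user: neighbor-based age estimate} for users whose reported
    age differs from the median reported age of their neighbors by more
    than max_gap years. The median and both thresholds are assumptions,
    not values taken from the paper."""
    neighbors = build_neighbors(messages)
    flagged = {}
    for user, age in reported_age.items():
        ages = [reported_age[n] for n in neighbors.get(user, ()) if n in reported_age]
        if len(ages) < min_neighbors:
            continue  # too few neighbors to form an estimate
        estimate = median(ages)
        if abs(age - estimate) > max_gap:
            flagged[user] = estimate
    return flagged


# Toy data (invented for illustration): user "a" reports an age far from
# that of every contact.
messages = [("a", "b"), ("a", "c"), ("d", "a"), ("b", "c"), ("b", "d"), ("c", "d")]
reported_age = {"a": 60, "b": 24, "c": 25, "d": 27}
print(flag_suspect_ages(messages, reported_age))
```

In this toy network only user "a" is flagged, because the homophily assumption (people mostly text others of similar age) makes a 60-year-old surrounded by contacts in their twenties look suspicious. A real assessment would need a calibrated threshold and some handling of legitimate cross-generation contacts such as family members.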