In this work we study the behavior of users of online comparison shopping sites, using session traces collected over one year from an Indian mobile phone comparison website: http://smartprix.com. There are two aspects to our study: data analysis and behavior prediction. The first aspect, data analysis, is geared towards providing insights into user behavior that could enable vendors to offer the right kinds of products and prices, and that could help the comparison shopping engine customize search based on user preferences. We identify correlations between the search queries that users issue before arriving at the site and their subsequent behavior on it. We also study the distribution of users by geographic location, time of day, and day of the week, as well as the number of sessions that include a click to buy (convert), repeat users, and the phones and brands visited and compared. We analyze the impact of price changes on the popularity of a product, and how special events such as the launch of a new model affect the popularity of a brand. Our analysis corroborates intuitions such as that an increase in price leads to a decrease in popularity, and vice versa. Further, we characterize the time lag with which such phenomena affect popularity. We characterize user behavior on the website as a sequence of transitions between multiple states, where each state is defined by the kind of page being visited (e.g. home, visit, compare). We use KL divergence to show that a time-homogeneous Markov chain is an appropriate model for session traces when the number of clicks varies from 5 to 30. Finally, we build a model using Markov logic that uses the history of a user's activity in a session to predict whether the user will click to convert in that session. Our methodology of combining data analysis with machine learning is, in our opinion, a new approach to the empirical study of such data sets.
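The abstract does not say how the KL divergence was applied, so the following is only a minimal sketch of one plausible reading: estimate transition probabilities separately from the early and late clicks of each session, then compare the two estimates row by row. A small divergence for every state is consistent with time homogeneity. The state names, the `split` parameter, the smoothing constant `alpha`, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math

# Hypothetical page types; the paper's actual state set may differ.
STATES = ["home", "visit", "compare"]


def transition_matrix(pairs, states=STATES, alpha=1.0):
    """Estimate row-stochastic transition probabilities from (src, dst) pairs.

    Additive smoothing (alpha) keeps every probability strictly positive,
    so the KL divergence below is always finite.
    """
    counts = {s: {t: alpha for t in states} for s in states}
    for src, dst in pairs:
        counts[src][dst] += 1
    return {
        s: {t: counts[s][t] / sum(counts[s].values()) for t in states}
        for s in states
    }


def kl_divergence(p, q):
    """KL(p || q) for two distributions given as dicts over the same keys."""
    return sum(pi * math.log(pi / q[k]) for k, pi in p.items() if pi > 0)


def homogeneity_gap(sessions, split):
    """Per-state KL divergence between early and late transition estimates.

    Transitions starting at click index < split count as "early",
    the remaining ones as "late".
    """
    early = [(s[i], s[i + 1]) for s in sessions
             for i in range(min(split, len(s) - 1))]
    late = [(s[i], s[i + 1]) for s in sessions
            for i in range(split, len(s) - 1)]
    p_early = transition_matrix(early)
    p_late = transition_matrix(late)
    return {s: kl_divergence(p_early[s], p_late[s]) for s in STATES}


if __name__ == "__main__":
    # Toy sessions, invented purely for illustration.
    sessions = [
        ["home", "visit", "home", "visit", "home"],
        ["home", "visit", "compare", "visit", "compare", "home"],
    ]
    for state, gap in homogeneity_gap(sessions, split=2).items():
        print(f"{state}: {gap:.4f}")
```

In a real analysis the split point would be varied across session lengths, and the resulting divergences compared against a baseline such as the divergence between two random halves of the same data.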