New E-Commerce User Interest Patterns

The number of online purchases is increasing constantly. Companies have recognized the related opportunities and are increasingly using online channels. To acquire potential customers, companies often try to understand them better through web analytics. One of the most useful data sources is the web log file, which provides abundant information about user behavior on a website, such as the navigation path or access time. Mining this so-called clickstream data as comprehensively as possible has become an important task for predicting the behavior of online customers, optimizing webpages, and giving personalized recommendations. As the number of customers constantly rises, the generated log files also grow, both in size and in number. Thus, for certain companies, the currently used technologies are no longer sufficient. This work proposes a comprehensive workflow that uses a clustering algorithm in a Hadoop ecosystem to investigate user interest patterns. The complete workflow is demonstrated on an application scenario from one of the largest business-to-business (B2B) electronic commerce websites in Germany. Furthermore, an experimental evaluation is applied to verify the applicability and efficiency of the algorithm used, along with the associated framework.
