Web Information Systems Engineering – WISE 2018

Cross-domain algorithms, which aim to transfer knowledge available in source domains to a target domain, are becoming increasingly attractive as a way to improve recommendation quality and to alleviate the cold-start and data sparsity problems in recommendation systems. However, existing work on cross-domain algorithms mostly considers ratings, tags, and text information such as reviews, and does not make effective use of the sentiments implied in the reviews, especially negative sentiment information, which is easily weakened during transfer. In this paper, we propose a sentiment-aware review feature mapping framework for cross-domain recommendation, called SARFM. The proposed SARFM framework applies the deep learning algorithm SDAE (Stacked Denoising Autoencoders) to model the Sentiment-Aware Review Feature (SARF) of users, and transfers SARF via a multi-layer perceptron that captures the nonlinear mapping function across domains. We evaluate and compare our framework on a set of Amazon datasets. Extensive experiments on each cross-domain recommendation scenario are conducted to demonstrate the high accuracy of the proposed SARFM framework.
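The mapping step described above can be illustrated with a minimal sketch. This is not the SARFM implementation: it assumes that user feature vectors for both domains are already available (the SDAE step is omitted and replaced by synthetic vectors), and it trains a one-hidden-layer perceptron on users present in both domains. All names (`mlp_forward`, `train_mapping`, `mse`), the layer sizes, and the learning rate are hypothetical choices made for illustration.

```python
import math
import random


def mlp_forward(x, W1, b1, W2, b2):
    """One tanh hidden layer followed by a linear output layer."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = [sum(w * hi for w, hi in zip(row, h)) + b
         for row, b in zip(W2, b2)]
    return h, y


def train_mapping(pairs, hidden=8, lr=0.05, epochs=200, seed=0):
    """Fit source -> target features with plain SGD on squared error.

    `pairs` holds (source_vector, target_vector) tuples for overlapping users.
    """
    rng = random.Random(seed)
    d_in, d_out = len(pairs[0][0]), len(pairs[0][1])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(d_out)]
    b2 = [0.0] * d_out
    for _ in range(epochs):
        for x, t in pairs:
            h, y = mlp_forward(x, W1, b1, W2, b2)
            err = [yi - ti for yi, ti in zip(y, t)]
            # Backpropagate through the output layer and the tanh nonlinearity
            # before any weights are updated.
            dh = [sum(err[k] * W2[k][j] for k in range(d_out)) * (1.0 - h[j] ** 2)
                  for j in range(hidden)]
            for k in range(d_out):
                for j in range(hidden):
                    W2[k][j] -= lr * err[k] * h[j]
                b2[k] -= lr * err[k]
            for j in range(hidden):
                for i in range(d_in):
                    W1[j][i] -= lr * dh[j] * x[i]
                b1[j] -= lr * dh[j]
    return W1, b1, W2, b2


def mse(pairs, params):
    """Mean squared error of the mapping over all user pairs."""
    total = 0.0
    for x, t in pairs:
        _, y = mlp_forward(x, *params)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, t))
    return total / len(pairs)


# Synthetic stand-ins for user features in the source and target domains;
# the target depends nonlinearly on the source (an assumption for the demo).
data_rng = random.Random(1)
pairs = []
for _ in range(20):
    x = [data_rng.uniform(-1.0, 1.0) for _ in range(3)]
    pairs.append((x, [math.tanh(x[0] - x[1]), x[1] * x[2]]))

untrained = train_mapping(pairs, epochs=0)  # initial weights only
trained = train_mapping(pairs)              # same seed, so same starting point
print(mse(pairs, untrained), mse(pairs, trained))
```

Once trained, such a mapping could be applied to the source-domain features of a user with no target-domain history to estimate that user's target-domain features, which is the cold-start setting the abstract refers to.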
