This research aims to develop a knowledge representation that elucidates and visualizes the differences and similarities between concepts expressed in different languages and cultures. We examine the Wikipedia graph structure around a single concept, “Nazism”, in two languages, English and German, to understand how online knowledge crowdsourcing platforms are affected by different language groups and their cultures. The solution has four parts: learning weighted graph representations that capture graph structure via random surfing, measuring cross-lingual document similarity via Jaccard similarity, multi-view representation learning with a Deep Canonically Correlated Autoencoder (DCCAE), and sentiment classification with an SVM. Our method shows superior performance on a word similarity task. To the best of our knowledge, this is the first application of DCCAE in this context.
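As an illustration of the cross-lingual document similarity step, the Jaccard similarity of two sets A and B is |A ∩ B| / |A ∪ B|. The sketch below is a minimal example, not the authors' implementation: the abstract does not say which features are compared, so it assumes each article is represented by the set of articles it links to, already mapped to shared language-independent identifiers. The function name `jaccard_similarity` and the sample identifiers are hypothetical.

```python
def jaccard_similarity(a, b):
    """Return |A ∩ B| / |A ∪ B| for two iterables, treating them as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Convention chosen for this sketch: two empty sets score 0.0.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical link sets for the English and German versions of one article,
# assumed to be mapped to shared, language-independent identifiers.
links_en = {"Q1", "Q2", "Q3", "Q4"}
links_de = {"Q2", "Q3", "Q5"}

score = jaccard_similarity(links_en, links_de)
print(score)
```

A score near 1 means the two language versions link to largely the same set of articles, while a score near 0 means they share few links.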