Multilingual Information Retrieval and Smart News Feed Based on Big Data

Drawing on linguistics, information science, and library and information science, we study real-time news published on authoritative websites in the world's major countries. By analyzing large volumes of news from different sources and in different languages, we propose a basic theoretical model of news and an accompanying algorithm that supports intelligent collection, fast access, deduplication, correction, and integration of news with its background. We also identify connections between news items and readers' interests. Together, these results enable a real-time, on-demand news feed and provide a theoretical basis for, and verification of, scientific problems in the real-time processing of massive information. Finally, a simulation experiment shows that the multilingual news matching technology distinguishes similar news in different languages better than the traditional method.
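The abstract does not say how deduplication is carried out. Purely as an illustration, the following minimal sketch shows one common way to drop near-duplicate articles within a single language: compare character trigram sets by Jaccard similarity. The function names (`shingles`, `jaccard`, `deduplicate`), the trigram size, and the 0.8 threshold are all assumptions made for this example and are not taken from the paper.

```python
def shingles(text: str, n: int = 3) -> set[str]:
    """Return the character n-grams of a lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    if len(normalized) < n:
        # Texts shorter than n yield a single shingle (or none if empty).
        return {normalized} if normalized else set()
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(articles: list[str], threshold: float = 0.8) -> list[str]:
    """Keep an article only if it is below the (assumed) similarity threshold
    against every article already kept. Quadratic in the number of articles."""
    kept: list[tuple[str, set[str]]] = []
    for text in articles:
        signature = shingles(text)
        if all(jaccard(signature, other) < threshold for _, other in kept):
            kept.append((text, signature))
    return [text for text, _ in kept]


if __name__ == "__main__":
    sample = [
        "Stocks rise as markets rally.",
        "Stocks  rise as markets rally.",   # same text, extra whitespace
        "Heavy rain floods the city centre.",
    ]
    for article in deduplicate(sample):
        print(article)
```

This sketch compares surface strings, so it cannot match the same story reported in two different languages. Cross-language matching of the kind the abstract describes would need an additional step, such as translation into a common language, which is not shown here.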