Split and rule algorithm for documents clustering in big data of research articles on Google scholar

The exponential rise in digital information, together with users' needs, means that the big data of digital documents held in online repositories must be ranked. Ranking plays an important role in these repositories because it helps users find exactly the documents they want. Various ranking techniques have been proposed on the basis of measures such as the number of citations of a journal article, the impact factor of the publication platform, the quality of the article, its year of publication, and bookmarks. However, current ranking algorithms often return meaningless results because of their limitations, which leaves room for further development of ranking mechanisms. This paper proposes an efficient split and rule algorithm that uses both static and dynamic ranking of documents in Google Scholar. The proposed algorithm uses paper citations, user input, and a clustering mechanism for document ranking. The optimized solution obtained from the proposed split and rule algorithm returns, for each user query, a filtered list of search results organized into clusters.
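The abstract does not give the details of the split and rule algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: a static signal (citation count) and a dynamic signal (user input) are blended into one score, and documents are then split into clusters by that score. The names `Paper`, `combined_score`, `split_into_clusters`, the 0-to-1 `user_rating` scale, the equal weighting, and the thresholds are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    citations: int      # static signal
    user_rating: float  # dynamic signal; assumed to be user feedback in [0, 1]


def combined_score(paper, max_citations, weight=0.5):
    """Blend normalized citations with user feedback (weighting is an assumption)."""
    static = paper.citations / max_citations if max_citations else 0.0
    return weight * static + (1 - weight) * paper.user_rating


def split_into_clusters(papers, thresholds=(0.66, 0.33), weight=0.5):
    """Split papers into three score bands, each ordered from best to worst.

    The three bands and their thresholds are illustrative, not the paper's.
    """
    max_citations = max((p.citations for p in papers), default=0)
    clusters = {"high": [], "medium": [], "low": []}
    ranked = sorted(
        papers,
        key=lambda p: combined_score(p, max_citations, weight),
        reverse=True,
    )
    for paper in ranked:
        score = combined_score(paper, max_citations, weight)
        if score >= thresholds[0]:
            label = "high"
        elif score >= thresholds[1]:
            label = "medium"
        else:
            label = "low"
        clusters[label].append(paper.title)
    return clusters


if __name__ == "__main__":
    sample = [
        Paper("A", citations=100, user_rating=0.9),
        Paper("B", citations=50, user_rating=0.2),
        Paper("C", citations=0, user_rating=0.1),
    ]
    print(split_into_clusters(sample))
```

In this sketch the clustering step is a simple thresholding of the blended score; the paper's own clustering mechanism may differ substantially.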
