Analysing research collaboration through co-authorship networks in a big data environment: an efficient parallel approach

Bibliometrics is the quantitative study of scientific output and enables the characterisation of scientific collaboration networks. However, as science develops and scientific output increases, large collaboration networks form, which makes bibliometric measures difficult to compute. In this context, this work presents an efficient parallel computation, based on multithreaded programming, of three bibliometric measures for co-authorship network analysis: transitivity, average distance, and diameter. Our experiments found that, as the co-authorship network grows, the time taken to calculate transitivity with the sequential approach grows 4.08 times faster than with the proposed parallel approach. Similarly, the time taken to calculate average distance and diameter with the sequential approach grows 5.27 times faster than with the proposed parallel approach. In addition, we report the speedup and efficiency achieved by the developed algorithms.
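
As an illustration of the three measures named above, the following is a minimal sketch and not the implementation evaluated in this work. It assumes an undirected, unweighted co-authorship graph stored as a dictionary of neighbour sets, averages distances over reachable pairs only, and uses the usual definitions of speedup (sequential time divided by parallel time) and efficiency (speedup divided by the number of workers). All function names are hypothetical. The work is split per node, which is the natural unit of parallelism for these measures; note that in CPython, threads will not speed up this pure-Python code because of the global interpreter lock, so the sketch shows only the decomposition.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def bfs_distances(adj, source):
    """Return hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def node_triples(adj, u):
    """Return (closed triples, connected triples) centred on node u."""
    nbrs = list(adj[u])
    closed = sum(1 for i, a in enumerate(nbrs) for b in nbrs[i + 1:] if b in adj[a])
    k = len(nbrs)
    return closed, k * (k - 1) // 2


def network_metrics(adj, workers=4):
    """Compute (transitivity, average distance, diameter), one task per node."""
    nodes = list(adj)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        triples = list(pool.map(lambda u: node_triples(adj, u), nodes))
        dists = list(pool.map(lambda u: bfs_distances(adj, u), nodes))
    closed = sum(c for c, _ in triples)
    total = sum(t for _, t in triples)
    transitivity = closed / total if total else 0.0
    # Assumption: only reachable pairs of distinct nodes contribute.
    lengths = [d for dist in dists for d in dist.values() if d > 0]
    average_distance = sum(lengths) / len(lengths) if lengths else 0.0
    diameter = max(lengths, default=0)
    return transitivity, average_distance, diameter


def speedup(t_sequential, t_parallel):
    """Speedup: sequential running time divided by parallel running time."""
    return t_sequential / t_parallel


def efficiency(t_sequential, t_parallel, workers):
    """Efficiency: speedup divided by the number of workers."""
    return speedup(t_sequential, t_parallel) / workers


# Toy co-authorship graph: authors A, B, C wrote together; C also wrote with D.
toy_graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
transitivity, average_distance, diameter = network_metrics(toy_graph)
```

On the toy graph there is one triangle and five connected triples, so the transitivity is 3/5; the longest shortest path (A to D or B to D) has two hops.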