An On-line Analytical Processing (OLAP) Aggregation Function for Rising Aspects in Collaboration and Social Networks

The widespread use of social and collaboration networks provides an opportunity to analyze how relationships among individuals, such as celebrities or co-authors, evolve over time. Discovering such phenomena in complex networks is non-trivial because of their large size. In this situation, the aggregation functions used in OLAP are useful for analyzing summarized data. OLAP has proven its worth on multidimensional and complex networks. However, the aggregations available in current OLAP systems do not produce versatile results for social and collaboration networks. This is because such networks have structural connectivity (links) among nodes, which OLAP cannot take into account during execution. As a result, a useful discovery is missed: the pairs of nodes whose relationship has been emerging recently. Discovering such pairs is important for applications such as targeted marketing, future joint partnerships, and predicting future correspondence, to name a few. In this study, we call such pairs Rising_Pairs and propose an aggregation function of the same name for performing OLAP on network data whose historical information is maintained over a period of time. Using structural information, Rising_Pairs discovers the strongly coupled pairs in network data by emphasizing their recent interactions and attribute similarities. To verify the effectiveness of our proposal, we applied it to several types of real-world networks, including Facebook, Digital Bibliography and Library Project (DBLP), and Global Positioning System (GPS) trajectory datasets, and observed interesting patterns.
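The idea of emphasizing recent interactions and attribute similarities can be illustrated with a minimal sketch. It is not the formula used in this study, which the abstract does not state. It assumes that interactions arrive as (period, node, node) records, that recency is captured with an exponential moving average over periods, that attribute similarity is Jaccard similarity over attribute sets, and that the two are combined linearly. The names `rising_pairs`, `jaccard`, `alpha`, `beta`, and `top_k` are all hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two attribute collections (assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rising_pairs(interactions, attributes, alpha=0.5, beta=0.5, top_k=3):
    """Rank node pairs by recency-weighted interaction strength plus attribute similarity.

    interactions: iterable of (period, u, v); a larger period means more recent.
    attributes:   dict mapping node -> collection of attribute values.
    alpha:        smoothing factor of the exponential moving average (assumption).
    beta:         weight of attribute similarity in the final score (assumption).
    """
    counts = defaultdict(lambda: defaultdict(int))
    periods = set()
    for period, u, v in interactions:
        pair = tuple(sorted((u, v)))  # treat links as undirected
        counts[pair][period] += 1
        periods.add(period)

    ordered = sorted(periods)
    scores = {}
    for pair, per_period in counts.items():
        ema = 0.0
        for p in ordered:
            # Later periods are applied last, so they dominate the average.
            ema = alpha * per_period.get(p, 0) + (1 - alpha) * ema
        u, v = pair
        sim = jaccard(attributes.get(u, ()), attributes.get(v, ()))
        scores[pair] = (1 - beta) * ema + beta * sim

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


if __name__ == "__main__":
    # Toy data: a-b interacted early, a-c interacted recently.
    toy = [(1, "a", "b")] * 3 + [(3, "a", "c")] * 3
    for pair, score in rising_pairs(toy, {}, top_k=2):
        print(pair, score)
```

In this toy run, the pair that interacted in the most recent period ranks above the pair with the same number of interactions in an earlier period. The sketch does not normalize interaction counts against the similarity term, which a real implementation would likely need to address.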
