OLAP and bibliographic databases

This paper addresses the application of online analytical processing (OLAP) technology to bibliographic databases. We show that librarians can use OLAP tools for periodic and ad hoc reporting, quality assurance, and data integrity checking, and that research policy makers can use them to monitor the development of science and to evaluate or compare disciplines, fields, or research groups. We argue that traditional relational database management systems, used mainly for day-to-day data storage and transactional processing, are not appropriate for performing such tasks on a regular basis. For this purpose, we implemented a fully functional OLAP solution on Biomedicina Slovenica, a Slovenian national bibliographic database. We demonstrate the system's usefulness by extracting data for a selection of scientometric issues: changes in the numbers of published papers, citations, and pure citations over time; their dependence on the number of co-operating authors and on the number of organisations with which the authors are affiliated; and time patterns of citations. We also discuss hardware, software, and feasibility considerations and outline the phases of developing bibliographic OLAP applications.
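To make the kind of query described above concrete, the following is a minimal sketch of an OLAP-style roll-up over a tiny fact table of papers. It is not the system built on Biomedicina Slovenica: the field names (`year`, `n_authors`, `citations`), the helper `roll_up`, and all data values are hypothetical and invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical fact table: one row per paper. All values are invented
# for illustration and do not come from Biomedicina Slovenica.
PAPERS = [
    {"year": 1998, "n_authors": 1, "citations": 2},
    {"year": 1998, "n_authors": 3, "citations": 5},
    {"year": 1999, "n_authors": 3, "citations": 4},
    {"year": 1999, "n_authors": 2, "citations": 0},
]


def roll_up(rows, dimensions, measure):
    """Aggregate `measure` over the chosen dimensions.

    Returns a dict mapping each dimension-value tuple to
    (number of papers, total of the measure) for that cell.
    """
    cells = defaultdict(lambda: [0, 0])
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        cells[key][0] += 1             # paper count in this cell
        cells[key][1] += row[measure]  # summed measure in this cell
    return {key: tuple(value) for key, value in cells.items()}


# Papers and citations over time (one dimension).
by_year = roll_up(PAPERS, ["year"], "citations")

# Dependence on the number of co-operating authors.
by_authors = roll_up(PAPERS, ["n_authors"], "citations")

# Drill-down: the same measure sliced by both dimensions at once.
by_year_authors = roll_up(PAPERS, ["year", "n_authors"], "citations")
```

A dedicated OLAP engine would typically precompute and index such aggregates across many dimensions, which is one reason ad hoc queries of this kind can be impractical on a transactional database; the sketch only shows the shape of the result.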
