The NASA Astrophysics Data System: Architecture

The powerful discovery capabilities available in the ADS bibliographic services are possible thanks to the design of a flexible search and retrieval system based on a relational database model. Bibliographic records are stored as a corpus of structured documents containing fielded data and metadata, while discipline-specific knowledge is segregated in a set of files independent of the bibliographic data itself. The database management software uses this ancillary information to compile field-specific index files, which the ADS search engine in turn uses to resolve user queries into lists of relevant documents.
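To make the separation between bibliographic records and discipline-specific knowledge concrete, the following is a minimal sketch in Python. It is not the ADS implementation: the record layout, the synonym groups, and the names `RECORDS`, `SYNONYMS`, `build_index`, and `search` are all assumptions chosen for illustration. It shows fielded records being compiled into per-field inverted indexes, with a separate synonym table folded in at indexing time.

```python
from collections import defaultdict

# Hypothetical bibliographic records: structured documents with fielded data.
RECORDS = {
    "rec1": {"title": "Galaxy clusters at high redshift", "author": "Smith, J."},
    "rec2": {"title": "Clusters of galaxies in X-rays", "author": "Jones, A."},
    "rec3": {"title": "Stellar winds", "author": "Smith, J."},
}

# Hypothetical discipline-specific knowledge, kept apart from the records:
# groups of words that should be treated as equivalent within a field.
SYNONYMS = {"title": [{"galaxy", "galaxies"}, {"cluster", "clusters"}]}


def build_index(records, synonyms):
    """Compile one inverted index per field, folding synonyms together."""
    # Map every (field, word) in a synonym group to one representative word.
    canon = {}
    for field, groups in synonyms.items():
        for group in groups:
            representative = min(group)
            for word in group:
                canon[(field, word)] = representative

    # index[field][word] is the set of record identifiers containing the word.
    index = defaultdict(lambda: defaultdict(set))
    for rec_id, fields in records.items():
        for field, text in fields.items():
            cleaned = text.lower().replace(",", " ").replace(".", " ")
            for word in cleaned.split():
                key = canon.get((field, word), word)
                index[field][key].add(rec_id)
    return index, canon


def search(index, canon, field, word):
    """Resolve a single-word fielded query into a sorted list of record ids."""
    word = word.lower()
    key = canon.get((field, word), word)
    return sorted(index[field].get(key, set()))


INDEX, CANON = build_index(RECORDS, SYNONYMS)
```

In this sketch, a title query for "galaxy" also matches records containing "galaxies", because the synonym table is applied both when the index is built and when the query is resolved; changing the table requires rebuilding the index but not touching the records.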
The creation and management of links to both internal and external resources associated with each bibliographic record in the database is made possible by representing them as a set of document properties and their attributes. The resolution of links available from different locations has been generalized so that it can be controlled through a site- and user-specific preference database. To improve global access to the ADS data holdings, a number of mirror sites have been created by cloning the database contents and software on a variety of hardware and software platforms.
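As an illustration of preference-driven link resolution, here is a small Python sketch. Everything in it is hypothetical: the link types, the `example.org` hosts, the table names `LINKS`, `SITE_PREFS`, and `USER_PREFS`, and the precedence order (user, then site, then default) are assumptions made for the example, not a description of the actual ADS software.

```python
# Hypothetical document properties: each record lists the link types it has,
# with attributes describing where the resource lives.
LINKS = {
    "rec1": {
        "fulltext": {"path": "/full/rec1"},
        "data": {"path": "/data/rec1"},
    },
}

# Hypothetical preference tables mapping a link type to a preferred server.
SITE_PREFS = {
    "site-a": {"fulltext": "http://mirror-a.example.org"},
    "site-b": {"fulltext": "http://mirror-b.example.org"},
}
USER_PREFS = {"user42": {"fulltext": "http://library.example.edu"}}
DEFAULT_BASE = "http://example.org"


def resolve_link(rec_id, link_type, site, user=None):
    """Return the URL for a record's link, or None if the link is absent.

    Assumed precedence: user preference, then site preference, then default.
    """
    attrs = LINKS.get(rec_id, {}).get(link_type)
    if attrs is None:
        return None
    base = (
        USER_PREFS.get(user, {}).get(link_type)
        or SITE_PREFS.get(site, {}).get(link_type)
        or DEFAULT_BASE
    )
    return base + attrs["path"]
```

With these tables, the same record and link type resolve to different servers depending on which mirror serves the request and whether the user has stored a preference, while the record itself stores only the property and its attributes.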
The procedures used to create and manage the database and its mirrors have been written as a set of scripts that can be run in either an interactive or unsupervised fashion. The modular approach we followed in software development has allowed a high degree of freedom in prototyping and customization, making our system rich in features and yet simple enough to be easily modified on a day-to-day basis.
We conclude by discussing the impact that new datasets, technologies, and collaborations are expected to have on the ADS and its possible role in an integrated environment of networked resources in astronomy.
The ADS can be accessed at:
http://adswww.harvard.edu
