Paper information - Statistical Computing and Databases: Distributed Computing Near the Data

This paper addresses the following question: “how do we fit statistical models efficiently with very large data sets that reside in databases?” Nowadays it is quite common to encounter a situation where a very large data set is stored in a database, yet the statistical analysis is performed with a separate piece of software such as R. Usually it does not make much sense, and in some cases it may not even be possible, to move the data from the database manager into the statistical software in order to complete a statistical

Brian D. Ripley | Fei Chen
