Updating the partial singular value decomposition in latent semantic indexing

Latent semantic indexing (LSI) is a method of information retrieval (IR) that relies heavily on the partial singular value decomposition (PSVD) of the term-document matrix representation of a data set. Calculating the PSVD of large term-document matrices is computationally expensive; hence, when terms or documents are merely added to an existing data set, it is extremely beneficial to update the previously calculated PSVD to reflect the changes rather than recompute it. It is shown how updating can be used in LSI to significantly reduce the computational cost of finding the PSVD without significantly impacting performance. Moreover, it is shown how the computational cost can be reduced further, again without impacting performance, through a combination of updating and folding-in.
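
For readers unfamiliar with folding-in, the following is a minimal sketch of the standard LSI formulation, in which a new document vector d is projected into the existing k-dimensional space as d^T U_k Σ_k^{-1} while U_k and Σ_k are left unchanged. It is an illustration only and is not taken from this work: the function names `partial_svd` and `fold_in_documents`, the toy matrix, and the choice k = 2 are all assumptions, and the updating method itself is not shown.

```python
import numpy as np

def partial_svd(A, k):
    """Rank-k PSVD obtained from a dense SVD (adequate only for a toy matrix)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in_documents(U_k, s_k, D):
    """Fold new document columns D (terms x p) into the existing space.

    Returns a (p x k) array of document coordinates, d^T U_k Sigma_k^{-1}.
    U_k and s_k are not modified, which is why folding-in is cheap.
    """
    return (D.T @ U_k) / s_k

# Hypothetical 5-term x 4-document matrix with made-up counts.
A = np.array([
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
], dtype=float)

k = 2  # assumed number of retained singular triplets
U_k, s_k, Vt_k = partial_svd(A, k)

# One new document that uses terms 0 and 2.
d_new = np.array([[1.0], [0.0], [1.0], [0.0], [0.0]])
coords = fold_in_documents(U_k, s_k, d_new)

# Append the folded-in document to the document-side factor.
Vt_folded = np.hstack([Vt_k, coords.T])
```

Because folding-in never revises U_k or Σ_k, the appended coordinates are only approximations to what a recomputed PSVD would give, and the document-side factor is in general no longer orthonormal. That trade-off is the reason for pairing folding-in with updating.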