A latent semantic analysis framework for large-span language modeling