Investigating the Impact of Preprocessing on Document Embedding: An Empirical Comparison