Document classification with supervised latent feature selection

Classifying text documents into categories generally involves a high-dimensional structured representation of the documents. To favor the classifier's generality over its accuracy, some dimensionality reduction technique has to be applied. Here we present a classification algorithm that utilizes hidden structures of uncorrelated topics extracted from the training documents and their known categories, where the categories are not necessarily independent. The classifier can incorporate various methods of hidden feature selection. Three latent feature selection procedures are proposed and tested.