A k-nearest neighbor based method for improving large scale biomedical document indexing