Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification