Revisiting Transformer-based Models for Long Document Classification