BibTeX-based dataset generation for training citation parsers