SMILES Pair Encoding: A Data-Driven Substructure Tokenization Algorithm for Deep Learning