The pitfalls of next-token prediction