Audio Language Modeling using Perceptually-Guided Discrete Representations