Leveraging confidence models for identifying challenging data subgroups in speech models