High-Performance and Flexible Parallel Algorithms for Semisort and Related Problems