CuWide: Towards Efficient Flow-based Training for Sparse Wide Models on GPUs