Large Scale Mask Optimization Via Convolutional Fourier Neural Operator and Litho-Guided Self Training