Delving into VoxCeleb: environment invariant speaker recognition