How to fight production incidents?: an empirical study on a large-scale cloud service