Across-Stack Profiling and Characterization of Machine Learning Models on GPUs