Paper Information - DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale - 字舞流文

DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale

Reza Yazdani Aminabadi | Samyam Rajbhandari | Jeff Rasley | Yuxiong He | Minjia Zhang | A. A. Awan | Conglong Li | Z. Yao