Yuxiong He
Published
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
Ammar Ahmad Awan,
Samyam Rajbhandari,
Jeff Rasley,
2022.