Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts
Hyung Won Chung | Xinyun Chen | Trevor Darrell | K. Keutzer | W. Fedus | Barret Zoph | Denny Zhou | Hongkun Yu | Sheng Shen | Yuexin Wu | Nan Du | Jason Wei | S. Longpre | Albert Webson | Wuyang Chen | Vincent Zhao | Tu Vu | Yan-Quan Zhou | Le Hou | Yunxuan Li