ZenFlow: DeepSpeed's Stall-Free Offloading Training Engine, a 5x Speedup over ZeRO-Offload

ArXiv ID: 2505.12242
Authors: Tingfeng Lan, Yusen Wu, Bin Ma, Zhaoyuan Su, Rui Yang, Tekin Bicer, Masahiro Tanaka, Olatunji Ruwase, Dong Li, Yue Cheng
Affiliations: University of Virginia, UC Merced, Argonne National Laboratory, Microsoft DeepSpeed Team
Published: 2025-05-18

The 14x Slowdown Dilemma of GPU Offloading

When GPU memory cannot hold the entire model, offloading part of the model state to CPU memory is a common solution. But ZeRO-Offload comes at a steep cost. For Llama 2-7B on four A100 GPUs: without offloading, each step...


© 2026 Generative AI Discovery All Rights Reserved.