One of the key challenges in Large Language Model (LLM) training is reducing the memory required for training without sacrificing compute/communication efficiency or model accuracy. DeepSpeed [2] is a popular deep learning library that facilitates memory-efficient training of large language models. DeepSpeed includes ZeRO (Zero Redundancy Optimizer), a memory-efficient approach to distributed training [5]. ZeRO has multiple stages of memory optimization, and Habana's SynapseAI® software currently supports ZeRO-1 and ZeRO-2. In this article, we explain what ZeRO is and why it is useful for training LLMs, and give a brief technical overview of the ZeRO-1 and ZeRO-2 stages of memory optimization. More details on DeepSpeed support in Habana SynapseAI software can be found in the Habana DeepSpeed User Guide. Now, let us dive into why we need memory-efficient training for LLMs and how ZeRO can help achieve it.
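For orientation, the ZeRO stage is selected in the DeepSpeed configuration rather than in model code. The sketch below is a minimal, illustrative example (not taken from this article or the Habana documentation) showing a config that enables ZeRO-1 and passes it to `deepspeed.initialize`; the model, batch size, optimizer settings, and bf16 flag are placeholder assumptions you would replace with your own.

```python
import torch
import deepspeed

# Minimal, illustrative DeepSpeed config. The "stage" field under
# "zero_optimization" selects the ZeRO stage:
#   1 = partition optimizer states, 2 = partition optimizer states + gradients.
ds_config = {
    "train_batch_size": 16,                      # placeholder value
    "bf16": {"enabled": True},                   # placeholder; adjust to your hardware
    "optimizer": {
        "type": "Adam",
        "params": {"lr": 1e-4},                  # placeholder learning rate
    },
    "zero_optimization": {
        "stage": 1,                              # set to 2 for ZeRO-2
    },
}

# Tiny placeholder model; any torch.nn.Module works here.
model = torch.nn.Linear(1024, 1024)

# deepspeed.initialize wraps the model and builds the optimizer from ds_config,
# applying the requested ZeRO stage across the data-parallel workers.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```

In practice, a script like this is launched with the `deepspeed` launcher so that the configured ZeRO stage partitions state across all data-parallel processes.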