One of the key challenges in Large Language Model (LLM) training is reducing memory requirements without sacrificing compute/communication efficiency or model accuracy. DeepSpeed [2] is a popular deep learning software library that facilitates memory-efficient training of large language models. DeepSpeed includes ZeRO (Zero Redundancy Optimizer), a memory-efficient approach for distributed training [5]. ZeRO has multiple stages of memory-efficient optimizations, and Habana’s SynapseAI® software currently supports ZeRO-1 and ZeRO-2. In this article, we explain what ZeRO is and how it helps in training LLMs, with a brief technical overview of the ZeRO-1 and ZeRO-2 stages of memory optimization. More details on DeepSpeed support in Habana SynapseAI software can be found in the Habana DeepSpeed User Guide. Now, let us dive into why we need memory-efficient training for LLMs and how ZeRO can help achieve this.
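To make this concrete, here is a minimal sketch of how a ZeRO stage is typically selected through the standard DeepSpeed configuration dict passed to deepspeed.initialize. The model, batch size, and learning rate below are placeholders for illustration only, not values from this article, and on Gaudi hardware the DeepSpeed build provided with SynapseAI would be used.

```python
# Minimal sketch (illustrative, not from the article): enabling ZeRO-2
# through the standard DeepSpeed configuration dict.
import torch
import deepspeed

# Placeholder model standing in for a real LLM.
model = torch.nn.Linear(1024, 1024)

ds_config = {
    "train_micro_batch_size_per_gpu": 4,          # placeholder batch size
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {
        # stage 1 partitions optimizer states; stage 2 also partitions gradients
        "stage": 2
    },
}

# deepspeed.initialize wraps the model in an engine that applies the selected
# ZeRO partitioning during training (run under the deepspeed launcher).
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```

Changing the `stage` value switches between the ZeRO-1 and ZeRO-2 optimizations discussed in the rest of this article.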