💾 Memory Requirements for Finetuning Large Language Models
Finetuning a 70B-parameter model comes with serious memory demands:

1️⃣ Base weights: 140 GB (BF16)
2️⃣ Gradients: +140 GB
3️⃣ Optimizer states (Adam): +280 GB
4️⃣ Activations: variable, can easily add hundreds of GB
🧮 Conclusion: that is at least 560 GB before activations, so full finetuning requires a multi-GPU cluster, e.g. a full node of eight 80 GB H100s at a minimum.
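The arithmetic behind those numbers can be sketched as a quick back-of-the-envelope estimator (a simplification: it assumes BF16 weights and gradients, 4 bytes per parameter of Adam state as in the breakdown above, and ignores activations entirely):

```python
def full_finetune_memory_gb(n_params_billions: float,
                            bytes_weights: int = 2,   # BF16 weights
                            bytes_grads: int = 2,     # BF16 gradients
                            bytes_optim: int = 4      # Adam states (matches the +280 GB figure)
                            ) -> float:
    """Rough static memory for full finetuning, excluding activations.

    billions of params x bytes per param = gigabytes.
    """
    per_param_bytes = bytes_weights + bytes_grads + bytes_optim
    return n_params_billions * per_param_bytes


print(full_finetune_memory_gb(70))  # 560.0 GB for a 70B model, before activations
```

Dividing 560 GB by the 80 GB of a single H100 shows why a whole node is the floor, not the ceiling, once activation memory is added.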
Solutions:

✅ LoRA: drastically reduces memory by freezing the base weights and training small low-rank adapters, so gradients and optimizer states are needed only for the adapters.
✅ QLoRA: combines LoRA with quantization of the base weights to shrink weight memory, making it possible to finetune on a single H100.
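To see why LoRA helps so much, it is worth counting the trainable parameters. The sketch below uses a hypothetical 70B-class configuration (80 layers, hidden size 8192, rank 16, adapting two square projection matrices per layer); the exact shapes vary by model, but the order of magnitude is the point:

```python
def lora_trainable_params(n_layers: int, d_model: int, rank: int,
                          targets_per_layer: int = 2) -> int:
    """Trainable params when each targeted (d_model x d_model) weight
    gets a pair of low-rank adapters: A (d_model x r) and B (r x d_model)."""
    return n_layers * targets_per_layer * 2 * d_model * rank


# Hypothetical 70B-class config; real architectures differ in detail.
adapter_params = lora_trainable_params(n_layers=80, d_model=8192, rank=16)
print(adapter_params / 1e6)  # ~42 M trainable params vs. 70,000 M in the base model

# Gradients + Adam states now apply only to the adapters:
# ~42e6 params x 8 bytes is well under 1 GB, instead of +420 GB.
# QLoRA then quantizes the frozen base weights, e.g. 4-bit storage:
print(70 * 0.5)  # ~35 GB of weight memory, comfortably inside one 80 GB H100
```

Rank, target modules, and quantization format are all tunable; the numbers above are illustrative, not a recipe.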
Resources and Further Reading
- Original LinkedIn Post
- LoRA Paper: Low-Rank Adaptation of Large Language Models
- QLoRA Paper: Efficient Finetuning of Quantized LLMs
- Connect with me on LinkedIn for more AI engineering discussions
This article accompanies the LinkedIn post about memory requirements for finetuning large language models.