Training Large Language Models - From Pretraining to Efficient Finetuning
Training Large Language Models: Efficient Methods from Pretraining to Deployment
Ever wondered how ChatGPT was trained? Or how companies can adapt massive language models on consumer hardware? This workshop dives into the cutting-edge techniques that make modern AI training efficient and accessible.
Prerequisites
- Understanding of transformer architecture
- Basic knowledge of deep learning training concepts
- Familiarity with PyTorch is helpful but not required
What You’ll Learn
The Complete Training Pipeline:
- Pretraining objectives and strategies (a minimal loss sketch follows this list)
- Efficient finetuning techniques
- Making models deployable through quantization
- Reinforcement Learning from Human Feedback (RLHF)
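To give a flavor of the pretraining objective covered in this part, here is a minimal, illustrative sketch of the standard next-token prediction loss in PyTorch. The tensor shapes (`logits`: batch × sequence × vocab, `input_ids`: batch × sequence) are assumptions for the example, not workshop code.

```python
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
    """Causal LM pretraining objective: position t predicts token t+1."""
    shift_logits = logits[:, :-1, :]      # drop the last position (nothing to predict after it)
    shift_labels = input_ids[:, 1:]       # drop the first token (nothing predicts it)
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )
```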
Parameter-Efficient Methods:
- Low-Rank Adaptation (LoRA) (see the sketch after this list)
- QLoRA and quantization-aware training
- Prompt tuning and soft prompts
- The trade-offs between different approaches
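As a rough picture of what LoRA looks like in code, here is a minimal sketch (not the workshop's reference implementation): a frozen pretrained linear layer augmented with a trainable low-rank update, with `r` and `alpha` standing in for the usual rank and scaling hyperparameters.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                                   # only the adapters are trained
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: starts as a no-op
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```

Because only `lora_A` and `lora_B` receive gradients, the number of trainable parameters drops from `in_features * out_features` to `r * (in_features + out_features)` per wrapped layer.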
Training at Scale:
- Data Parallelism (DP)
- Tensor Parallelism (TP)
- Fully Sharded Data Parallelism (FSDP) (see the sketch after this list)
- ZeRO optimization stages
- Memory management techniques
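For orientation, here is a minimal sketch of wrapping a model with PyTorch FSDP. It assumes the script is launched with `torchrun` (one process per GPU) and uses a small stand-in encoder in place of a real LLM.

```python
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Launched with `torchrun --nproc_per_node=<num_gpus> train.py`: one process per GPU.
dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = nn.TransformerEncoder(                         # stand-in for a real LLM
    nn.TransformerEncoderLayer(d_model=1024, nhead=16, batch_first=True),
    num_layers=4,
).cuda()
model = FSDP(model)                                    # shards parameters, gradients, and optimizer state across ranks
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
```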
Practical Implementations:
- Setting up efficient training pipelines (a small illustrative step is sketched after this list)
- Best practices for different scales of models
- Common pitfalls and how to avoid them
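As one concrete example of the kind of pipeline detail covered here, the following is an illustrative (not prescriptive) training step using mixed precision and gradient accumulation. The tiny linear model and random data are placeholders so the loop runs as-is; in practice you would swap in your LLM and data loader.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny stand-in model and data; replace with a real model and data loader.
model = nn.Linear(128, 512).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
train_loader = [(torch.randn(4, 128), torch.randint(0, 512, (4,))) for _ in range(32)]

accum_steps = 8                                   # effective batch = micro-batch size * accum_steps
optimizer.zero_grad(set_to_none=True)
for step, (x, y) in enumerate(train_loader):
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):   # mixed-precision forward pass
        loss = F.cross_entropy(model(x.cuda()), y.cuda())
    (loss / accum_steps).backward()               # accumulate gradients across micro-batches
    if (step + 1) % accum_steps == 0:             # step the optimizer only every accum_steps
        optimizer.step()
        optimizer.zero_grad(set_to_none=True)
```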
By Workshop’s End
You’ll gain the ability to:
- Understand the full lifecycle of LLM training
- Choose appropriate training strategies for your compute budget
- Implement efficient finetuning techniques
- Scale training across multiple GPUs effectively
Ready to master modern LLM training techniques? Workshop Link