Training Large Language Models - From Pretraining to Efficient Finetuning
Training Large Language Models: Efficient Methods from Pretraining to Deployment
Ever wondered how ChatGPT was trained? Or how companies can adapt massive language models on consumer hardware? This workshop dives into the cutting-edge techniques that make modern AI training efficient and accessible.
Prerequisites
- Understanding of transformer architecture
- Basic knowledge of deep learning training concepts
- Familiarity with PyTorch is helpful but not required
What You’ll Learn
The Complete Training Pipeline:
- Pretraining objectives and strategies (a minimal loss sketch follows this list)
- Efficient finetuning techniques
- Making models deployable through quantization
- Reinforcement Learning from Human Feedback (RLHF)
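To give a flavor of the pretraining objective covered in this part, here is a minimal, illustrative sketch of the standard next-token prediction loss in PyTorch. The tensor shapes (`logits`: batch × sequence × vocab, `input_ids`: batch × sequence) are assumptions for the example, not workshop code.

```python
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
    """Causal LM pretraining objective: position t predicts token t+1."""
    shift_logits = logits[:, :-1, :]      # drop the last position (nothing to predict after it)
    shift_labels = input_ids[:, 1:]       # drop the first token (nothing predicts it)
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )
```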
Parameter-Efficient Methods:
- Low-Rank Adaptation (LoRA) (see the sketch after this list)
- QLoRA and quantization-aware training
- Prompt tuning and soft prompts
- The trade-offs between different approaches
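As a rough picture of what LoRA looks like in code, here is a minimal sketch (not the workshop's reference implementation): a frozen pretrained linear layer augmented with a trainable low-rank update, with `r` and `alpha` standing in for the usual rank and scaling hyperparameters.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                                   # only the adapters are trained
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: starts as a no-op
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```

Because only `lora_A` and `lora_B` receive gradients, the number of trainable parameters drops from `in_features * out_features` to `r * (in_features + out_features)` per wrapped layer.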
Training at Scale:
- Data Parallelism (DP)
- Tensor Parallelism (TP)
- Fully Sharded Data Parallelism (FSDP) (see the sketch after this list)
- ZeRO optimization stages
- Memory management techniques
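For orientation, here is a minimal sketch of wrapping a model with PyTorch FSDP. It assumes the script is launched with `torchrun` (one process per GPU) and uses a small stand-in encoder in place of a real LLM.

```python
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Launched with `torchrun --nproc_per_node=<num_gpus> train.py`: one process per GPU.
dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = nn.TransformerEncoder(                         # stand-in for a real LLM
    nn.TransformerEncoderLayer(d_model=1024, nhead=16, batch_first=True),
    num_layers=4,
).cuda()
model = FSDP(model)                                    # shards parameters, gradients, and optimizer state across ranks
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
```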
Practical Implementations:
- Setting up efficient training pipelines (a small illustrative step is sketched after this list)
- Best practices for different scales of models
- Common pitfalls and how to avoid them
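As one concrete example of the kind of pipeline detail covered here, the following is an illustrative (not prescriptive) training step using mixed precision and gradient accumulation. The tiny linear model and random data are placeholders so the loop runs as-is; in practice you would swap in your LLM and data loader.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny stand-in model and data; replace with a real model and data loader.
model = nn.Linear(128, 512).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
train_loader = [(torch.randn(4, 128), torch.randint(0, 512, (4,))) for _ in range(32)]

accum_steps = 8                                   # effective batch = micro-batch size * accum_steps
optimizer.zero_grad(set_to_none=True)
for step, (x, y) in enumerate(train_loader):
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):   # mixed-precision forward pass
        loss = F.cross_entropy(model(x.cuda()), y.cuda())
    (loss / accum_steps).backward()               # accumulate gradients across micro-batches
    if (step + 1) % accum_steps == 0:             # step the optimizer only every accum_steps
        optimizer.step()
        optimizer.zero_grad(set_to_none=True)
```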
By Workshop’s End
You’ll gain the ability to:
- Understand the full lifecycle of LLM training
- Choose appropriate training strategies for your compute budget
- Implement efficient finetuning techniques
- Scale training across multiple GPUs effectively
Ready to master modern LLM training techniques? Workshop Link