Training Large Language Models - From Pretraining to Efficient Finetuning

Training Large Language Models: Efficient Methods from Pretraining to Deployment

Ever wondered how ChatGPT was trained? Or how companies can adapt massive language models on consumer hardware? This workshop dives into the cutting-edge techniques that make modern AI training efficient and accessible.

Prerequisites

  • Understanding of transformer architecture
  • Basic knowledge of deep learning training concepts
  • Familiarity with PyTorch is helpful but not required

What You’ll Learn

  • The Complete Training Pipeline:
      • Pretraining objectives and strategies (the causal LM objective is sketched just below)
      • Efficient finetuning techniques
      • Making models deployable through quantization
      • Reinforcement Learning from Human Feedback (RLHF)
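
To make the first item concrete, here is a minimal sketch of the causal language modeling objective used in pretraining: the model learns to predict each token from the tokens before it. The function name and tensor shapes are illustrative assumptions, not taken from the workshop materials.

```python
# Minimal sketch of the causal (next-token) pretraining objective.
# Assumes a model that maps token ids to per-position vocabulary logits.
import torch
import torch.nn.functional as F

def causal_lm_loss(logits: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
    # logits: (batch, seq_len, vocab); input_ids: (batch, seq_len)
    # Shift so that position t predicts token t+1.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = input_ids[:, 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )
```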

  • Parameter-Efficient Methods:
      • Low-Rank Adaptation (LoRA); see the sketch after this list
      • QLoRA and quantization-aware training
      • Prompt tuning and soft prompts
      • The trade-offs between different approaches
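
To give a feel for LoRA, here is a minimal sketch of a LoRA-wrapped linear layer in PyTorch, following the low-rank update from Hu et al. (2021). The class name and the rank/alpha defaults are illustrative assumptions.

```python
# Minimal LoRA sketch: freeze the pretrained weight, train a low-rank update.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze pretrained weight (and bias)
        # A starts small and random, B starts at zero, so the update is a no-op initially.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base projection plus the trainable low-rank update.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```

Because lora_B is initialized to zero, the wrapped layer initially reproduces the frozen base layer exactly, and only the small A and B matrices receive gradients during finetuning.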

  • Training at Scale:
      • Data Parallelism (DP)
      • Tensor Parallelism (TP)
      • Fully Sharded Data Parallelism (FSDP); see the sketch after this list
      • ZeRO optimization stages
      • Memory management techniques
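
As a rough picture of what sharded training setup looks like, here is a minimal sketch using PyTorch's built-in FSDP wrapper. It assumes a launch via torchrun; the nn.Transformer model is a stand-in, not the workshop's actual code.

```python
# Minimal FSDP setup sketch; run with e.g. `torchrun --nproc_per_node=8 train.py`.
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")  # torchrun provides rank and world size
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = nn.Transformer().cuda()  # stand-in for a real LLM
model = FSDP(model)              # default FULL_SHARD strategy
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
```

With the default FULL_SHARD strategy, parameters, gradients, and optimizer state are all partitioned across ranks, analogous to ZeRO stage 3.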

  • Practical Implementations:
      • Setting up efficient training pipelines (a mixed-precision training step is sketched after this list)
      • Best practices for different scales of models
      • Common pitfalls and how to avoid them
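
One common ingredient of an efficient pipeline, offered here as an illustrative sketch rather than the workshop's recipe, is mixed-precision training with gradient accumulation. It reuses the causal_lm_loss helper from the first sketch and assumes model, optimizer, and loader are defined elsewhere.

```python
# Sketch of a mixed-precision training step with gradient accumulation.
import torch

scaler = torch.cuda.amp.GradScaler()  # loss scaling for float16
accum_steps = 8                       # illustrative; tune for your memory budget

for step, input_ids in enumerate(loader):
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        logits = model(input_ids)
        loss = causal_lm_loss(logits, input_ids) / accum_steps
    scaler.scale(loss).backward()          # accumulate scaled gradients
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)             # unscales gradients, then steps
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
```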

By Workshop’s End

You’ll gain the ability to:

  • Understand the full lifecycle of LLM training
  • Choose appropriate training strategies for your compute budget
  • Implement efficient finetuning techniques
  • Scale training across multiple GPUs effectively

Ready to master modern LLM training techniques? Workshop Link