MLOps for LLMs - From Development to Production

November 1, 2024 · 136 words · One minute · events

MLOps for LLMs: From Prototyping to Production

Ever wondered how to quickly prototype AI applications, scale them efficiently, and monitor their performance? This workshop covers the complete MLOps lifecycle, from rapid UI development to production deployment and monitoring.

Prerequisites

  • Basic understanding of LLMs and transformers
  • Experience with Python and basic DevOps concepts
  • Familiarity with REST APIs
  • Previous workshops in the series recommended

What You’ll Learn

  • Efficient LLM Serving with vLLM:
      • Recap on inference pipelines
      • The idea of the KV cache
      • PagedAttention for efficient KV-cache management
      • Block management and memory usage
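
The core PagedAttention idea can be illustrated with a toy allocator: instead of reserving one contiguous KV-cache slab per request, the cache is split into fixed-size blocks, and each sequence keeps a block table mapping its tokens to physical blocks. This is a minimal sketch of the concept only, not vLLM's actual implementation; all names here are made up for illustration.

```python
class ToyBlockManager:
    """Toy paged KV-cache allocator: sequences grab fixed-size
    blocks on demand and return them when they finish."""

    def __init__(self, num_blocks, block_size=4):  # vLLM's real default block size is 16
        self.block_size = block_size
        self.free = list(range(num_blocks))  # physical block ids
        self.tables = {}   # seq_id -> list of physical block ids (the "block table")
        self.lengths = {}  # seq_id -> number of cached tokens

    def append_token(self, seq_id):
        n = self.lengths.get(seq_id, 0)
        if n % self.block_size == 0:  # current block full (or no block yet)
            self.tables.setdefault(seq_id, []).append(self.free.pop(0))
        self.lengths[seq_id] = n + 1

    def release(self, seq_id):
        # Finished sequences return their blocks, so memory is never
        # over-reserved for a worst-case sequence length.
        self.free.extend(self.tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

mgr = ToyBlockManager(num_blocks=8)
for _ in range(6):          # cache 6 tokens for one request
    mgr.append_token("req-0")
print(mgr.tables["req-0"])  # 6 tokens at block size 4 -> two blocks: [0, 1]
```

Because blocks are allocated lazily, memory usage tracks actual sequence lengths, which is what lets vLLM batch many requests onto one GPU.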

  • Monitoring/Logging:
      • Experiment tracking setup
      • Custom metric logging
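
Custom metric logging can be done with nothing more than Python's standard `logging` module. The sketch below wraps a (hypothetical) generation call and logs per-request latency; the `timed_generate` name and its stand-in body are illustrative, not part of any library.

```python
import logging
import time

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
log = logging.getLogger("llm-serving")

def timed_generate(prompt):
    """Hypothetical wrapper that logs a latency metric per request."""
    start = time.perf_counter()
    completion = prompt[::-1]  # stand-in for the real model call
    latency_ms = (time.perf_counter() - start) * 1000
    # Key=value pairs in the message are easy to parse downstream.
    log.info("generate latency_ms=%.2f prompt_chars=%d", latency_ms, len(prompt))
    return completion

timed_generate("hello world")
```

The same pattern extends to experiment trackers: swap the `log.info` call for the tracker's metric-logging API while keeping the timing wrapper unchanged.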

  • Rapid Prototyping with Gradio:
      • Building interactive UI components
      • Sharing and collaboration

By Workshop’s End

You’ll gain the ability to:

  • Rapidly prototype with Gradio
  • Deploy efficiently with vLLM
  • Monitor comprehensively with Python logging

Ready to master the complete MLOps lifecycle? Workshop Link