In this episode of The Landscape, Shivay Lamba from Couchbase shares how the KitOps project helps teams deploy open-source large language models (LLMs) on Kubernetes while maintaining privacy and control. KitOps, now a CNCF sandbox project, packages ML components such as models, weights, and datasets as OCI-compliant artifacts, so they can be deployed with existing container tooling. Shivay explains how the ModelKit format lets ML projects follow cloud-native packaging practices, and how AI gateways and GPU scheduling tools like Kueue are shaping secure, scalable on-prem AI infrastructure.
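To make the ModelKit idea concrete, here is a minimal sketch of a KitOps `Kitfile`, the manifest that describes what goes into a ModelKit. The package name, file paths, and framework value below are illustrative, not from the episode:

```yaml
# Kitfile — declares the contents of a ModelKit (illustrative example)
manifestVersion: "1.0"

package:
  name: llm-demo          # hypothetical package name
  version: 1.0.0

model:
  name: my-llm
  path: ./model.safetensors   # model weights to bundle

datasets:
  - name: training
    path: ./data/train.jsonl  # training data packaged alongside the model

code:
  - path: ./src               # inference or preprocessing code
```

With a Kitfile in place, the `kit` CLI can pack the directory into an OCI artifact and push it to any OCI-compliant registry, after which the ModelKit is pulled and deployed with the same workflows used for container images.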
What you will learn in this episode:
- KitOps simplifies ML deployments with OCI artifacts: Machine learning components can be packaged and deployed using the same workflows as container images.
- ModelKit introduces a container-like format for ML stacks: Data, model weights, and parameters can be bundled together for consistent deployment across environments.
- Privacy and portability are central to private LLMs: Teams can deploy AI workloads on-prem or in private clusters without exposing sensitive data.
- The ecosystem around AI gateways and GPU management is growing: Projects like Kueue are addressing hardware-level scheduling challenges for AI workloads.
- KitOps is early-stage and open to contributors: Developers interested in Golang and cloud-native ML can get involved while the project is still maturing.
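To give a sense of the GPU scheduling mentioned above, here is a minimal sketch of Kueue resources that quota GPU capacity for a cluster and expose it to a team namespace. The queue names, namespace, and quota numbers are illustrative assumptions, not from the episode:

```yaml
# A ResourceFlavor describes a class of nodes (illustrative single flavor)
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor
---
# A ClusterQueue pools quota for CPU, memory, and GPUs cluster-wide
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: gpu-queue
spec:
  namespaceSelector: {}   # admit workloads from any namespace
  resourceGroups:
    - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
      flavors:
        - name: default-flavor
          resources:
            - name: "cpu"
              nominalQuota: 32
            - name: "memory"
              nominalQuota: 128Gi
            - name: "nvidia.com/gpu"
              nominalQuota: 8     # hypothetical GPU quota
---
# A LocalQueue gives a team namespace access to the shared ClusterQueue
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: team-a-queue
  namespace: team-a       # hypothetical team namespace
spec:
  clusterQueue: gpu-queue
```

Workloads then opt in by referencing the local queue (for example, via the `kueue.x-k8s.io/queue-name` label on a Job), and Kueue holds them until the requested GPUs fit within the quota.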