In this episode of The Landscape, Shivay Lamba from Couchbase shares how the KitOps project helps teams deploy open-source large language models (LLMs) on Kubernetes while maintaining privacy and control. KitOps, now a CNCF sandbox project, packages ML components such as models, weights, and datasets as OCI-compliant artifacts, so they can be deployed with existing container tooling. Shivay explains how the ModelKit format lets ML projects follow cloud-native packaging practices, and how AI gateways and GPU scheduling tools like Kueue are shaping secure, scalable on-prem AI infrastructure.
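To make the ModelKit idea concrete, here is a minimal sketch of a KitOps `Kitfile`, the manifest that describes what goes into a ModelKit. The package name, file paths, and framework value below are illustrative, not from the episode:

```yaml
# Kitfile — declares the contents of a ModelKit (illustrative example)
manifestVersion: "1.0"

package:
  name: llm-demo          # hypothetical package name
  version: 1.0.0

model:
  name: my-llm
  path: ./model.safetensors   # model weights to bundle

datasets:
  - name: training
    path: ./data/train.jsonl  # training data packaged alongside the model

code:
  - path: ./src               # inference or preprocessing code
```

With a Kitfile in place, the `kit` CLI can pack the directory into an OCI artifact and push it to any OCI-compliant registry, after which the ModelKit is pulled and deployed with the same workflows used for container images.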
What you will learn in this episode:
- KitOps simplifies ML deployments with OCI artifacts: Machine learning components can be packaged and deployed using the same workflows as container images.
- ModelKit introduces a container-like format for ML stacks: Data, model weights, and parameters can be bundled together for consistent deployment across environments.
- Privacy and portability are central to private LLMs: Teams can deploy AI workloads on-prem or in private clusters without exposing sensitive data.
- The ecosystem around AI gateways and GPU management is growing: Projects like Kueue are addressing hardware-level scheduling challenges for AI workloads.
- KitOps is early-stage and open to contributors: Developers interested in Golang and cloud-native ML can get involved while the project is still maturing.
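To give a sense of the GPU scheduling mentioned above, here is a minimal sketch of Kueue resources that quota GPU capacity for a cluster and expose it to a team namespace. The queue names, namespace, and quota numbers are illustrative assumptions, not from the episode:

```yaml
# A ResourceFlavor describes a class of nodes (illustrative single flavor)
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor
---
# A ClusterQueue pools quota for CPU, memory, and GPUs cluster-wide
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: gpu-queue
spec:
  namespaceSelector: {}   # admit workloads from any namespace
  resourceGroups:
    - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
      flavors:
        - name: default-flavor
          resources:
            - name: "cpu"
              nominalQuota: 32
            - name: "memory"
              nominalQuota: 128Gi
            - name: "nvidia.com/gpu"
              nominalQuota: 8     # hypothetical GPU quota
---
# A LocalQueue gives a team namespace access to the shared ClusterQueue
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: team-a-queue
  namespace: team-a       # hypothetical team namespace
spec:
  clusterQueue: gpu-queue
```

Workloads then opt in by referencing the local queue (for example, via the `kueue.x-k8s.io/queue-name` label on a Job), and Kueue holds them until the requested GPUs fit within the quota.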