
Knative: Build Serverless and Event-Driven Applications

In this episode of The Landscape, Bart and Sylvain chat with Calum Murray, a Knative contributor, about how Knative helps developers simplify building and deploying serverless applications on Kubernetes. Knative offers robust solutions for scaling, event-driven architecture, and function-as-a-service, all tailored for Kubernetes environments. Knative is an incubating project in the CNCF landscape.

In this episode, you will learn about:

  • Knative’s key components: Knative Serving, Eventing, and Functions—each designed to solve specific problems in Kubernetes application development, from auto-scaling containers to simplifying complex event communication between services.
  • Knative’s real-world impact: IBM watsonx Assistant saw a 60% reduction in training time by adopting Knative Eventing for its machine learning workflows, a perfect example of how Knative optimizes asynchronous processing.
  • Enhanced event filtering capabilities: Knative Eventing’s powerful filtering options, like CloudEvents SQL and prefix matching, allow for precise control over event routing, enabling more sophisticated event-driven applications.
  • When not to use Knative: Calum explains that for long-running or stateful workloads, Knative’s scaling capabilities might not be ideal. Additionally, workloads that don’t use HTTP as a metric for scaling may benefit from other tools.
  • How Knative fits into AI pipelines: Learn how Knative helps scale AI models after training and supports asynchronous workflows, making it an invaluable tool for AI-driven applications.


Read the transcript

Bart Farrell (00:02.415)
Okay, Calum, welcome to The Landscape. Let’s just jump right in. What problem does the project Knative solve?

Calum Murray (00:11.468)
Knative tries to solve the question of how do you build an application on top of Kubernetes? And we like to describe Knative as Kubernetes for developers. It’s got three different components, each of which solves different parts of the problem of building an application on top of Kubernetes. The first part is Knative Serving. This is often what people think about when they think about Knative. It’s about how you run a container. Specifically, when you’re running a container on Kubernetes, you usually want more than just running the container. You want a lot of other features, so Knative provides auto-scaling based on HTTP traffic. If no one’s using your container, there’s no container running. When HTTP traffic comes in, Knative will spin up the pod, and once the pod is ready, it will pass the traffic on. So there are no dropped connections along the way. Beyond that, it can do traffic splitting between different versions of your app if you’re trying to do progressive rollouts. It also cuts down on boilerplate: with a basic deployment in Kubernetes, you need to create your deployment, your service, and your ingress resource, whereas with Knative you just create a Knative Service resource and all of that gets created for you.
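As a sketch of what Calum describes, a minimal Knative Service manifest could look like the following (the service name and image are hypothetical). Applying it creates the underlying deployment, service, and routing, and the optional traffic block splits requests between revisions for progressive rollouts:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                                 # hypothetical service name
spec:
  template:
    spec:
      containers:
        - image: ghcr.io/example/hello:latest # hypothetical image
  traffic:
    - latestRevision: true                    # 90% of traffic to the newest revision
      percent: 90
    - revisionName: hello-00001               # 10% stays on a previous revision
      percent: 10
```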

Then there’s Knative Eventing, which focuses more on communication between multiple Knative services or even normal deployments in Kubernetes. Oftentimes, once you get past three or four microservices, it gets messy trying to figure out communication. Knative Eventing lets you configure how that communication should happen as Kubernetes resources, so you can separate out where your code is from how the data should be transmitted. You make an event broker, subscribe to different events in your container, and your code only needs to worry about what events to receive and send. These events are just HTTP requests with a couple of extra headers, following the CNCF CloudEvents specification. This makes it a lot easier to handle communication between many services, and it also gives you built-in features: if an event doesn’t process, Knative retries it, and it offers authorization and authentication out of the box.
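To make the "HTTP requests with a couple of extra headers" concrete: in the CloudEvents HTTP binary mode, the event's context attributes travel as `ce-` prefixed headers. A minimal sketch, using only the Python standard library (the sample event values are illustrative, not from a real broker):

```python
# Sketch of parsing a binary-mode CloudEvent from HTTP headers, per the
# CNCF CloudEvents HTTP protocol binding: ce-* headers carry the event's
# context attributes, and the request body carries the event data.

def parse_cloudevent(headers: dict, body: bytes) -> dict:
    """Collect ce-* headers into CloudEvent attributes; body becomes the data."""
    attrs = {
        name[len("ce-"):].lower(): value
        for name, value in headers.items()
        if name.lower().startswith("ce-")
    }
    # The four required context attributes in CloudEvents 1.0.
    for required in ("id", "source", "type", "specversion"):
        if required not in attrs:
            raise ValueError(f"missing required attribute: {required}")
    attrs["data"] = body
    return attrs

# Example: the kind of request a trigger might deliver to your service.
event = parse_cloudevent(
    {
        "Ce-Id": "1234",
        "Ce-Source": "/my/service",
        "Ce-Type": "com.example.order.created",
        "Ce-Specversion": "1.0",
        "Content-Type": "application/json",
    },
    b'{"orderId": 42}',
)
print(event["type"])  # com.example.order.created
```

Your handler code only deals with events like this; the broker and trigger resources decide which ones arrive.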

Then, the third major part is Knative Functions, which brings the function-as-a-service experience to Kubernetes, with the added benefit of no runtime libraries required. You can write Go code, Python, or JavaScript, and then you type func build, which builds a container, and func deploy to deploy it into your cluster as a Knative service.

Sylvain Kalache (02:47.861)
Great.

Calum Murray (02:48.074)
So when you put these three parts together, you get a comprehensive application platform.

Sylvain Kalache (02:55.518)
Can you speak a little bit about the types of use cases or products that end users are using Knative for?

Calum Murray (03:07.872)
Yeah, for sure. The simplest use case is any kind of web app you want to run on top of Kubernetes. Knative Serving is great at running web apps because it scales with HTTP traffic. So, that’s a very common use case: running web apps. We’ve also seen users building software-as-a-service platforms for deploying web apps, where each individual user’s web app is deployed as a Knative Service. Another common use case is AI and model serving. If you want a model that’s accessible via HTTP, auto-scaling is very useful. And for asynchronous processing, where multiple steps might take a while, Knative Eventing is really useful there. When one step is done, it sends an event that triggers the next step, so you don’t need your code to handle that asynchronous nature yourself.

Bart Farrell (04:03.513)
So I’ll be muted. I’ll be muted.

Sylvain Kalache (04:08.225)
As you said, Knative covers a lot of ground. It’s a project that can help in many ways. But if you had to pick one feature, your favorite feature, which one would it be?

Bart Farrell (04:08.781)
This is why we edit. Come on, go for it.

Calum Murray (04:32.248)
For me, it’s the enhanced trigger filtering. When you’re building with some kind of event-driven platform, the key is that you’re sending events to your different services, deployments, pods, or even some external URL to your cluster. Doing that in a powerful way, where you can pick which events go where, is great. Knative recently had the GA for enhanced trigger filtering, which now lets you use other things like CloudEvents SQL or prefix and suffix matching instead of just matching on a single attribute. This allows you to make more complex logic in your filtering and build really cool apps.
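A sketch of what the enhanced filtering looks like on a Knative Trigger (broker, event types, and service names here are illustrative). The entries in the top-level filters array are combined with AND, and dialects like prefix and cesql replace simple single-attribute matching:

```yaml
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: order-trigger            # hypothetical trigger name
spec:
  broker: default
  filters:
    - prefix:
        type: com.example.order. # matches order.created, order.updated, ...
    - cesql: "source LIKE '%/payments'"  # CloudEvents SQL expression
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: order-handler        # hypothetical subscriber
```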

Bart Farrell (05:17.337)
Good. Now, in terms of the best success or end-user stories, what’s one that’s really caught your attention that you’d like others to know about?

Calum Murray (05:26.999)
Yeah, recently there was a case study published on the CNCF website about how IBM Watson X Assistant used Knative Eventing. They used it to handle the training of their machine learning models. Watson X Assistant is a low-code platform where you describe a virtual agent, and it creates this agent for the user. They were orchestrating their training workloads and had some proprietary solution but then moved to Knative Eventing. They claimed it reduced training time by 60%, largely due to asynchronous processing. They could orchestrate things better with the asynchronous primitives. They also removed 40,000 lines of code and an entire microservice, which is a huge success story. Especially with AI becoming more popular, seeing it used for that kind of problem is very interesting.

Sylvain Kalache (06:26.465)
Yeah, and that paper said using Knative resulted in a 60% reduction in training time, which is huge. It helps with time-to-market and product iteration and also saves on resources. Very impressive. We will share the link to the study in the podcast and video descriptions.

Sylvain Kalache (07:28.052)
Now, Knative can be a central part of your infrastructure. Can you tell us a little bit more about which other CNCF landscape tools are integrating with Knative?

Calum Murray (07:28.052)
Sure. The first one is Kubernetes. Knative is Kubernetes-native. I think most CNCF landscape tools work with Kubernetes in some form, but Knative provides all these different building blocks for serverless, event-driven architectures on top of Kubernetes. So, the big one is Kubernetes. On the eventing side, all events follow the CNCF CloudEvents specification and also use the CloudEvents SDKs. We have strong integration with that community and contribute frequently. For example, we worked with them on the CloudEvents SQL specification and its implementation. Knative Eventing has many different implementations that use various technologies to deliver events. Recently, in the Kafka implementation, which Red Hat supports, the maintainers integrated KEDA to do auto-scaling based on the lag in the Kafka topic, since that’s not HTTP, so Knative Serving can’t be used there. It’s interesting to see Knative working with KEDA for scaling, alongside Knative’s own scaling features. Also, nearly every Knative component exposes metrics, so you can do observability with Prometheus or similar tools.
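As a rough illustration of the KEDA pattern Calum mentions, a KEDA ScaledObject can scale a workload on Kafka consumer lag instead of HTTP traffic (the target deployment, broker address, group, and topic below are hypothetical):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-dispatcher-scaler    # hypothetical name
spec:
  scaleTargetRef:
    name: kafka-event-dispatcher   # hypothetical deployment to scale
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: my-cluster-kafka:9092  # hypothetical broker
        consumerGroup: knative-group
        topic: knative-events
        lagThreshold: "50"         # scale up when lag per replica exceeds 50
```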

Bart Farrell (08:46.465)
On the KEDA side, a big shout out to one of the maintainers, Jorge Turrado, who we’ll definitely have on this program at some point. Big fan of everything they’re doing. Next question, which companies are involved in supporting Knative? It’s nice to see individual contributors, but at a company level, what are some of the key players?

Calum Murray (09:07.286)
Yeah, currently the two main companies maintaining it are Red Hat and Broadcom VMware. Historically, it was originally a Google project, so there were a lot of Google contributions, especially early on. There are also IBM contributions, though fewer recently. So, it’s primarily Red Hat and Broadcom right now.

Bart Farrell (09:36.281)
While it’s great to talk about the marvelous uses of Knative, when would you say this project is not appropriate? When should it not be used?

Calum Murray (09:45.12)
I think it depends on which component you’re using, but starting with Knative Serving, you have to look at the workload. If you’ve got a long-running workload where a single request might take an hour to process, Knative Serving might not be the best fit. Knative Serving scales based on HTTP, and there’s a timeout on how long the request can stay open. When that timeout is reached, Knative will assume there’s no more traffic and scale back down. We sometimes see users trying to process for 45 minutes, but it stops after 30 minutes, which is the timeout they had configured. You can set the timeout to be really large, and Knative will respect that, but if your job takes longer, it might not be ideal. Another thing is stateful workloads. If there’s a lot of state that needs to be maintained within a container, scaling it to zero doesn’t make sense anymore. Knative Serving isn’t great for those stateful workloads. Also, anything where the scaling metric isn’t HTTP traffic. Going back to the Kafka example, we wanted to scale based on the Kafka topic lag, which is why Knative Serving wouldn’t make sense there.
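For reference, the request timeout Calum describes is configurable per revision in the Knative Service spec; a sketch, assuming a hypothetical hour-long processing window:

```yaml
spec:
  template:
    spec:
      timeoutSeconds: 3600   # allow a request to stay open for up to an hour
```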

For Knative Eventing, if you need a synchronous response, it’s not the best fit. Knative Eventing does asynchronous event processing. You send an event, and it says “200 OK, we got it,” but it doesn’t send a response back immediately. So if you’re working with strict latency requirements, the extra network hop added by Knative Eventing may not work for you. However, there is a roadmap item coming up that would let you request a synchronous response, so that could change in six months.

Sylvain Kalache (12:17.957)
Thanks. So you spoke a little bit about AI earlier, but we’d like to zoom in more. AI is a hot topic in the tech industry right now, and everyone is trying to assemble the best pipeline. Can you tell us more about how Knative can be used in an AI pipeline?

Calum Murray (12:45.686)
Yeah, depending on what you’re trying to do in your AI pipeline, you’ll use different parts of Knative. For the pipeline specifically, you’ll often have different components that take variable amounts of time to process. You might have data transformations, then a training job, followed by post-training jobs or model validation. These steps are often slow, so using Knative Eventing for asynchronous processing makes a lot of sense. That’s where IBM saw a 60% reduction in training time—they were able to orchestrate things more effectively with real-time asynchronous processing.

Another place Knative is used frequently is after training, when you want to run a model and scale it based on HTTP traffic. You don’t want to run these models if no one’s using them, as they’re expensive to run. Knative Serving can scale these models down to zero, or even just down to one, to reduce costs. At KubeCon Chicago, a company talked about using Knative, but instead of scaling down to zero, they scaled down to one because their requests sometimes took a long time. Knative queued up the traffic so they wouldn’t drop any requests. If you’re using KServe, another CNCF project, Knative Serving is used internally for serverless model inference.

We’ve also been looking at building agents for language models. You want the language model to call external tools, and in the Knative community, we’ve been exploring how to use Knative Eventing to manage those tools. It’s a nice abstraction for the language model: just send an event, and you don’t need to worry about where it’s going. It could go to one or multiple destinations. The synchronous response feature we mentioned earlier is important for this, as it allows you to send an event and get a response back for the language model to move forward. This makes it easier to build language model agents on Kubernetes.

Sylvain Kalache (15:08.545)
Some of our audience may not know what “scale to zero” and “scale to one” mean. Could you briefly explain those terms?

Calum Murray (15:20.364)
Yeah, for sure. “Scale to zero” means there’s no container running—so the pod goes away. In Knative, if there’s no traffic, the pods don’t need to run. “Scale to one” means you set the minimum number of pods to one instead of zero. So Knative won’t completely turn off the pod; it will just scale it down to one. Normally, in Kubernetes, a horizontal pod autoscaler scales down to one and up from there. Knative lets you scale down all the way to zero when needed.
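In Knative Serving, this is controlled with an autoscaling annotation on the revision template; a minimal sketch:

```yaml
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "1"  # "0" (the default) allows scale to zero
```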

Bart Farrell (15:54.479)
Since you’ve been working on the project for a while, are there any shout-outs you’d like to give to people who have been instrumental in your growth, whether they’re other maintainers or contributors?

Calum Murray (16:04.992)
Yeah, a couple of people come to mind. Most of my contributions on the technical side have been within the eventing part of Knative, and I’d especially like to shout out Pierangelo and Christoph. They both spent a lot of time mentoring me and helping me get started when I joined the project, and they continued to support me throughout my time contributing. I’ve also been involved in various community initiatives, like reviving the UX working group within the community, and Ali, who is on the steering committee, was really helpful. He showed me how to run community initiatives and who to talk to for different issues.

Bart Farrell (16:40.399)
Well-known community leader—no surprise getting a shout-out on this podcast! Much respect to Ali, as well as the other folks you mentioned. In terms of the next steps for Knative, what’s on the project roadmap that you’re excited about?

Calum Murray (16:59.752)
I think I mentioned it briefly earlier, but I’m most excited about the request-reply feature, which will allow you to get synchronous responses out of a Knative Eventing system. This is useful for AI use cases, where a language model calls out to tools, but it’s also useful if you want to send front-end requests into an event mesh and get a reply back instead of just receiving a 202 Accepted acknowledgment. If you’re building a web app, it can be annoying to track replies using websockets, so this feature will make things easier. I’m really excited to see it land soon.

Bart Farrell (17:43.535)
With everything you’ve shared, it’s clear that Knative is a thriving, dynamic project. For people out there interested in getting involved, how can they contribute? What kinds of contributors are you looking for right now?

Calum Murray (17:56.47)
The best way to get involved is to join the CNCF Slack. There are a whole bunch of Knative channels. Join any of them, ask around, say hi. I recommend Slack because sometimes you might want more context than what’s in GitHub issues. Maybe you have an idea for something and want to discuss it—Slack is usually the fastest way to get an answer. We also have weekly or bi-weekly meetings for the various working groups. You can ask about them on Slack or find them on the Knative website.

If you want to contribute code, you can go to GitHub and look for issues labeled “help wanted.” Knative is a bit harder to contribute to than some other projects, in my opinion, but it’s not impossible. I was a student when I started and eventually became a maintainer. If you’re willing to learn and put in some effort, you can do it. However, it does go deep into Kubernetes internals and networking, so there’s a learning curve.

Bart Farrell (19:03.515)
It’s no secret that, of the three of us, you’re definitely the youngest. What do you do when you’re not working on Knative? I’m curious about how you balance your time because I see some sheet music behind you. For those listening to the podcast, it seems Calum is a musician. Can you tell us more about what you do outside the project?

Calum Murray (19:31.874)
Yeah, I play piano. I was classically trained throughout elementary and high school, and now I just play for fun. I don’t practice enough to perform anymore, but it’s a nice hobby. I also like to swim—I used to do competitive swimming. So, I swim, go to the gym to stay in shape, and hang out with my friends. I’m going back to school in the fall, so I’ll be balancing my classes with maintaining Knative too.

Bart Farrell (20:01.423)
Of course. For other young folks out there interested in getting involved in the CNCF, we often hear that people feel overwhelmed. You mentioned a learning curve. What would your advice be for those who are considering it but aren’t sure?

Calum Murray (20:20.024)
Yeah, it can definitely be overwhelming when you’re getting started. One of the classic overwhelming things is opening the CNCF landscape and seeing 200 different things, wondering where to start. Or you might go to a repository with thousands of files and issues. In my experience, just pick a project. It doesn’t need to be anything in particular—hopefully something you find interesting. When you’re new to the cloud space, you might not even know what you’re interested in yet. I picked Knative because I thought serverless was cool. I didn’t really know what it was, but it was a buzzword and sounded interesting. Once I was there, I focused on understanding what problems Knative was solving, and staying in one community for a while had a lot of value. You start to understand not just the problem the community is solving, but also adjacent problems and projects. Now I feel like I have a better foundation to go to another project, decide what it’s doing, and find it interesting. So my advice is to focus somewhere, stay there, learn, meet people, go to meetings, and ask questions. Don’t try to contribute to 10 or 20 projects at once. Just pick one to start.

Bart Farrell (21:39.939)
Love that. Keep it simple—no need to boil the ocean. Rome wasn’t built in a day. Calum, thank you for sharing your time and experience with us. For folks interested in learning more about Knative, it couldn’t be easier: jump into the Slack channel, attend a meeting, and ask questions. People will be ready to help you out. Hope we cross paths soon at KubeCon, KCD, or a Cloud Native Community Group somewhere in the world. Keep up the amazing work. Much appreciated.

Calum Murray (22:01.069)
Yeah.

Calum Murray (22:07.202)
Thanks for having me on.

Bart Farrell (22:08.665)
Pleasure. Take care.