# Install

## Dependencies
KServe does not strictly require any other component to run, since it is a standalone controller that leverages Kubernetes CRDs.
However, if you are going to serve models stored in S3, you will naturally need to deploy an S3-compatible object store and configure access to it.
## Deploy using helmfile

A helmfile is prepared for KServe:

```shell
helmfile sync -f helmfile_ai.yaml.gotmpl -l app=model-serving
```
## Configuration

If you store your models in S3, you must configure access to it through a custom service account.
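As a sketch, KServe can pick up S3 credentials from a Secret whose annotations describe the endpoint, referenced by a service account; all names, the endpoint, and the region below are illustrative:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials                              # illustrative name
  annotations:
    serving.kserve.io/s3-endpoint: s3.example.com   # your S3 endpoint (assumption)
    serving.kserve.io/s3-usehttps: "1"
    serving.kserve.io/s3-region: us-east-1          # adjust to your deployment
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: <access-key>
  AWS_SECRET_ACCESS_KEY: <secret-key>
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: s3-sa           # reference this from the InferenceService predictor
secrets:
  - name: s3-credentials
```

An InferenceService can then point at this service account via `spec.predictor.serviceAccountName`.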
You can change the tags and repositories for serving runtime images, as well as security parameters.
You can enable optional components:
- ModelMesh: platform for high-scale, high-density serving of machine learning models. It's designed to optimize model serving for scenarios with frequently changing models and high request volumes.
- LocalModel: a feature that lets you serve machine learning models directly from local storage within your Kubernetes cluster. This is particularly useful for quickly deploying and testing models without relying on external storage solutions.
## Usage

KServe's main function is to make deploying machine learning models easy. Provided everything is configured correctly, creating a single InferenceService custom resource should be enough: it instantiates an inference server in a pod, serving your chosen model with the appropriate parameters.
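For instance, a minimal InferenceService for a scikit-learn model could look like the following sketch, where the resource name, service account, and `storageUri` are placeholders to adapt to your setup:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-example
spec:
  predictor:
    serviceAccountName: s3-sa   # a service account configured with S3 credentials (illustrative)
    model:
      modelFormat:
        name: sklearn
      storageUri: s3://models/sklearn/example   # illustrative bucket and path
```

Once applied, KServe reconciles this resource into a deployment running the matching serving runtime with the model pulled from the given storage URI.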
We will outline several examples of how to deploy an inference service. In all of them, the models are stored in S3.