# Install

## Dependencies
KServe does not strictly require any other component to run, since it is a standalone controller that leverages Kubernetes CRDs.
However, if you are going to serve models stored in S3, you will naturally need to deploy an S3-compatible object store and configure access to it.
## Deploy using helmfile

A helmfile is prepared for KServe:

```shell
helmfile sync -f helmfile_ai.yaml.gotmpl -l app=model-serving
```
## Configuration

If you store your models in S3, you must configure access to it through a custom service account.
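As a sketch, KServe can pick up S3 credentials from a Secret whose annotations describe the endpoint, referenced by a service account; all names, the endpoint, and the region below are illustrative:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials                              # illustrative name
  annotations:
    serving.kserve.io/s3-endpoint: s3.example.com   # your S3 endpoint (assumption)
    serving.kserve.io/s3-usehttps: "1"
    serving.kserve.io/s3-region: us-east-1          # adjust to your deployment
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: <access-key>
  AWS_SECRET_ACCESS_KEY: <secret-key>
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: s3-sa           # reference this from the InferenceService predictor
secrets:
  - name: s3-credentials
```

An InferenceService can then point at this service account via `spec.predictor.serviceAccountName`.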
You can change the tags and repositories for serving runtime images, as well as security parameters.
You can enable optional components:
- ModelMesh: platform for high-scale, high-density serving of machine learning models. It's designed to optimize model serving for scenarios with frequently changing models and high request volumes.
- LocalModel: a feature that lets you serve machine learning models directly from local storage within your Kubernetes cluster. This is particularly useful for quickly deploying and testing models without relying on external storage solutions.
## Usage

KServe's main function is to make deploying machine learning models easy. Provided everything is configured correctly, creating a single InferenceService custom resource should be enough: it instantiates an inference server in a pod, serving your chosen model with the appropriate parameters.
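For instance, a minimal InferenceService for a scikit-learn model could look like the following sketch, where the resource name, service account, and `storageUri` are placeholders to adapt to your setup:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-example
spec:
  predictor:
    serviceAccountName: s3-sa   # a service account configured with S3 credentials (illustrative)
    model:
      modelFormat:
        name: sklearn
      storageUri: s3://models/sklearn/example   # illustrative bucket and path
```

Once applied, KServe reconciles this resource into a deployment running the matching serving runtime with the model pulled from the given storage URI.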
We will outline several examples of how to deploy an inference service. In all of them, the models are stored in S3.