Deploy Model to KServe
Install Required Dependencies
- Install Docker Desktop
- Try to run `docker ps`
- If you get a permissions error, follow the instructions here
- Install KServe (one way to do this is sketched below)
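The exact steps depend on your environment; as a sketch, the KServe quickstart script installs KServe along with its dependencies (Istio, Knative) on an existing cluster such as minikube. The release tag below is an assumption; check the KServe documentation for the current version:

```bash
# Quick install of KServe and its dependencies on an existing cluster.
# The release tag is an assumption; adjust to the current KServe release.
curl -s "https://raw.githubusercontent.com/kserve/kserve/release-0.11/hack/quick_install.sh" | bash
```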
Define required variables
There are some environment variables that must be defined for KServe to work:
- INTERFACE: must be set to kserve
- HTTP_PORT: the port KServe will listen on
- PROTOCOL: the KServe inference protocol version, either v1 or v2
- MODEL_NAME: the name under which the model will be served
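These same variables appear under `env` in the InferenceService manifest below. To smoke-test the image locally before deploying, you could pass them with `-e` (a minimal sketch using this tutorial's image):

```bash
# A minimal local smoke test; the same values are set via `env`
# in the InferenceService manifest below.
docker run --rm -p 8080:8080 \
  -e INTERFACE=kserve \
  -e HTTP_PORT=8080 \
  -e PROTOCOL=v1 \
  -e MODEL_NAME=digits \
  bmunday131/sklearn-digits:0.0.1
```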
Deploy the model
For this tutorial, we will use the Chassis-generated container image uploaded as `bmunday131/sklearn-digits`. To deploy to KServe, we will use a file that defines the `InferenceService` for the v1 protocol of KServe.
```yaml
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: chassisml-sklearn-demo
spec:
  predictor:
    containers:
      - image: bmunday131/sklearn-digits:0.0.1
        name: chassisml-sklearn-demo-container
        imagePullPolicy: IfNotPresent
        env:
          - name: INTERFACE
            value: kserve
          - name: HTTP_PORT
            value: "8080"
          - name: PROTOCOL
            value: v1
          - name: MODEL_NAME
            value: digits
        ports:
          - containerPort: 8080
            protocol: TCP
```
Setting `MODEL_NAME` should not be necessary, since it is defined when creating the image. Apply the manifest; this should output a success message.
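For example, assuming the manifest above was saved as `inference-service.yaml` (a hypothetical filename):

```bash
# The filename is hypothetical; use whatever you saved the manifest as.
kubectl apply -f inference-service.yaml
```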
Deploy from Private Docker Registry
In the above example, we deploy a public container image, which means we do not need to define credentials to pull the image. If, however, you set up Chassis to push container images to a private registry, you will need to add a few lines to your yaml file.
First, create a Kubernetes Secret that contains your registry credentials; you will reference it from the `InferenceService` through `imagePullSecrets`.
```bash
kubectl create secret docker-registry <registry-credential-secrets> \
  --docker-server=<private-registry-url> \
  --docker-email=<private-registry-email> \
  --docker-username=<private-registry-user> \
  --docker-password=<private-registry-password>
```
Visit Managing Secrets using kubectl for more details.
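You can verify that the secret exists before referencing it:

```bash
# Confirm the secret was created (the name matches the command above).
kubectl get secret <registry-credential-secrets>
```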
Next, add the following lines to your yaml file:
```yaml
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: chassisml-sklearn-demo
spec:
  predictor:
    imagePullSecrets:
      - name: <registry-credential-secrets>
    containers:
      - image: bmunday131/sklearn-digits:0.0.1
        name: chassisml-sklearn-demo-container
        imagePullPolicy: IfNotPresent
        env:
          - name: INTERFACE
            value: kserve
          - name: HTTP_PORT
            value: "8080"
          - name: PROTOCOL
            value: v1
          - name: MODEL_NAME
            value: digits
        ports:
          - containerPort: 8080
            protocol: TCP
```
Finally, apply your changes:
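As before, this is a sketch assuming the updated manifest is saved as `inference-service.yaml`:

```bash
kubectl apply -f inference-service.yaml
```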
Define required variables to query the pod
This is needed in order to communicate with the deployed image.

The `SERVICE_NAME` must match the name defined in the `metadata.name` of the `InferenceService` created above. The `MODEL_NAME` must match the name of your model; it can be defined by the data scientist when making the request against the Chassis service, or overridden in the `InferenceService` as shown above.
Mac:
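As a sketch, assuming minikube with the Docker driver (where the cluster IP is not directly reachable from the host), you can port-forward the Istio ingress gateway and point the variables at the forwarded port:

```bash
# Run the port-forward in the background (or in a separate terminal).
kubectl -n istio-system port-forward service/istio-ingressgateway 8081:80 &
export INGRESS_HOST=localhost
export INGRESS_PORT=8081
```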
Linux:
```bash
export INGRESS_HOST=$(minikube ip)
export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')
```
Mac or Linux:
```bash
export SERVICE_NAME=chassisml-sklearn-demo
export MODEL_NAME=digits
export SERVICE_HOSTNAME=$(kubectl get inferenceservice ${SERVICE_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)
```
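As a quick sanity check, you can confirm the `InferenceService` is ready before sending requests:

```bash
# READY should be True and the URL populated before querying the model.
kubectl get inferenceservice ${SERVICE_NAME}
```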
Query the model
Please note that you must base64 encode each input instance. For example:
```python
import json
import base64 as b64

# Each input instance must be base64 encoded before it is sent to KServe.
instances = [[1, 2, 3, 4], [5, 6, 7, 8]]
input_dict = {'instances': [b64.b64encode(str(entry).encode()).decode() for entry in instances]}

# Write the encoded payload to a file that can be passed to curl.
json.dump(input_dict, open('kserve_input.json', 'w'))
```
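The resulting `kserve_input.json` holds one base64 string per instance:

```json
{"instances": ["WzEsIDIsIDMsIDRd", "WzUsIDYsIDcsIDhd"]}
```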
Now you can make a request to predict some data. Note that you must download `inputsv1.json` before making the request.
```bash
curl -H "Host: ${SERVICE_HOSTNAME}" "http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict" -d@inputsv1.json | jq
```
The output should be similar to this:
```json
{
  "predictions": [
    {
      "data": {
        "drift": null,
        "explanation": null,
        "result": {
          "classPredictions": [{"class": "4", "score": "1"}]
        }
      }
    },
    {
      "data": {
        "drift": null,
        "explanation": null,
        "result": {
          "classPredictions": [{"class": "8", "score": "1"}]
        }
      }
    },
    {
      "data": {
        "drift": null,
        "explanation": null,
        "result": {
          "classPredictions": [{"class": "8", "score": "1"}]
        }
      }
    },
    {
      "data": {
        "drift": null,
        "explanation": null,
        "result": {
          "classPredictions": [{"class": "4", "score": "1"}]
        }
      }
    },
    {
      "data": {
        "drift": null,
        "explanation": null,
        "result": {
          "classPredictions": [{"class": "8", "score": "1"}]
        }
      }
    }
  ]
}
```
In this case, the data was prepared for protocol v1, but we could instead deploy the image using protocol v2 and make the request with v2-formatted data.
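As a rough sketch of the difference (illustrative only, not the exact contents of `inputsv2.json`), a v2 request wraps the same encoded data in the Open Inference Protocol's `inputs` structure; the tensor name and shape here are assumptions:

```json
{
  "inputs": [
    {
      "name": "input-0",
      "shape": [2],
      "datatype": "BYTES",
      "data": ["WzEsIDIsIDMsIDRd", "WzUsIDYsIDcsIDhd"]
    }
  ]
}
```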
Deploy the model locally
The model can also be deployed locally:
```bash
docker run --rm -p 8080:8080 \
  -e INTERFACE=kserve \
  -e HTTP_PORT=8080 \
  -e PROTOCOL=v2 \
  -e MODEL_NAME=digits \
  bmunday131/sklearn-digits:0.0.1
```
Then we can query it locally. Note that you must download `inputsv2.json` before making the request:
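As a sketch, assuming the container exposes the standard v2 inference endpoint for the model named `digits`:

```bash
# The v2 protocol uses the /v2/models/<name>/infer route.
curl -H "Content-Type: application/json" "http://localhost:8080/v2/models/digits/infer" -d@inputsv2.json | jq
```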
Tutorial in Action
Follow along as we walk through this tutorial step by step!