Interfaces
Chassis containers currently support two interfaces: the Open Model Interface (OMI) and KServe v1. These interfaces provide a way to interact with containerized models, primarily for running inferences. When building a model container with Chassis, you can specify which of these two interfaces you wish to use. The sections below describe how to interact with model containers through each interface.
OMI
The Open Model Interface is a specification for a multi-platform, OCI-compliant container image designed specifically for machine learning models.
The OMI server provides a gRPC interface defined by OMI's protofile. This interface exposes three remote procedure calls (RPCs) for interacting with a Chassis container: Status, Run, and Shutdown. The Run RPC is the most important, as it provides a simple way to run inferences through models packaged into Chassis container images.
RPC | Input | Description | Example Response |
---|---|---|---|
Status | None | Returns the status of a running model. | { "inputs": [...], "outputs": [...], "status_code": 200, "status": "OK", "message": "Model Initialized Successfully.", "model_info": { "model_name": "Digits Classifier", "model_version": "0.0.1", "model_author": "", "model_type": "grpc", "source": "chassis" }... } |
Run | A run request message containing one or more key-value pairs, each representing a single model input. | Submits data to a running model for inference and returns the model's results. | [{"data": {"result": {"classPredictions": [{"class": 5, "score": 0.71212}]}}}] |
Shutdown | None | Sends the model container a shutdown message. | { "status_code": 200, "status": "OK", "message": "Model Shutdown Successfully." } |
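For a concrete picture of these RPCs, the sketch below calls Status and Run using Python stubs generated from the OMI protofile with grpcio-tools. The module, service, and message names (model_pb2, ModzyModelStub, InputItem) and the port 45000 are assumptions for illustration; substitute whatever your generated stubs and container actually use.

```python
# Hypothetical sketch: calling an OMI container with stubs generated from
# the OMI protofile, e.g.:
#   python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. model.proto
# The module, service, and message names below are illustrative assumptions.
import grpc
import model_pb2
import model_pb2_grpc

with grpc.insecure_channel("localhost:45000") as channel:
    stub = model_pb2_grpc.ModzyModelStub(channel)

    # Status RPC: confirm the model initialized successfully.
    status = stub.Status(model_pb2.StatusRequest())
    print(status.status)  # e.g. "OK"

    # Run RPC: each input item maps input names to raw bytes.
    request = model_pb2.RunRequest(
        inputs=[model_pb2.InputItem(input={"input": b"<raw input bytes>"})]
    )
    response = stub.Run(request)
    print(response.outputs[0].output)  # results keyed by output name
```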
Best ways to work with OMI models
Because OMI models use gRPC rather than RESTful APIs, there are three ways to build applications that interact with an OMI model:
- [RECOMMENDED] Via the OMI Python client, which is installed automatically when you `pip install chassisml` (see the sketch after this list)
- By building a language-specific client directly from the OMI protofile (the best option for non-Python applications)
- By building a language-specific client using server reflection against a running Chassis model container
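For most Python applications, the first option is the simplest. Below is a minimal sketch using the OMI Python client against a container already running on localhost:45000; the host, port, input key, and file name are assumptions for illustration, so check the chassisml documentation for the exact client API in your installed version.

```python
# Minimal sketch using the OMI Python client installed with chassisml.
# Host, port, input key, and file name are assumptions for illustration.
from chassis.client import OMIClient

client = OMIClient("localhost", 45000)

# Confirm the model container is up and initialized.
status = client.status()
print(status.status)  # e.g. "OK"

# Submit one inference; inputs are key-value pairs of raw bytes.
with open("digit.png", "rb") as f:
    res = client.run([{"input": f.read()}])
print(res.outputs[0].output)  # results keyed by output name
```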
KServe V1
KServe's v1 protocol offers a standardized prediction workflow across all model frameworks. This protocol version is still supported, but KServe recommends migrating to the v2 protocol for better performance and standardization among serving runtimes. However, if a use case requires a more flexible schema than the v2 protocol provides, the v1 protocol remains an option.
API | Verb | Path | Request Payload | Response Payload |
---|---|---|---|---|
List Models | GET | /v1/models | | {"models": [&lt;model_name&gt;]} |
Model Ready | GET | /v1/models/&lt;model_name&gt; | | {"name": &lt;model_name&gt;, "ready": $bool} |
Predict | POST | /v1/models/&lt;model_name&gt;:predict | {"instances": []} ** | {"predictions": []} |
Explain | POST | /v1/models/&lt;model_name&gt;:explain | {"instances": []} ** | {"predictions": [], "explanations": []} |
See KServe's documentation for full details: https://kserve.github.io/website/0.10/modelserving/data_plane/v1_protocol/
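As a quick illustration, the v1 endpoints can be exercised with plain HTTP. The sketch below assumes a container serving the KServe v1 interface on localhost:8080 with a model named "digits"; adjust the host, port, model name, and instance shape to match your container.

```python
# Minimal sketch of KServe v1 readiness and predict calls over HTTP.
# Host, port, model name, and input shape are assumptions for illustration.
import requests

BASE = "http://localhost:8080"
MODEL = "digits"

# Model Ready: GET /v1/models/<model_name>
ready = requests.get(f"{BASE}/v1/models/{MODEL}")
print(ready.json())  # e.g. {"name": "digits", "ready": true}

# Predict: POST /v1/models/<model_name>:predict with an "instances" list
payload = {"instances": [[0.0, 1.5, 3.2]]}  # model-specific input shape
resp = requests.post(f"{BASE}/v1/models/{MODEL}:predict", json=payload)
resp.raise_for_status()
print(resp.json())  # e.g. {"predictions": [...]}
```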