# omi

## OMIClient
Provides a convenient client for interacting with a model running the OMI (Open Model Interface) server. The API for this object is asynchronous, like the underlying grpclib gRPC client.
Attributes:

Name | Type | Description
---|---|---
`client` | | Access to the underlying gRPC client.
### Example
### status (async)
Queries the model to get its status.
The first time this method is called it will also initialize the model, giving it the opportunity to load any assets or perform any setup required to perform inferences.
Returns:

Type | Description
---|---
`StatusResponse` | The status of the model.
### run (async)
Perform an inference.
The `inputs` parameter represents a batch of inputs to send to the model. If the model supports batching, it will process the inputs in groups according to its batch size. If the model does not support batching, it will loop through each input and process it individually.

Each input in `inputs` is a dictionary that allows multiple pieces of data to be supplied to the model for each discrete inference. The key should match the key name expected by the model (e.g. the first value supplied in the `ChassisModel.metadata.add_input` method), and the value should always be of type `bytes`. The bytes should be decodable using one of the model's declared media types for that key (e.g. the `accepted_media_types` argument in `ChassisModel.metadata.add_input`).

To enable drift detection and/or explainability on models that support it, set the appropriate parameters to `True`.

The `RunResponse` object mirrors the structure of the inputs. Its `outputs` property is an array that corresponds to the batch of inputs: the item at each index of `outputs` is the inference result for the input at the same index in `inputs`. Similarly to the inputs, each inference result is a dictionary that can return multiple pieces of data per inference. The key and media type of each `bytes` value should match the values supplied in `ChassisModel.metadata.add_output`.
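For example, a two-item batch where each input carries a single key can be built with a small helper like the one below. The key name `"input"` is illustrative; use whatever key your model declared via `ChassisModel.metadata.add_input`.

```python
from typing import List, Mapping, Sequence


def build_batch(texts: Sequence[str]) -> List[Mapping[str, bytes]]:
    """Encode each text as a one-key input dict; values must be bytes."""
    # "input" is an illustrative key name; match your model's declared key.
    return [{"input": t.encode("utf-8")} for t in texts]


batch = build_batch(["first example", "second example"])
# batch[0] == {"input": b"first example"}
# batch[1] == {"input": b"second example"}
```

Passing `batch` to `run` (e.g. `res = await client.run(batch, detect_drift=True)`) yields a response where, per the description above, `res.outputs[i]` is the inference result for `batch[i]`.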
Parameters:

Name | Type | Description | Default
---|---|---|---
`inputs` | `Sequence[Mapping[str, bytes]]` | The batch of inputs to supply to the model. See above for more information. | *required*
`detect_drift` | `bool` | Whether to enable drift detection on models that support it. | `False`
`explain` | `bool` | Whether to enable explainability on models that support it. | `False`
Returns:

Type | Description
---|---
`RunResponse` | See above for more details.
### shutdown (async)
Tells the model to shut itself down. The container will immediately shut down upon receiving this call.
### test_container (async classmethod)

`test_container(container_name, inputs, tag='latest', port=45000, timeout=10, pull=True, detect_drift=False, explain=False)`
Tests a container. This method will use your local Docker engine to spin up the named container, perform an inference against it with the given `inputs`, and return the result.
Parameters:

Name | Type | Description | Default
---|---|---|---
`container_name` | `str` | The full name of the container without the tag. | *required*
`inputs` | `Sequence[Mapping[str, bytes]]` | A batch of input(s) to perform inference on. See `chassis.client.OMIClient.run` for more information. | *required*
`tag` | `str` | The tag of the image to test. | `'latest'`
`port` | `int` | The port on the host that the container should map to. | `45000`
`timeout` | `int` | A timeout in seconds to wait for the model container to become available. | `10`
`pull` | `bool` | Whether to pull the image if it doesn't exist in your local image cache. | `True`
`detect_drift` | `bool` | Whether to enable drift detection on models that support it. | `False`
`explain` | `bool` | Whether to enable explainability on models that support it. | `False`
Returns:

Type | Description
---|---
`Optional[RunResponse]` | See `chassis.client.OMIClient.run` for more information. |