Ventu API

class ventu.ventu.Ventu(req_schema, resp_schema, use_msgpack=False, *args, **kwargs)

    Ventu: built for deep learning model serving.

    - Parameters
        - req_schema – request schema defined with pydantic.BaseModel
        - resp_schema – response schema defined with pydantic.BaseModel
        - use_msgpack (bool) – whether to use msgpack for serialization (default: JSON)
        - args – additional positional arguments
        - kwargs – additional keyword arguments

    To create a model service, inherit this class and implement:

    - preprocess (optional)
    - postprocess (optional)
    - inference (for a standalone HTTP service)
    - batch_inference (when working with the batching service)
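The hook pipeline above can be sketched without the library itself. `VentuLike` and `EchoLength` below are hypothetical stdlib-only stand-ins (not part of ventu) showing how the hooks compose for single and batched requests:

```python
class VentuLike:
    """Hypothetical mock of the hook pipeline described above."""

    def preprocess(self, data):          # optional hook
        return data

    def postprocess(self, data):         # optional hook
        return data

    def inference(self, data):           # for a standalone HTTP service
        raise NotImplementedError

    def batch_inference(self, batch):    # for the batching service
        raise NotImplementedError

    def single(self, raw):
        # single request: preprocess -> inference -> postprocess
        return self.postprocess(self.inference(self.preprocess(raw)))

    def batched(self, raws):
        # batched requests: each item is preprocessed, the whole list
        # is inferred at once, and each result is postprocessed
        batch = [self.preprocess(r) for r in raws]
        return [self.postprocess(r) for r in self.batch_inference(batch)]


class EchoLength(VentuLike):
    """Toy model: returns the length of the input text."""

    def preprocess(self, data):
        return data.strip()

    def inference(self, data):
        return len(data)

    def batch_inference(self, batch):
        return [len(item) for item in batch]

    def postprocess(self, data):
        return {"length": data}
```

Note how `preprocess` and `postprocess` are shared between the single and batched paths, which is why the docs below describe their inputs as "one item of" the batch.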
    property app

        Falcon application with SpecTree validation.
    batch_inference(batch)

        Run batch inference on the preprocessed data.

        - Parameters
            - batch – a list of data items after preprocess
        - Returns
            - a list of inference results
    health_check(batch=False)

        Health check for model inference (can also be used to warm up the model).

        - Parameters
            - batch (bool) – check batch inference instead of single inference (default: single)
        - Returns (bool)
            - True if the health check passed
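The idea behind a health check like this can be illustrated with a toy version: push example data through the infer function and report whether it succeeds. This is a hypothetical sketch (the real method draws its example data from the schema definitions), not ventu's implementation:

```python
def toy_health_check(infer, example, batch=False):
    """Run one dummy query through the model; True if nothing raises."""
    try:
        if batch:
            infer([example])   # the batch path expects a list of items
        else:
            infer(example)
    except Exception:
        return False
    return True


# A model that works on numbers passes; one that divides by zero fails.
ok = toy_health_check(lambda x: x + 1, 41)
bad = toy_health_check(lambda x: 1 / x, 0)
```

Running the check at startup also warms the model, since the first real query no longer pays any lazy-initialization cost.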
    inference(data)

        Run inference on the preprocessed data.

        - Parameters
            - data – data after preprocess
        - Returns
            - the inference result
    postprocess(data)

        Postprocess the inference result.

        - Parameters
            - data – data after inference, or one item of the batch_inference results
        - Returns
            - data as defined in resp_schema
    preprocess(data)

        Preprocess the request data.

        - Parameters
            - data – data as defined in req_schema
        - Returns
            - the input data of inference, or one item of the input data of batch_inference
    run_http(host=None, port=None)

        Run the HTTP service.

        - Parameters
            - host (str) – host address
            - port (int) – service port
    run_tcp(host=None, port=None)

        Run as an inference worker over TCP.

        - Parameters
            - host (str) – host address
            - port (int) – service port
    run_unix(addr=None)

        Run as an inference worker over a Unix domain socket.

        - Parameters
            - addr (str) – socket file address
    property sock

        Socket used for communication with the batching service; an instance of ventu.protocol.BatchProtocol.
Config

See pydantic.BaseSettings.

class ventu.config.Config(_env_file=<sentinel>, *, name: str = 'Deep Learning Service', version: str = 'latest', host: str = 'localhost', port: ventu.config.ConstrainedIntValue = 8000, socket: str = 'batching.socket')

    Default config; values can be overridden with environment variables beginning with ventu_.

    - Variables
        - name – default service name shown in OpenAPI
        - version – default service version shown in OpenAPI
        - host – default host address for the HTTP service
        - port – default port for the HTTP service
        - socket – default socket file used to communicate with the batching service
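Since Config builds on pydantic.BaseSettings, each field can be overridden by an environment variable carrying the ventu_ prefix. The stdlib-only `load_config` below is a hypothetical sketch of that lookup rule, not ventu's code:

```python
import os

# Defaults mirroring the Config signature above.
DEFAULTS = {
    "name": "Deep Learning Service",
    "version": "latest",
    "host": "localhost",
    "port": 8000,
    "socket": "batching.socket",
}


def load_config(environ=None):
    """Resolve each field: env var "ventu_<field>" beats the default."""
    environ = os.environ if environ is None else environ
    config = {}
    for field, default in DEFAULTS.items():
        raw = environ.get(f"ventu_{field}")
        if raw is None:
            config[field] = default
        else:
            # coerce the string env value to the default's type,
            # roughly as pydantic would
            config[field] = type(default)(raw)
    return config
```

For example, exporting `ventu_port=9000` before starting the service changes the HTTP port without touching code.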
Protocol

class ventu.protocol.BatchProtocol(infer, req_schema, resp_schema, use_msgpack)

    Protocol used to communicate with the batching service.

    - Parameters
        - infer – model infer function (combines preprocess, batch_inference, and postprocess)
        - req_schema – request schema defined with pydantic
        - resp_schema – response schema defined with pydantic
        - use_msgpack (bool) – whether to use msgpack for serialization (default: JSON)

    process(conn)

        Process batch queries and return the inference results.

        - Parameters
            - conn – socket connection
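The docs above do not pin down the wire format, so the sketch below assumes a simple length-prefixed JSON framing to show what a `process(conn)`-style round trip does: read a batch of queries from the socket, run infer, write the results back. All names here are hypothetical, not ventu.protocol internals:

```python
import json
import socket
import struct


def recv_exact(conn, n):
    """Read exactly n bytes from the socket."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed")
        buf += chunk
    return buf


def recv_frame(conn):
    # 4-byte big-endian length header, then a JSON payload
    size, = struct.unpack(">I", recv_exact(conn, 4))
    return json.loads(recv_exact(conn, size))


def send_frame(conn, obj):
    payload = json.dumps(obj).encode()
    conn.sendall(struct.pack(">I", len(payload)) + payload)


def process(conn, infer):
    """Handle one round trip: read a batch, infer, send results back."""
    send_frame(conn, infer(recv_frame(conn)))


# Demo over an in-process socket pair standing in for the batching service:
service, worker = socket.socketpair()
send_frame(service, [1, 2, 3])                          # service sends a batch
process(worker, lambda batch: [x * 2 for x in batch])   # worker runs inference
results = recv_frame(service)                           # service reads results
```

A worker in this scheme loops on `process(conn, infer)`, which is why the same `infer` callable (preprocess + batch_inference + postprocess) is handed to the protocol at construction time.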
HTTP service

class ventu.service.ServiceStatus(*, inference: ventu.service.StatusEnum, service: ventu.service.StatusEnum = StatusEnum.ok)

    Service health status.
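From the signature above, the status model pairs an inference status with a service status that defaults to healthy (`StatusEnum.ok`, whose value is `'OK'`). A stdlib sketch of that shape; the `error` member is an assumption, since only `ok` appears in the signature:

```python
from dataclasses import dataclass
from enum import Enum


class StatusEnum(Enum):
    ok = "OK"
    error = "ERROR"   # assumed member; only `ok` appears in the signature above


@dataclass
class ServiceStatus:
    inference: StatusEnum
    service: StatusEnum = StatusEnum.ok   # the service itself defaults to healthy
```

A health endpoint can then report the service as up while still flagging a model that fails its inference check.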
ventu.service.create_app(infer, metric_registry, health_check, req_schema, resp_schema, use_msgpack, config)

    Create a falcon application.

    - Parameters
        - infer – model infer function (combines preprocess, inference, and postprocess)
        - metric_registry – Prometheus metric registry
        - health_check – model health check function (requires examples provided in the schema)
        - req_schema – request schema defined with pydantic.BaseModel
        - resp_schema – response schema defined with pydantic.BaseModel
        - use_msgpack (bool) – whether to use msgpack for serialization (default: JSON)
        - config – a ventu.config.Config instance
    - Returns
        - a falcon application