Ventu API¶
class ventu.ventu.Ventu(req_schema, resp_schema, use_msgpack=False, *args, **kwargs)[source]¶
Ventu: built for deep learning model serving.

Parameters
    req_schema – request schema defined with pydantic.BaseModel
    resp_schema – response schema defined with pydantic.BaseModel
    use_msgpack (bool) – use msgpack for serialization or not (default: JSON)
    args –
    kwargs –

To create a model service, inherit this class and implement:

    preprocess (optional)
    postprocess (optional)
    inference (for the standalone HTTP service)
    batch_inference (when working with the batching service)
property app¶
Falcon application with SpecTree validation.
batch_inference(batch)[source]¶
Run batch inference on the preprocessed data.

Parameters
    batch – a list of data after preprocess

Returns
    a list of inference results
health_check(batch=False)[source]¶
Health check for model inference (can also be used to warm up the model).

Parameters
    batch (bool) – batch inference or single inference (default)

Return bool
    True if the health check passed
inference(data)[source]¶
Run inference on the preprocessed data.

Parameters
    data – data after preprocess

Returns
    inference result
postprocess(data)[source]¶
Postprocess the inference result.

Parameters
    data – data after inference, or one item of the batch_inference result

Returns
    data as defined in resp_schema
preprocess(data)[source]¶
Preprocess the request data.

Parameters
    data – as defined in req_schema

Returns
    the input data of inference, or one item of the input data of batch_inference
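Taken together, the hooks form a simple pipeline: validated request data flows through preprocess, then inference (or batch_inference, item by item), then postprocess. A plain-Python sketch of that call order (the function bodies here are illustrative, not part of ventu):

```python
# Illustrative data flow only; ventu invokes the real hooks in this order.
def preprocess(data):
    # receives data validated against req_schema
    return data["num"]


def inference(x):
    # runs the model on the preprocessed value
    return x * x


def postprocess(result):
    # shapes the output as defined by resp_schema
    return {"square": result}


def handle(request):
    return postprocess(inference(preprocess(request)))


print(handle({"num": 4}))  # {'square': 16}
```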
run_http(host=None, port=None)[source]¶
Run the standalone HTTP service.

Parameters
    host (string) – host address
    port (int) – service port
run_tcp(host=None, port=None)[source]¶
Run as an inference worker connected to the batching service over TCP.

Parameters
    host (string) – host address
    port (int) – service port
run_unix(addr=None)[source]¶
Run as an inference worker connected to the batching service over a Unix domain socket.

Parameters
    addr (string) – socket file address
property sock¶
Socket used for communication with the batching service.
This is an instance of ventu.protocol.BatchProtocol.
Config¶
See pydantic.BaseSettings.
class ventu.config.Config(_env_file: Optional[Union[pathlib.Path, str]] = '<object object>', *, name: str = 'Deep Learning Service', version: str = 'latest', host: str = 'localhost', port: ventu.config.ConstrainedIntValue = 8000, socket: str = 'batching.socket')[source]¶
Default config; each field can be overridden with an environment variable beginning with ventu_.

Variables
    name – default service name shown in OpenAPI
    version – default service version shown in OpenAPI
    host – default host address for the HTTP service
    port – default port for the HTTP service
    socket – default socket file to communicate with the batching service
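For example, the defaults can be overridden from the environment before starting the service (the values shown are illustrative):

```shell
# Override Config fields via environment variables prefixed with ventu_
export ventu_name="sentiment-classifier"    # service name shown in OpenAPI
export ventu_port=8080                      # HTTP service port
export ventu_socket="batching.socket"       # batching service socket file
```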
Protocol¶

class ventu.protocol.BatchProtocol(infer, req_schema, resp_schema, use_msgpack)[source]¶
Protocol used to communicate with the batching service.

Parameters
    infer – model infer function (contains preprocess, batch_inference, and postprocess)
    req_schema – request schema defined with pydantic
    resp_schema – response schema defined with pydantic
    use_msgpack (bool) – use msgpack for serialization or not (default: JSON)
process(conn)[source]¶
Process batch queries and return the inference results.

Parameters
    conn – socket connection
HTTP service¶

class ventu.service.ServiceStatus(*, inference: ventu.service.StatusEnum, service: ventu.service.StatusEnum = <StatusEnum.ok: 'OK'>)[source]¶
Service health status.
ventu.service.create_app(infer, metric_registry, health_check, req_schema, resp_schema, use_msgpack, config)[source]¶
Create the falcon application.

Parameters
    infer – model infer function (contains preprocess, inference, and postprocess)
    metric_registry – Prometheus metric registry
    health_check – model health check function (requires examples provided in the schema)
    req_schema – request schema defined with pydantic.BaseModel
    resp_schema – response schema defined with pydantic.BaseModel
    use_msgpack (bool) – use msgpack for serialization or not (default: JSON)
    config – configs, ventu.config.Config

Returns
    a falcon application