Ventu API

class ventu.ventu.Ventu(req_schema, resp_schema, use_msgpack=False, *args, **kwargs)

    Ventu: built for deep learning model serving.

    - Parameters
        - req_schema – request schema defined with pydantic.BaseModel
        - resp_schema – response schema defined with pydantic.BaseModel
        - use_msgpack (bool) – whether to use msgpack for serialization (default: JSON)
        - args – additional positional arguments
        - kwargs – additional keyword arguments

    To create a model service, inherit this class and implement:

    - preprocess (optional)
    - postprocess (optional)
    - inference (for a standalone HTTP service)
    - batch_inference (when working with the batching service)
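The hook pipeline above can be sketched without the library itself. `VentuLike` and `EchoLength` below are hypothetical stdlib-only stand-ins (not part of ventu) showing how the hooks compose for single and batched requests:

```python
class VentuLike:
    """Hypothetical mock of the hook pipeline described above."""

    def preprocess(self, data):          # optional hook
        return data

    def postprocess(self, data):         # optional hook
        return data

    def inference(self, data):           # for a standalone HTTP service
        raise NotImplementedError

    def batch_inference(self, batch):    # for the batching service
        raise NotImplementedError

    def single(self, raw):
        # single request: preprocess -> inference -> postprocess
        return self.postprocess(self.inference(self.preprocess(raw)))

    def batched(self, raws):
        # batched requests: each item is preprocessed, the whole list
        # is inferred at once, and each result is postprocessed
        batch = [self.preprocess(r) for r in raws]
        return [self.postprocess(r) for r in self.batch_inference(batch)]


class EchoLength(VentuLike):
    """Toy model: returns the length of the input text."""

    def preprocess(self, data):
        return data.strip()

    def inference(self, data):
        return len(data)

    def batch_inference(self, batch):
        return [len(item) for item in batch]

    def postprocess(self, data):
        return {"length": data}
```

Note how `preprocess` and `postprocess` are shared between the single and batched paths, which is why the docs below describe their inputs as "one item of" the batch.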
    property app

        Falcon application with SpecTree validation.
    batch_inference(batch)

        Run batch inference on the preprocessed data.

        - Parameters
            - batch – a list of data items after preprocess
        - Returns
            - a list of inference results
    health_check(batch=False)

        Health check for model inference (can also be used to warm up the model).

        - Parameters
            - batch (bool) – check batch inference instead of single inference (default: single)
        - Returns (bool)
            - True if the health check passed
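The idea behind a health check like this can be illustrated with a toy version: push example data through the infer function and report whether it succeeds. This is a hypothetical sketch (the real method draws its example data from the schema definitions), not ventu's implementation:

```python
def toy_health_check(infer, example, batch=False):
    """Run one dummy query through the model; True if nothing raises."""
    try:
        if batch:
            infer([example])   # the batch path expects a list of items
        else:
            infer(example)
    except Exception:
        return False
    return True


# A model that works on numbers passes; one that divides by zero fails.
ok = toy_health_check(lambda x: x + 1, 41)
bad = toy_health_check(lambda x: 1 / x, 0)
```

Running the check at startup also warms the model, since the first real query no longer pays any lazy-initialization cost.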
    inference(data)

        Run inference on the preprocessed data.

        - Parameters
            - data – data after preprocess
        - Returns
            - the inference result
    postprocess(data)

        Postprocess the inference result.

        - Parameters
            - data – data after inference, or one item of the batch_inference results
        - Returns
            - data as defined in resp_schema
    preprocess(data)

        Preprocess the request data.

        - Parameters
            - data – data as defined in req_schema
        - Returns
            - the input data of inference, or one item of the input data of batch_inference
    run_http(host=None, port=None)

        Run the HTTP service.

        - Parameters
            - host (str) – host address
            - port (int) – service port
    run_tcp(host=None, port=None)

        Run as an inference worker over TCP.

        - Parameters
            - host (str) – host address
            - port (int) – service port
    run_unix(addr=None)

        Run as an inference worker over a Unix domain socket.

        - Parameters
            - addr (str) – socket file address
    property sock

        Socket used for communication with the batching service; an instance of ventu.protocol.BatchProtocol.
Config

See pydantic.BaseSettings.

class ventu.config.Config(_env_file=<sentinel>, *, name: str = 'Deep Learning Service', version: str = 'latest', host: str = 'localhost', port: ventu.config.ConstrainedIntValue = 8000, socket: str = 'batching.socket')

    Default config; values can be overridden with environment variables beginning with ventu_.

    - Variables
        - name – default service name shown in OpenAPI
        - version – default service version shown in OpenAPI
        - host – default host address for the HTTP service
        - port – default port for the HTTP service
        - socket – default socket file used to communicate with the batching service
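Since Config builds on pydantic.BaseSettings, each field can be overridden by an environment variable carrying the ventu_ prefix. The stdlib-only `load_config` below is a hypothetical sketch of that lookup rule, not ventu's code:

```python
import os

# Defaults mirroring the Config signature above.
DEFAULTS = {
    "name": "Deep Learning Service",
    "version": "latest",
    "host": "localhost",
    "port": 8000,
    "socket": "batching.socket",
}


def load_config(environ=None):
    """Resolve each field: env var "ventu_<field>" beats the default."""
    environ = os.environ if environ is None else environ
    config = {}
    for field, default in DEFAULTS.items():
        raw = environ.get(f"ventu_{field}")
        if raw is None:
            config[field] = default
        else:
            # coerce the string env value to the default's type,
            # roughly as pydantic would
            config[field] = type(default)(raw)
    return config
```

For example, exporting `ventu_port=9000` before starting the service changes the HTTP port without touching code.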
Protocol

class ventu.protocol.BatchProtocol(infer, req_schema, resp_schema, use_msgpack)

    Protocol used to communicate with the batching service.

    - Parameters
        - infer – model infer function (combines preprocess, batch_inference, and postprocess)
        - req_schema – request schema defined with pydantic
        - resp_schema – response schema defined with pydantic
        - use_msgpack (bool) – whether to use msgpack for serialization (default: JSON)

    process(conn)

        Process batch queries and return the inference results.

        - Parameters
            - conn – socket connection
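The docs above do not pin down the wire format, so the sketch below assumes a simple length-prefixed JSON framing to show what a `process(conn)`-style round trip does: read a batch of queries from the socket, run infer, write the results back. All names here are hypothetical, not ventu.protocol internals:

```python
import json
import socket
import struct


def recv_exact(conn, n):
    """Read exactly n bytes from the socket."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed")
        buf += chunk
    return buf


def recv_frame(conn):
    # 4-byte big-endian length header, then a JSON payload
    size, = struct.unpack(">I", recv_exact(conn, 4))
    return json.loads(recv_exact(conn, size))


def send_frame(conn, obj):
    payload = json.dumps(obj).encode()
    conn.sendall(struct.pack(">I", len(payload)) + payload)


def process(conn, infer):
    """Handle one round trip: read a batch, infer, send results back."""
    send_frame(conn, infer(recv_frame(conn)))


# Demo over an in-process socket pair standing in for the batching service:
service, worker = socket.socketpair()
send_frame(service, [1, 2, 3])                          # service sends a batch
process(worker, lambda batch: [x * 2 for x in batch])   # worker runs inference
results = recv_frame(service)                           # service reads results
```

A worker in this scheme loops on `process(conn, infer)`, which is why the same `infer` callable (preprocess + batch_inference + postprocess) is handed to the protocol at construction time.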
HTTP service

class ventu.service.ServiceStatus(*, inference: ventu.service.StatusEnum, service: ventu.service.StatusEnum = StatusEnum.ok)

    Service health status.
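From the signature above, the status model pairs an inference status with a service status that defaults to healthy (`StatusEnum.ok`, whose value is `'OK'`). A stdlib sketch of that shape; the `error` member is an assumption, since only `ok` appears in the signature:

```python
from dataclasses import dataclass
from enum import Enum


class StatusEnum(Enum):
    ok = "OK"
    error = "ERROR"   # assumed member; only `ok` appears in the signature above


@dataclass
class ServiceStatus:
    inference: StatusEnum
    service: StatusEnum = StatusEnum.ok   # the service itself defaults to healthy
```

A health endpoint can then report the service as up while still flagging a model that fails its inference check.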
ventu.service.create_app(infer, metric_registry, health_check, req_schema, resp_schema, use_msgpack, config)

    Create a falcon application.

    - Parameters
        - infer – model infer function (combines preprocess, inference, and postprocess)
        - metric_registry – Prometheus metric registry
        - health_check – model health check function (requires examples provided in the schema)
        - req_schema – request schema defined with pydantic.BaseModel
        - resp_schema – response schema defined with pydantic.BaseModel
        - use_msgpack (bool) – whether to use msgpack for serialization (default: JSON)
        - config – a ventu.config.Config instance
    - Returns
        - a falcon application