Model Servers¶
See the model servers guide for additional information regarding model servers.
class SpellClient.servers.ModelServer¶
Object representing a Spell model server.
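
For orientation, a minimal sketch of obtaining a ModelServer handle. The from_environment() constructor follows the standard SpellClient setup; the servers.get() accessor is an assumption for illustration only.

```python
import spell.client

# Construct a client from the local Spell credentials.
client = spell.client.from_environment()

# Fetch a handle to an existing model server. The get("my-server")
# accessor is an assumption for illustration; the ModelServer object
# it returns is what the reference below documents.
server = client.servers.get("my-server")

print(server.created_at, server.git_commit_hash)
```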

created_at¶
Model server creation timestamp.
- Type: datetime.datetime

updated_at¶
Timestamp of the last action performed on this server.
- Type: datetime.datetime

cluster¶
Model serving cluster configuration details, such as provider, region, subnet, and cloud provider credentials.
- Type: dict

model_versions¶
A list of ModelVersion objects containing information on the models being served. See the corresponding docs for more information.
- Type: list of ModelVersion

git_commit_hash¶
Commit hash fingerprinting the version of the code this server is running.
- Type: str

pods¶
The current and historic Kubernetes pods that are serving or have served this model server.
- Type: list of ModelServerPod

creator¶
The Spell user who originally created this model server.
- Type: User

resource_requirements¶
The resource requirements and limits currently set for this model server. To learn more, refer to the model server documentation.
- Type: ContainerResourceRequirements

pod_autoscale_config¶
A mapping of server performance configuration values: min_pods, max_pods, target_cpu_utilization, and target_requests_per_second.
- Type: PodAutoscaleConfig

additional_resources¶
Additional files (besides the model) attached to this model server.
- Type: list of Resource

batching_config¶
Batching configuration details. Refer to the corresponding section of the docs for more information.
- Type: BatchingConfig

environment¶
A mapping of additional pip and apt dependencies installed onto this model server.
- Type: Environment

healthcheck(**kwargs)¶
Query the model server HTTPS health check endpoint.
- Parameters:
  - **kwargs – additional keyword arguments passed through to requests.get. For example, the timeout parameter may be helpful in case a server request hangs.
- Returns: the server response. Use the ok field, the status_code field, or the raise_for_status method to verify server health.
- Return type: requests.Response
- Raises: requests.exceptions.RequestException – an error occurred.
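
For example, a minimal sketch (assuming server is a ModelServer handle obtained as shown under the class entry above):

```python
import requests

# Probe the health check endpoint; `timeout` is forwarded to
# requests.get so a hung server fails fast instead of blocking.
try:
    response = server.healthcheck(timeout=5)
    response.raise_for_status()  # raises on a 4xx/5xx status code
    print("server healthy:", response.status_code)
except requests.exceptions.RequestException as err:
    print("health check failed:", err)
```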

logs(pod, follow=False, offset=None)¶
Get log entries for a model server.
- Parameters:
  - pod (int) – the ID of the pod to get logs for. For a list of pods for this model server (and their associated IDs), refer to the pods attribute.
  - follow (bool, optional) – follow the log lines until the server reaches a final status (default: False).
  - offset (int, optional) – which log line to start from. If set, this value must be a positive integer (default: None).
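
A brief sketch; the .id field on ModelServerPod and the iterability of the return value are assumptions, since neither is spelled out above:

```python
# Follow the logs of the most recent pod until the server reaches a
# final status. The `.id` field name is assumed for illustration.
latest_pod = server.pods[-1]
for entry in server.logs(latest_pod.id, follow=True):
    print(entry)
```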

metrics(metric_name, follow=False, start=None)¶
Get a server metric. Metrics are sorted by tag.
- Parameters:
  - metric_name (str) – the name of the metric being fetched.
- Raises: ClientException – an error occurred.
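
As a sketch only; the metric name used here is illustrative, and the shape of the return value is not documented above:

```python
# Fetch a named server metric. "cpu_utilization" is purely
# illustrative; the available names depend on your deployment.
cpu_metric = server.metrics("cpu_utilization")
```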

predict(payload, **kwargs)¶
Query the model server HTTPS endpoint.
- Parameters:
  - payload (dict) – a JSON-serializable dictionary containing query parameters understood by your model server.
  - **kwargs – additional keyword arguments passed through to requests.post.
- Returns: the server response.
- Return type: requests.Response
- Raises: requests.exceptions.RequestException – an error occurred.
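
For example (the payload keys are whatever your Predictor expects; "input" is illustrative only):

```python
# Send a JSON payload to the serving endpoint; extra keyword
# arguments (here, timeout) flow through to requests.post.
response = server.predict({"input": [1.0, 2.0, 3.0]}, timeout=30)
response.raise_for_status()
print(response.json())
```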

refresh()¶
Refresh the model server state.
Refreshes all of the server attributes with the latest information for the server from Spell.
- Raises: ClientException – an error occurred.

start()¶
Starts the model server.
- Raises: ClientException – an error occurred.

stop()¶
Stops the model server.
- Raises: ClientException – an error occurred.

update(**kwargs)¶
Updates the model server.
- Parameters:
  - models (list of str, optional) – targeted models; should be in MODEL:VERSION format.
  - entrypoint (str, optional) – path to the file to be used as the model server entrypoint, e.g. serve.py or similar.
  - github_url (str, optional) – a GitHub URL to a repository for code to include in the server.
  - github_ref (str, optional) – a reference to a commit, branch, or tag in the repository corresponding to the github_url for code to include in the run (default: master).
  - commit_ref (str, optional) – git commit hash to use (default: HEAD).
  - node_group (str, optional) – name of the node group to serve from. Defaults to the default node group.
  - classname (str, optional) – name of the Predictor class. Only required if more than one predictor exists in the entrypoint.
  - pip_packages (list of str, optional) – pip dependencies (default: None). For example: ["moviepy", "scikit-image"].
  - requirements_file (str, optional) – a path to a requirements file.
  - conda_file (str, optional) – a path to a conda environment file.
  - apt_packages (list of str, optional) – apt dependencies (default: None). For example: ["python-tk", "ffmpeg"].
  - envvars (dict of str -> str, optional) – name-to-value mapping of environment variables for the server (default: None).
  - attached_resources (dict of str -> str, optional) – resource name to mountpoint mapping of attached resources for the run (default: None). For example: {"runs/42": "/mnt/data"}.
  - resource_requirements (dict of str -> str, optional) – configuration mapping for node resource requirements: cpu_limit, cpu_request, ram_limit, ram_request, gpu_limit. Has sane default values.
  - num_processes (int) – the number of processes to run the model server on. By default this is (2 * numberOfCores) + 1, or equal to the number of available GPUs if applicable.
  - pod_autoscale_config (dict of str -> str, optional) – configuration mapping for pod autoscaling: min_pods, max_pods, target_cpu_utilization, target_requests_per_second. Has sane default values.
  - enable_batching (bool, optional) – whether or not to enable model server batching. Defaults to False.
  - batching_config (dict of str -> str, optional) – if model server batching is enabled, the values passed to this parameter are used to configure it: max_batch_size, request_timeout. If left empty, the default batching parameter values are used.
  - debug (bool, optional) – launches the model server in debug mode. Should not be used in production.
- Raises: ClientException – an error occurred.
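
A sketch of a typical call, assuming server is a ModelServer handle. The resource values are illustrative (Kubernetes-style quantity strings are assumed), and the string-valued mappings follow the documented dict of str -> str parameter types:

```python
# Update dependencies, resource requests, and autoscaling bounds in
# one call; every keyword argument is optional.
server.update(
    pip_packages=["moviepy", "scikit-image"],
    resource_requirements={"cpu_request": "2", "ram_request": "4Gi"},
    pod_autoscale_config={"min_pods": "1", "max_pods": "5"},
)
```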

wait_status(*statuses)¶
Wait until the model server achieves one of the given statuses and then return.
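
Tying the lifecycle methods together, a minimal sketch; the status strings passed to wait_status are assumptions, since the valid statuses are not enumerated in this reference:

```python
# Restart the server and block until it is serving again. The
# "stopped" and "running" status names are assumed for illustration.
server.stop()
server.wait_status("stopped")

server.start()
server.wait_status("running")

server.refresh()  # re-sync local attributes with Spell
print(server.updated_at)
```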