Model Servers

See the model servers documentation for additional information regarding model servers.

SpellClient.servers

class ModelServersService(client)

A class for managing Spell model servers.

list()

Lists model servers.

Returns

A list of ModelServer objects.

Raises

ClientException – an error occurred.
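For illustration, a minimal sketch of listing servers through the service. It assumes a SpellClient obtained from spell.client.from_environment(); the client factory shown here is an assumption and may differ in your setup:

    import spell.client

    # Assumption: from_environment() builds a client from the local Spell configuration.
    client = spell.client.from_environment()

    # client.servers is the ModelServersService documented above.
    for server in client.servers.list():
        print(server.server_name, server.status)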

get(name)

Get a model server.

Parameters

name (str) – model server name

Returns

A ModelServer.

Raises

ClientException – an error occurred.
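A short usage sketch, reusing the client from the sketch above; the server name is a placeholder:

    # Raises ClientException if no server with this name exists.
    server = client.servers.get("sentiment-server")   # placeholder name
    print(server.url, server.created_at)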

rm(name)

Remove a model server.

Parameters

name (str) – Model server name

Raises

ClientException – an error occurred.
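A one-line sketch, again with a placeholder name:

    # Remove the server once it is no longer needed.
    client.servers.rm("sentiment-server")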

serve(model, entrypoint, github_url, **kwargs)

Create a new model server using a model.

Parameters
  • model (str) – Targeted model, should be in MODEL:VERSION format

  • entrypoint (str) – Path to the file to be used as the model server entrypoint, e.g. serve.py or similar.

  • github_url (str) – a GitHub URL to a repository for code to include in the server.

  • github_ref (str, optional) – a reference to a commit, branch, or tag in the repository corresponding to github_url for code to include in the run (default: master).

  • commit_ref (str, optional) – git commit hash to use (default: HEAD).

  • name (str, optional) – Name of the model server. Defaults to the model name.

  • node_group (str, optional) – Name of the node group to serve from. Defaults to the default node group.

  • classname (str, optional) – Name of the Predictor class. Only required if more than one predictor exists in the entrypoint.

  • pip_packages (list of str, optional) – pip dependencies (default: None). For example: ["moviepy", "scikit-image"].

  • apt_packages (list of str, optional) – apt dependencies (default: None). For example: ["python-tk", "ffmpeg"]

  • requirements_file (str, optional) – a path to a pip requirements file.

  • envvars (dict of str -> str, optional) – name to value mapping of environment variables for the server (default: None).

  • attached_resources (dict of str -> str, optional) – resource name to mountpoint mapping of attached resources for the run (default: None). For example: {"runs/42" : "/mnt/data"}

  • resource_requirements (dict of str -> str, optional) – configuration mapping for node resource requirements: cpu_limit, cpu_request, ram_limit, ram_request, gpu_limit. Has sane default values.

  • num_processes (int) – The number of processes to run the model server on. By default this is (2 * numberOfCores) + 1, or equal to the available GPUs if applicable.

  • pod_autoscale_config (dict of str -> str, optional) – configuration mapping for pod autoscaling: min_pods, max_pods, target_cpu_utilization, target_requests_per_second. Has sane default values.

  • enable_batching (bool, optional) – Whether or not to enable model server batching. Defaults to False.

  • batching_config (dict of str -> int, optional) – If model server batching is enabled, the values passed to this parameter are used to configure it. If left empty, the default batching parameter values will be used. Has two keys: max_batch_size and request_timeout.

  • description (str, optional) – Model server description, defaults to None.

  • debug (bool, optional) – Launches the model server in debug mode. Should not be used in production.

Raises

ClientException – an error occurred.
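A hedged sketch of launching a server; the model, repository URL, and entrypoint are placeholders, and only a few of the keyword arguments above are shown:

    client.servers.serve(
        model="sentiment:v3",                                   # placeholder MODEL:VERSION
        entrypoint="serve.py",
        github_url="https://github.com/example/serving-code",  # placeholder repository
        github_ref="main",
        name="sentiment-server",
        pip_packages=["scikit-image"],
        envvars={"LOG_LEVEL": "info"},
        attached_resources={"runs/42": "/mnt/data"},
    )

    # The new server can then be fetched by name.
    server = client.servers.get("sentiment-server")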

ModelServer

class ModelServer

Object representing a Spell model server.

id

Model server id

Type

int

server_name

Model server name

Type

str

status

Model server status (e.g. Running, Stopped)

Type

str

url

Model server endpoint URL

Type

str

created_at

Model server creation timestamp

Type

datetime.datetime

updated_at

Timestamp for the last time an action was performed on this server.

Type

datetime.datetime

cluster

Model serving cluster configuration details such as provider, region, subnet, and cloud provider credentials.

Type

dict

model_version

A ModelVersion object containing information on the model being served. See the corresponding docs for more information.

Type

ModelVersion

entrypoint

The model server entrypoint (e.g. serve.py).

Type

str

workspace

Details describing the git repository the model server was launched from.

Type

dict

git_commit_hash

Commit hash fingerprinting the version of the code this server is running.

Type

str

pods

The current and historic Kubernetes pods that have served or are serving this model server.

Type

list of ModelServerPod

creator

The Spell user who created this model server initially.

Type

User

resource_requirements

The resource requirements and limits currently set for this model server. To learn more refer to the model server documentation.

Type

ContainerResourceRequirements

pod_autoscale_config

A mapping of server performance configuration values: min_pods, max_pods, target_cpu_utilization, target_requests_per_second.

Type

PodAutoscaleConfig

additional_resources

Additional files (besides the model) attached to this model server.

Type

list of Resource

batching_config

Batching configuration details. Refer to the corresponding section of the docs for more information.

Type

BatchingConfig

environment

A mapping of additional pip and apt dependencies installed onto this model server.

Type

Environment
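A sketch of inspecting these attributes on a fetched server; the server name is a placeholder, and pod entries are printed as-is since their shape is not documented above:

    server = client.servers.get("sentiment-server")   # placeholder name
    print(server.id, server.status, server.url)
    print("model:", server.model_version)
    print("entrypoint:", server.entrypoint)
    print("commit:", server.git_commit_hash)
    for pod in server.pods:
        print("pod:", pod)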

healthcheck(**kwargs)

Query the model server HTTPS health check endpoint.

Parameters
  • **kwargs – additional keyword arguments to be passed to requests.get. For example, the timeout parameter may be helpful in case the server request hangs.

Returns

the server response. Use the ok field, the status_code field, or the raise_for_status method to verify server health.

Return type

requests.Response

Raises

requests.exceptions.RequestException – an error occurred.
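A minimal sketch; the timeout keyword is simply forwarded to requests.get:

    response = server.healthcheck(timeout=5)   # timeout is forwarded to requests.get
    response.raise_for_status()                # raises if the health check failed
    print("healthy:", response.ok)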

logs(pod, follow=False, offset=None)

Get log entries for a model server.

Parameters
  • pod (int) – the ID of the pod to get logs for. For a list of pods for this model server (and their associated IDs) refer to the attribute pods.

  • follow (bool, optional) – follow the log lines until the server reaches a final status (default: False).

  • offset (int, optional) – which log line to start from. Note that this value, if set, must be a positive integer value (default: None).
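A sketch of tailing logs from the first pod. It assumes ModelServerPod exposes an id field and that logs() yields log entries; both are assumptions, since neither is documented above:

    # Assumption: ModelServerPod exposes an `id` field and logs() yields entries.
    pod_id = server.pods[0].id
    for entry in server.logs(pod_id, follow=True):
        print(entry)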

metrics(metric_name, follow=False, start=None)

Get a server metric. Metrics are sorted by tag.

Parameters

metric_name (str) – the name of the metric being fetched.

Raises

ClientException – an error occurred.
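A sketch with a placeholder metric name; the return shape is not documented above, so the result is printed as-is:

    # Placeholder metric name; the return shape is not documented above.
    metrics = server.metrics("request_latency_ms", follow=False)
    print(metrics)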

predict(payload, **kwargs)

Query the model server HTTPS endpoint.

Parameters
  • payload (dict) – a JSON serializable dictionary containing query parameters understood by your model server.

  • **kwargs – additional keyword arguments to be passed to requests.post.

Returns

the server response.

Return type

requests.Response

Raises

requests.exceptions.RequestException – an error occurred.
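A minimal sketch; the payload keys are placeholders your own predictor would define, and the timeout keyword is forwarded to requests.post:

    # Placeholder payload keys; timeout is forwarded to requests.post.
    response = server.predict({"text": "the weather is lovely"}, timeout=30)
    response.raise_for_status()
    print(response.json())   # assumes the predictor returns JSON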

refresh()

Refresh the model server state.

Refresh all of the server attributes with the latest information for the server from Spell.

Raises

ClientException – an error occurred.

start()

Starts the model server.

Raises

ClientException – an error occurred.

stop()

Stops the model server.

Raises

ClientException – an error occurred.
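A sketch of cycling a server and polling its state with refresh():

    server.stop()
    server.refresh()
    print(server.status)   # e.g. "Stopping" or "Stopped"

    server.start()
    server.refresh()
    print(server.status)   # e.g. "Starting" or "Running"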

update(**kwargs)

Updates the model server.

Parameters
  • model (str, optional) – Targeted model, should be in MODEL:VERSION format

  • entrypoint (str, optional) – Path to the file to be used as the model server entrypoint, e.g. serve.py or similar.

  • github_url (str, optional) – a GitHub URL to a repository for code to include in the server.

  • github_ref (str, optional) – a reference to a commit, branch, or tag in the repository corresponding to the github_url for code to include in the run (default: master).

  • commit_ref (str, optional) – git commit hash to use (default: HEAD).

  • node_group (str, optional) – Name of the node group to serve from. Defaults to the default node group.

  • classname (str, optional) – Name of the Predictor class. Only required if more than one predictor exists in the entrypoint.

  • pip_packages (list of str, optional) – pip dependencies (default: None). For example: ["moviepy", "scikit-image"].

  • apt_packages (list of str, optional) – apt dependencies (default: None). For example: ["python-tk", "ffmpeg"]

  • requirements_file (str, optional) – a path to a pip requirements file.

  • envvars (dict of str -> str, optional) – name to value mapping of environment variables for the server (default: None).

  • attached_resources (dict of str -> str, optional) – resource name to mountpoint mapping of attached resources for the run (default: None). For example: {"runs/42" : "/mnt/data"}

  • resource_requirements (dict of str -> str, optional) – Configuration mapping for node resource requirements: cpu_limit, cpu_request, ram_limit, ram_request, gpu_limit. Has sane default values.

  • num_processes (int) – The number of processes to run the model server on. By default this is (2 * numberOfCores) + 1 or equal to the available GPUs if applicable.

  • pod_autoscale_config (dict of str -> str, optional) – configuration mapping for pod autoscaling: min_pods, max_pods, target_cpu_utilization, target_requests_per_second. Has sane default values.

  • enable_batching (bool, optional) – Whether or not to enable model server batching. Defaults to False.

  • batching_config (dict of str -> int, optional) – If model server batching is enabled, the values passed to this parameter are used to configure it. If left empty, the default batching parameter values will be used. Has two keys: max_batch_size and request_timeout.

  • debug (bool, optional) – Launches the model server in debug mode. Should not be used in production.

Raises

ClientException – an error occurred.
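A hedged sketch of updating a running server; the model version and resource values are placeholders, and the string quantities assume Kubernetes-style values, which is an assumption:

    server.update(
        model="sentiment:v4",   # placeholder MODEL:VERSION
        resource_requirements={"cpu_request": "2", "ram_request": "4Gi"},
    )
    server.refresh()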