Quickstart

Spell is a platform for training and deploying machine learning models quickly and easily. This quickstart guide walks through training your first machine learning model using the Spell CLI. You can follow along with either the written or the video tutorial below; both use our sample CIFAR10 training code. Note that the video covers Spell for Teams features that may not be available to Community users.

Logging in

Before you start, make sure that you have Python and Git installed. If you haven't registered yet, create a free account now on our registration page. Then install the Spell Python package using pip:

$ pip install spell

Once you have the Spell CLI installed, verify that everything is working as expected by running spell --help. This should output a helpful list of subcommands.

Before you can do anything useful, you first need to log in:

$ spell login

This will open the Spell Web Console in a web browser. You will be asked if you want to authorize the local application to have access to your Spell account. If your login is successful you will see the following greeting:

Hello <username>!

Alternatively, you may specify your Spell username and password with command line flags:

$ spell login --identity <username> --password <password>

If you receive an error message and are sure you are using the correct password and username, please contact us at support@spell.ml.

You can check your current login status at any time using:

$ spell whoami

Your first run

Runs are one of the foundations of Spell. Creating a run is how you execute your code on Spell's computing infrastructure, and spell run is likely the command you'll use most.

Each run in Spell is an instance of a single computational job executed on our infrastructure. Runs are typically executed from inside a Git repository. Executing a run will:

  1. Sync the contents of the repository with Spell.
  2. Spin up a machine (or set of machines!) on the cloud, and execute your job on those machine(s).
  3. Save any file outputs from those jobs to our filesystem, SpellFS, for later access.

To execute a run, use spell run. A simple command you can run on your own computer is echo "hello world", which prints hello world to the screen. To run this on Spell:

$ spell run "echo hello world"

This outputs:

✨ Casting spell #1…
✨ Stop viewing logs with ^C
...
✨ Run is running
hello world

The run workflow

To dig a bit deeper into runs, let's try training a convolutional neural network (CNN) on the CIFAR10 dataset using Spell. We will use a simple training script from our spellml/cnn-cifar10 repository on GitHub (this example uses PyTorch, but TensorFlow 2 works just as well).

Clone the repository and cd into it in your terminal. Then run the following command:

$ spell run --machine-type t4 \
    python models/train_basic.py

This will launch a model training job on an NVIDIA T4 instance in the cloud (if you do not have access to a T4, try --machine-type cpu instead). You can track the progress of this run in the CLI or on the run details page in the web console:

Run summary

The run details page includes all of the information you need to reproduce the run: the command that was run, the code repository that was used and its exact commit hash, any pip, apt, or conda-file dependencies you installed, and so on. All of the run logs are saved here too.

Spell also records metrics for your runs. It automatically saves and displays hardware metrics like CPU and GPU utilization. This demo script logs an additional metric, train_loss, using the send_metric function in Spell's Python API; this too is saved to and displayed on the run details page:

Run metrics
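As a rough illustration of how a training script might log a custom metric, here is a minimal sketch. It assumes send_metric can be imported from spell.metrics and called with a metric name and a value; check the Python API reference for the exact signature. The train function and its loss values are hypothetical placeholders, and the fallback stub exists only so the sketch also runs on a machine without the spell package.

```python
try:
    # Assumed import path for Spell's metrics helper.
    from spell.metrics import send_metric
except ImportError:
    # Fallback stub so the sketch also runs where the spell package is absent.
    def send_metric(name, value):
        print(f"metric {name}={value}")


def train(num_steps=5):
    """Stand-in training loop that reports a made-up, decreasing loss."""
    losses = []
    for step in range(num_steps):
        loss = 1.0 / (step + 1)  # placeholder for a real loss value
        send_metric("train_loss", loss)  # one data point per step
        losses.append(loss)
    return losses
```

In a real script the send_metric call would sit inside the existing training loop, next to wherever the loss is already computed.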

Any files the run writes to its current working directory will appear here too, as will run logs:

Run outputs and logs

Congratulations, you've now trained your first machine learning model on Spell!
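If you would rather launch runs from a script than type them by hand, one option is to wrap the CLI with Python's subprocess module. The sketch below is an illustration only and is not part of Spell: build_spell_run_args and launch are hypothetical helper names, and the only CLI pieces used are the spell run command and the --machine-type flag shown above.

```python
import shlex
import shutil
import subprocess


def build_spell_run_args(command, machine_type=None):
    """Return the argv list for a `spell run` invocation (hypothetical helper)."""
    args = ["spell", "run"]
    if machine_type is not None:
        args += ["--machine-type", machine_type]
    # Split the command the way a shell would, one word per argument.
    args += shlex.split(command)
    return args


def launch(command, machine_type=None):
    """Invoke the Spell CLI if it is installed; otherwise print the command."""
    args = build_spell_run_args(command, machine_type)
    if shutil.which("spell") is None:
        print("spell CLI not found; would run:", shlex.join(args))
        return None
    # check=True raises CalledProcessError if the CLI exits with an error.
    return subprocess.run(args, check=True)
```

For example, launch("python models/train_basic.py", machine_type="t4") mirrors the training command used in this section.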

Using workspaces

The next major feature we will cover is workspaces. Workspaces are JupyterLab instances running in the cloud, designed to replicate your local machine learning development environment. Because they run in the cloud, they are easier to replicate, scale, and share.

You can launch a workspace from the web console. You'll be asked for a name and (optionally) a git repository to initialize the workspace files from. For the purposes of this demo, let's reuse the spellml/cnn-cifar10 repo:

One of the workspace creation screens

Next, set your environment variables, machine type, and any additional apt, pip, and/or conda-file dependencies, and choose between JupyterLab and Jupyter Notebook. After that you can optionally mount any resources you need.

Once you've confirmed your settings, the workspace will be created, the page will refresh, and you can start coding.

Workspace landing page

You can start, stop, restart, or clone a workspace at any time to pick up right where you left off.

Using model servers

Note

This feature is only available to users on Spell for Teams.

Once you've trained your model using Spell runs and/or workspaces, the next step is deploying it. For that, you can use Spell model servers.

Model servers on Spell are Kubernetes-based serving clusters that you spin up and we maintain for you. To demonstrate how they work, let's deploy a simple serving script that wraps the model we just finished training.

The first step is creating a model. A model is a group of resources, sourced from a run or an upload, that encapsulates all of the model weights and configuration files our serving script will need:

$ spell model create cnn-cifar10 runs/$RUN_ID \
  --file checkpoints/model_final.pth

Replace $RUN_ID with the ID number of the run we just finished executing. You can view a list of all of the models you've created by visiting the models page in the web console:

Models list page

To serve the model, we combine this model artifact with a model serving script. A model serving script is a simple *.py file with the following basic format:

from spell.serving import BasePredictor

class Predictor(BasePredictor):
    def __init__(self):
        pass

    def predict(self, payload):
        pass
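To make the skeleton more concrete, here is a minimal sketch of a filled-in predictor. It assumes that predict receives the parsed JSON request body as a dict, and that the body has the "image" and "format" keys used by the test request later in this section. The returned fields are hypothetical and stand in for real model output; this is not the serve.py from the example repository. The fallback BasePredictor exists only so the sketch can be exercised without the spell package.

```python
import base64

try:
    from spell.serving import BasePredictor
except ImportError:
    # Fallback so the sketch can be exercised without the spell package.
    class BasePredictor:
        pass


class Predictor(BasePredictor):
    def __init__(self):
        # A real predictor would load model weights here, once, at startup.
        self.model = None

    def predict(self, payload):
        # Decode the base64 string back into raw image bytes.
        image_bytes = base64.b64decode(payload["image"])
        # A real predictor would run the model on the decoded image here.
        return {"format": payload.get("format"), "num_bytes": len(image_bytes)}
```

Doing the expensive setup in __init__ and keeping predict limited to per-request work means the weights are loaded once rather than on every call.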

To serve our example script, run the following CLI command (make sure you are in the root of the spellml/cnn-cifar10 git repository on your local machine first):

$ spell server serve \
  --node-group default \
  --pip pillow \
  cnn-cifar10:v1 server/serve.py

Executing this command will automatically take you to the details page for this model server in the web console:

Model server details page

Once the model server is ready to serve traffic, you can test it out for yourself. Grab the URL on the model server summary page—this is the endpoint we need to hit—and a picture of one of the ten classes in the CIFAR10 dataset. I used this picture of a coworker's cat:

Noah's cat

Then try running the following Python code:

from PIL import Image
from io import BytesIO
import requests
import base64

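img = Image.open("cat.jpg")
# convert() returns a new image rather than modifying in place, so reassign it
img = img.convert("RGB")
buf = BytesIO()
img.save(buf, format="JPEG")
# base64-encode the JPEG bytes so they can travel inside a JSON payload
img_str = base64.b64encode(buf.getvalue())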

resp = requests.post(
  "https://$SPELL_ORG.spell.services/$SPELL_ORG/$SERVER_NAME/predict",
  headers={"Content-Type": "application/json"},
  json={
    "image": img_str.decode("utf8"),
    "format": "JPEG"
})

print(resp.json())

Replace the placeholder URL with the one you copied from the model server summary page. This code base64-encodes the image bytes, wraps them in the JSON payload this model server expects, sends the request, and prints the JSON response:

{'class': 'Cat'}

Next steps

That concludes this Quickstart!

This brief tour covers the three most important features in Spell: runs, workspaces, and model servers.

For a brief tour of the rest of Spell's core features, check out Core Concepts. Alternatively, check the user guide in the sidebar to learn more about specific features that may be valuable.

Though this quickstart is focused on the Spell CLI, everything that we did here can be done using the Spell Python client as well. To learn more about using Spell's Python API, check out the Python quickstart in our spellml/examples repository on GitHub.