Bucket Management

Users on the Spell for Teams plan deploy Spell into their cloud account. Once you have created your cluster, you can directly mount data from your AWS S3, GCP GCS, or Azure Blob Storage buckets into your runs. This document explains how.

How resources are managed

When you set up a Spell cluster, the setup script creates a new bucket that is used exclusively for run and Jupyter workspace outputs. This is the only bucket that Spell will ever write to directly. This backing bucket's name, chosen at cluster create time, is visible on the cluster details page.

Files saved to the runs/, uploads/, and workspace_exports/ paths in SpellFS are served from this bucket. To learn more, refer to the "Resources" page in the docs.

All other data that your organization will use needs to be attached to the cluster. The rest of this page explains how.

Attaching a private bucket to a cluster

If a bucket is public, you can mount resources from that bucket into your runs directly. See the section "Mounting Resources" of the Resources guide for more information.

If a bucket is private, you must first add it to SpellFS by running spell cluster add-bucket:

$ spell cluster add-bucket
---------------------------------------------
All of this will be done within your project 'spell-external' - continue? [Y/n]: y
This command will
    - List your buckets to generate an options menu of buckets that can be added to Spell
    - Add list and read permissions for that bucket to the service account associated with the cluster
    - Ensure that the interoperable S3 access credentials associated with the cluster are able to read this bucket.
...
Please choose a bucket: spell-my-cluster
Bucket spell-my-cluster has been added to cluster 1!

Now, when you run spell ls s3/ (AWS cluster), spell ls gs/ (GCP cluster), or spell ls azblob/ (Azure cluster), you will see the new bucket:

$ spell ls gs/
-        -              spell-my-cluster

After doing this, you can mount objects from the private bucket the same way you would mount any other resource, using the mount flag:

$ spell run --mount gs/spell-my-cluster/file:file 'cat file'

(Advanced) Attaching a cross-account private bucket to a cluster

In certain cases, the bucket you want to give Spell runs access to is located outside of the cloud provider account you created your Spell cluster in.

To attach such a bucket, first create a bucket policy that grants the Spell IAM role (see the section Role ARN on the cluster details page) read access, and attach that policy to the bucket. Then run the following command:

$ spell cluster add-bucket --bucket $BUCKET_NAME --cross-account

Replace $BUCKET_NAME with the name of the bucket you are granting Spell access to. If all goes well, this will attach the bucket to the cluster and you can proceed from there.
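
As a rough illustration of the bucket policy step described above, the following sketch writes a read-only S3 bucket policy to a local file. It is not taken from the Spell documentation: the bucket name, the role ARN, the file name spell-read-policy.json, and the choice of the s3:ListBucket and s3:GetObject actions are all assumptions made for the example. The exact permissions Spell needs may differ, so treat the blog post referenced on this page as the authoritative source.

```shell
#!/bin/sh
# Hypothetical placeholders: substitute the real bucket name and the
# Role ARN shown on your cluster details page.
BUCKET_NAME="${BUCKET_NAME:-my-data-bucket}"
SPELL_ROLE_ARN="${SPELL_ROLE_ARN:-arn:aws:iam::123456789012:role/example-spell-role}"

# Write a bucket policy granting the role list and read access.
# The set of actions is an assumption, not a confirmed Spell requirement.
cat > spell-read-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "SpellListBucket",
      "Effect": "Allow",
      "Principal": { "AWS": "${SPELL_ROLE_ARN}" },
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::${BUCKET_NAME}"
    },
    {
      "Sid": "SpellReadObjects",
      "Effect": "Allow",
      "Principal": { "AWS": "${SPELL_ROLE_ARN}" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::${BUCKET_NAME}/*"
    }
  ]
}
EOF

# To attach the policy, run the AWS CLI with credentials for the account
# that owns the bucket (note that this replaces any existing bucket policy):
#   aws s3api put-bucket-policy --bucket "$BUCKET_NAME" --policy file://spell-read-policy.json
```

The policy is split into two statements because s3:ListBucket applies to the bucket itself, while s3:GetObject applies to the objects inside it (the /* resource).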

Refer to the blog post "Attaching private cross-account S3 buckets to Spell" for an explanation of how this feature works and a detailed walkthrough showing it in action.

This feature is currently only supported on AWS.