GCP Set Up
For users on our Spell for Teams plan, we deploy Spell in your cloud and provide the same cluster management tools backing our own internal infrastructure. This means you can keep your data in your own GS buckets, perform runs on your own machines, and deploy models within your own cloud infrastructure.
This guide gets you started using Spell in your GCP account.
Setting up GCP
- Make sure you have a GCP account. If you don’t, you can create one here.
- Make sure you have the gcloud command line tools installed. If not, follow the instructions from Google’s help docs here.
- Make sure you have set up Application Default Credentials for external libraries to use. Run gcloud auth application-default login to set these credentials (for more information, refer to the docs).
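The prerequisites above can be sanity-checked before moving on. The following is a minimal sketch, not part of Spell: the helper name find_adc_file is hypothetical, and it assumes the default gcloud configuration directory on Linux or macOS (Windows and a custom CLOUDSDK_CONFIG directory are not handled).

```python
import os
import shutil
from pathlib import Path


def find_adc_file(env=None, home=None):
    """Return the path of the Application Default Credentials file, or None.

    `env` and `home` are injectable only to make the function easy to test;
    by default the real environment and home directory are used.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # An explicit GOOGLE_APPLICATION_CREDENTIALS path takes precedence.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit and Path(explicit).is_file():
        return Path(explicit)

    # Default file written by `gcloud auth application-default login`
    # on Linux and macOS (assumption: default gcloud config directory).
    default = home / ".config" / "gcloud" / "application_default_credentials.json"
    return default if default.is_file() else None


if __name__ == "__main__":
    print("gcloud on PATH:", shutil.which("gcloud") is not None)
    print("ADC file:", find_adc_file() or "not found")
```

If the ADC file is reported as not found, running gcloud auth application-default login again should create it.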
Setting up Spell
- Install Spell by running pip install --upgrade 'spell[cluster-gcp]'. Specifying [cluster-gcp] makes the installation include the dependencies required for GCP cluster deployment.
- Log in to the Spell CLI by running spell login.
Setting up GCP resources
Next, we’ll need to set up your GCP resources. We’ve made this easy with the spell cluster init gcp command.
Run spell cluster init gcp and follow the prompts:
This command will help you
- Set up an Google Storage bucket to store your run outputs in
- Setup a VPC network which Spell will spin up workers in to run your jobs
- Create a subnet in the VPC
- Setup a Service Account allowing Spell to spin up and down machines and access the GS bucket
Enter y to continue.
Enter a display name for this cluster within Spell:
Enter a name for your cluster.
---------------------------------------------
All of this will be done within your project '<project_name>' - continue? [Y/n]
Confirm that this GCP project is the right one. If you wish to isolate Spell resources in a separate project, set up that project before running this command.
The script will now create a service account that Spell will use to interact with resources in your project. First it creates a role with the necessary permissions:
---------------------------------------------
Creating role SpellAccess_1456882 with the following permissions:
compute.disks.create
compute.disks.list
compute.disks.resize
compute.globalOperations.get
compute.instances.create
compute.instances.delete
compute.instances.get
compute.instances.list
compute.instances.setLabels
compute.instances.setMetadata
compute.instances.setServiceAccount
compute.subnetworks.use
compute.subnetworks.useExternalIp
compute.zones.list
compute.regions.get
...
Then it creates a service account and assigns the role to it:
Assigning role SpellAccess_1456882 to service account spell-access-...@....iam.gserviceaccount.com...
Successfully set up service account spell-access-....@....iam.gserviceaccount.com
Next, we need to create the GS bucket that will store run outputs. We recommend creating a brand new bucket for this purpose:
We recommend using an empty GS Bucket for Spell outputs. Would you like to make a new bucket or use an existing (new, existing): new
Please enter a name for the GS Bucket Spell will create for run outputs [spell-my-cluster]:
Give a name to the GS bucket where Spell will store your run outputs and Jupyter workspaces. The name can contain lowercase letters, numbers, dashes, underscores, and periods, and must start and end with a letter or number. To learn more about bucket naming rules, see here.
Created your new bucket spell-my-cluster!
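A bucket name can be checked before it is typed into the prompt. The sketch below covers only a simplified subset of Google's bucket naming rules (it ignores the longer limit for dotted names and the ban on IP-address-like names); the function name bucket_name_problems is hypothetical and is not part of the Spell CLI.

```python
import re

# Lowercase letters, digits, dashes, underscores, and dots;
# the first and last characters must be a letter or digit.
_ALLOWED = re.compile(r"^[a-z0-9][a-z0-9._-]*[a-z0-9]$")


def bucket_name_problems(name):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not 3 <= len(name) <= 63:
        problems.append("length must be between 3 and 63 characters")
    if not _ALLOWED.match(name):
        problems.append(
            "use only lowercase letters, digits, '-', '_' and '.', "
            "starting and ending with a letter or digit"
        )
    if name.startswith("goog") or "google" in name:
        problems.append("must not start with 'goog' or contain 'google'")
    return problems


if __name__ == "__main__":
    for candidate in ["spell-my-cluster", "Spell_Bucket", "google-outputs"]:
        print(candidate, "->", bucket_name_problems(candidate) or "looks valid")
```

Bucket names are also globally unique across GCP, which a local check like this cannot verify.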
Next, the script will create a new Virtual Private Cloud (VPC) where worker machines will run your training jobs. First, confirm the region for your VPC:
All of this will be done within this project's region 'us-west2' - continue? [Y/n]:
Enter Y to continue.
Creating network...
[####################################] 100%
Created a new VPC/network with name gcp-cluster!
Creating firewall rule to allow ingress on ports [22, 2376, 9999]...
[####################################] 100%
Creating firewall rule to allow communication between instances within VPC...
[####################################] 100%
Firewall rules ready!
Creating subnetwork...
[####################################] 100%
Created a new subnet gcp7 within network gcp7 in region us-west2!
---------------------------------------------
Your cluster my-cluster is initialized! Head over to the web console to create machine types to execute your runs on - https://spell.ml/my-org/clusters/17
And you're done!
GCP limits
To create machines in this VPC, make sure your GCP quotas allow the machine types you want to use. If you've just set up your GCP account, some of your quotas may be set to 0, so you'll first need to request an increase before you can create machine types in Spell. GCP usually approves these requests quickly.
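One way to spot quotas that would block machine creation is to inspect the quotas list that gcloud compute regions describe <region> --format=json prints. The following is a minimal sketch that assumes this JSON has already been captured as a string; the helper name exhausted_quotas is hypothetical, and the numbers in the usage example are made up for illustration.

```python
import json


def exhausted_quotas(region_json, metrics=None):
    """Return the quota metrics with no remaining headroom.

    `region_json` is the JSON text describing one region, expected to hold a
    "quotas" list of objects with "metric", "limit", and "usage" fields.
    `metrics` optionally restricts the check to the given metric names.
    """
    region = json.loads(region_json)
    blocked = []
    for quota in region.get("quotas", []):
        if metrics is not None and quota["metric"] not in metrics:
            continue
        # A limit of 0, or usage that has reached the limit, leaves no room.
        if quota["limit"] - quota["usage"] <= 0:
            blocked.append(quota["metric"])
    return sorted(blocked)


if __name__ == "__main__":
    # Illustrative values only, not real account data.
    sample = json.dumps({
        "name": "us-west2",
        "quotas": [
            {"metric": "CPUS", "limit": 24.0, "usage": 4.0},
            {"metric": "NVIDIA_T4_GPUS", "limit": 0.0, "usage": 0.0},
        ],
    })
    print(exhausted_quotas(sample))
```

Any metric returned here is a candidate for a quota increase request before creating the matching machine type in Spell.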
Using a Custom Instance Service Account
You can interact with private GCP resources like Bigtable or Memorystore from within a run or workspace by setting a "Custom Instance Service Account" on your Spell cluster. All machines created within the cluster will have access to the permissions granted to that service account.
You can use the spell cluster set-instance-permissions command (documentation available here) to configure the GCP IAM permissions you'd like your Spell runs to have access to. You can then update the permissions attached to the configured service account directly in GCP at any time, and the permissions will be immediately reflected on your Spell machines. If you update your Spell cluster's custom instance service account to a new service account, you will have to destroy and recreate any existing machines, because a GCE instance's service account cannot be changed while the instance is running.
Permissions
Below is a list of permissions Spell will need from your GCP project.
GCP Permissions
compute.disks.create
compute.disks.list
compute.disks.resize
compute.globalOperations.get
compute.instances.create
compute.instances.delete
compute.instances.get
compute.instances.list
compute.instances.setLabels
compute.instances.setMetadata
compute.instances.setServiceAccount
compute.subnetworks.use
compute.subnetworks.useExternalIp
compute.zones.list
compute.regions.get
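To confirm that a custom role actually carries every permission in this list, the role can be exported with gcloud iam roles describe <role> --project=<project> --format=json and compared against it. The sketch below assumes that JSON is already available as a string; the function name missing_permissions is hypothetical and is not provided by Spell.

```python
import json

# The permissions listed in this section.
REQUIRED_PERMISSIONS = [
    "compute.disks.create",
    "compute.disks.list",
    "compute.disks.resize",
    "compute.globalOperations.get",
    "compute.instances.create",
    "compute.instances.delete",
    "compute.instances.get",
    "compute.instances.list",
    "compute.instances.setLabels",
    "compute.instances.setMetadata",
    "compute.instances.setServiceAccount",
    "compute.subnetworks.use",
    "compute.subnetworks.useExternalIp",
    "compute.zones.list",
    "compute.regions.get",
]


def missing_permissions(role_json):
    """Return the required permissions absent from a role description.

    `role_json` is the JSON text of an IAM role, whose granted permissions
    are expected under the "includedPermissions" key.
    """
    role = json.loads(role_json)
    granted = set(role.get("includedPermissions", []))
    return sorted(p for p in REQUIRED_PERMISSIONS if p not in granted)
```

An empty result means the role covers the whole list; any returned entries would need to be added to the role.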
Bucket Permissions
During spell cluster init gcp, we add the roles/storage.admin role to the newly created service account. Any bucket added using add-bucket will grant this service account the roles/storage.objectViewer role.
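The roles a service account holds on a bucket can be read from the bucket's IAM policy, for example as printed by gcloud storage buckets get-iam-policy gs://<bucket> --format=json. The following is a minimal sketch that assumes the policy JSON is already available as a string; the helper name roles_for_member and the service account email in the usage example are hypothetical placeholders.

```python
import json


def roles_for_member(policy_json, member):
    """Return the sorted roles that `member` holds in an IAM policy.

    `policy_json` is the JSON text of an IAM policy with a "bindings" list;
    each binding has a "role" and a list of "members". Service accounts
    appear as "serviceAccount:<email>".
    """
    policy = json.loads(policy_json)
    return sorted(
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    )


if __name__ == "__main__":
    # Hypothetical service account and policy, for illustration only.
    account = "serviceAccount:spell-access@example-project.iam.gserviceaccount.com"
    sample = json.dumps({
        "bindings": [
            {"role": "roles/storage.admin", "members": [account]},
            {"role": "roles/storage.objectViewer", "members": ["user:someone@example.com"]},
        ]
    })
    print(roles_for_member(sample, account))
```

For the output bucket, the result would be expected to include roles/storage.admin; for buckets attached with add-bucket, roles/storage.objectViewer.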
Service Account Access
Spell will gain access to the newly created service account through the following roles:
roles/iam.serviceAccountTokenCreator
roles/iam.serviceAccountUser