Product Release Changelog

Here at Spell, we’re releasing new features, fixing bugs, and updating documentation daily. This changelog is an account of user-facing changes updated on a monthly cadence. For more information please reach out at support@spell.ml.

December 2021

Update on recent Log4j vulnerabilities

On December 9th, 2021, Alibaba Cloud publicly disclosed Log4Shell (CVE-2021-44228), a zero-day vulnerability in Log4j that allows arbitrary code execution and affects hundreds of millions of devices. Spell proprietary code is entirely unaffected, and our team has conducted a thorough investigation of our other OSS dependencies. Spell today uses Kafka and Elasticsearch, both written in Java. Our Kafka version is unaffected by the vulnerability, and we’ve upgraded Elasticsearch to the latest secure version as of December 15th, 2021.

Other Feature Improvements

  • [Workspaces] Enable JupyterLab Git extension to streamline git credential input and git command usage from Workspaces
  • [Workspaces] Remove need to authenticate with spell (e.g. spell login) when running Spell commands in Jupyter Workspaces
  • [UX] The Create Run button on the empty runs page now autofills the example fields with the PyTorch MNIST example

Bug Fixes

  • [Run orchestration] Fixed bug causing private pip packages to fail installation
  • [Workspaces] Fixed bug in workspaces where git init was failing to execute, leading to build failures
  • [Docs] Additional documentation for connecting Jupyter workspaces to VSCode Remote

November 2021

Updates to default framework and Dockerfile

Please note that these changes may require migration work for compatibility; if your team needs an older version, contact us at support@spell.ml. We've updated the following frameworks and packages in the default image:

  • CUDA 10.1 to 11.3.1
  • PyTorch 1.8.1 to 1.10.0
  • TensorFlow 2.3.4 to 2.6.2
  • Horovod 0.21.3 to 0.23

Support full --pip specifications in workspace notebooks and model servers

Notebooks and model servers now support the full range of pip version specifications (for example, requests>=0.2,<4.1,!=3.1.2) as well as full Git specifications for pip (for example, git+http://repo/my_project.git#egg=SomeProject).
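
To make the meaning of a compound specifier such as >=0.2,<4.1,!=3.1.2 concrete, here is a minimal sketch in Python. It is not how pip or Spell evaluates specifiers: it assumes plain dotted-integer versions only and ignores the rest of the PEP 440 rules (pre-releases, wildcards, ~=). The helper names are hypothetical.

```python
import re

def parse_version(text):
    """Turn '3.1.2' into (3, 1, 2). Assumes plain dotted integers only."""
    return tuple(int(part) for part in text.split("."))

def pad(a, b):
    """Right-pad the shorter tuple with zeros so 4.1 compares equal to 4.1.0."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))

OPS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
    "!=": lambda a, b: a != b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}

def satisfies(version, specifier):
    """Return True if `version` meets every comma-separated clause."""
    candidate = parse_version(version)
    for clause in specifier.split(","):
        match = re.fullmatch(r"\s*(>=|<=|==|!=|>|<)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.group(1), parse_version(match.group(2))
        left, right = pad(candidate, bound)
        if not OPS[op](left, right):
            return False
    return True

spec = ">=0.2,<4.1,!=3.1.2"
print(satisfies("3.1.1", spec))  # True: inside the range and not excluded
print(satisfies("3.1.2", spec))  # False: explicitly excluded by !=3.1.2
print(satisfies("4.1", spec))    # False: the upper bound is exclusive
```

Every clause must hold at once, so the comma acts as a logical AND.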

Other Feature Improvements

  • [Run orchestration] Environment variables that start with SECRET are now anonymized in the web console and appear as <REDACTED>
  • [Run orchestration] The --framework parameter is now deprecated in the CLI and Python API. To use a specific ML framework, use the --pip or --conda-env flag, or use a custom Docker image
  • [Model servers] --instance-type is no longer a required option for node-group add, and will default to m5.large (AWS) or n1-standard-2 (GCP)
  • [Model servers] The --name option in node-group add is flagged for deprecation. The name is now passed as an argument to the command.
  • [Workspaces] Specified idle timeout now applies to idle Workspace terminal instances
  • [Docs] Documentation updates on secret environment variables, pip specifications
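
The redaction of SECRET-prefixed environment variables described in the list above can be pictured with a short Python sketch. This is an illustration only, not Spell's implementation: the function name, the sample variable names, and the case-sensitive prefix match are all assumptions.

```python
def redact_secrets(env, prefix="SECRET", placeholder="<REDACTED>"):
    """Return a copy of `env` that is safe to display.

    Values of variables whose names start with `prefix` are replaced by
    `placeholder`. The match is case-sensitive here, which is an assumption.
    """
    return {
        name: placeholder if name.startswith(prefix) else value
        for name, value in env.items()
    }

# Hypothetical environment for a run.
env = {"SECRET_API_KEY": "abc123", "BATCH_SIZE": "64"}
print(redact_secrets(env))
# {'SECRET_API_KEY': '<REDACTED>', 'BATCH_SIZE': '64'}
```

Only the displayed copy is changed; the original mapping, and therefore the value seen by the run itself, is left untouched.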

Bug Fixes

  • [Run orchestration] Added warning message for GCP machine name validation rules
  • [Model servers] Fixed bug for unintended behavior when kube-cluster create fails
  • [Model servers] Fix runtime exceptions for node-group delete
  • [Model servers] Fix eks-cluster-not-found exception when AWS profile does not have same region as kube-cluster
  • [Experiments] Fixed bug preventing bulk adding runs to experiments

October 2021

Model server update and QoL changes

Beginning this month, all newly created EKS clusters will be Kubernetes Version 1.18. Furthermore, we've made QoL improvements to kube-cluster subcommands: kube-cluster create no longer re-prompts for configuration values if --use-existing is used; kube-cluster create and kube-cluster update no longer prompt for a kubectl context if more than one kubectl context exists; kube-cluster create and kube-cluster update no longer rely on the user's current kubectl context remaining static.

Other Feature Improvements

  • [Run orchestration] Log throttling limits have been raised from 450 log lines every 15 seconds per run to 1000 log lines every 10 seconds per run, to reduce instances of dropped logs on user runs
  • [Run orchestration] Changed query logic for label filters, to speed up instances where organizations have 100+ labels
  • [Model Servers] Removed deprecated commands in spell cluster which affect model servers. From now on all model server related commands are part of spell kube-cluster.
  • [Model Servers] Speed up model server start up times by moving static pip installs to the default image instead of the templates
  • [Model Servers] Added --docs flag to spell server mounts and spell server models commands
  • [Model Servers] Added model file information in spell model describe modelName:modelVersion command
  • [Model Servers] Provide more human-readable errors when user attempts to initialize model servers before kube-cluster create
  • [Experiments] Add support for multiple aggregation columns for the same metric
  • [UX] Implemented mechanism to require users to initialize cluster during Spell for Team setup
  • [UX] Added tooltip to web console warning users of log throttling behavior under high volume
  • [Workspaces] Switched default terminal from sh to bash
  • [Website] Revamped and updated FAQs page
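
To put the log throttling change in the list above into numbers: 450 lines per 15 seconds averages 30 lines per second, while 1000 lines per 10 seconds averages 100 lines per second. The Python sketch below shows one way such a limit could behave, assuming a simple fixed-window counter per run. The changelog does not say which algorithm Spell uses, so the class and its behavior are hypothetical.

```python
class FixedWindowThrottle:
    """Allow at most `limit` log lines per `window` seconds for one run."""

    def __init__(self, limit=1000, window=10.0):
        self.limit = limit
        self.window = window
        self.window_start = None  # time at which the current window opened
        self.count = 0            # lines accepted in the current window

    def allow(self, now):
        """Return True if a line arriving at time `now` (seconds) is kept."""
        if self.window_start is None or now - self.window_start >= self.window:
            # Open a fresh window and reset the counter.
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False  # over the limit: this line would be dropped

old_rate = 450 / 15   # average lines per second under the old limit
new_rate = 1000 / 10  # average lines per second under the new limit

throttle = FixedWindowThrottle()
# A burst of 1200 lines arriving at the same instant.
kept = sum(throttle.allow(now=0.0) for _ in range(1200))
print(kept)                      # 1000
print(throttle.allow(now=10.0))  # True: a new window has started
```

Under this assumption, a burst larger than the limit loses its tail until the next window opens, which is the kind of dropped-log behavior the web console tooltip warns about.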

Bug Fixes

  • [Run orchestration] Fixed bug causing file mount timeouts in Azure
  • [Model Servers] Fixed display bugs in model filepath when using multiple model servers
  • [Model Servers] Fixed bug preventing users from updating parts of their model belonging to a private Github repo
  • [Model Servers] Fixed bug in listing model servers
  • [Model Servers] Fixed bug preventing mounting the base of a bucket in a model server
  • [Model Servers] Fixed bug causing unintended side effects from kube-cluster delete
  • [Private Machines] Fixed bug occasionally causing private machines to hang
  • [Hyper search] Fixed bug where labels were failing to display during hyperparameter searches
  • [Hyper search] Fixed bug causing misbehavior of spell hyper list
  • [Experiments] Fix graphical bugs in experiment chart rendering
  • [Workspaces] Fixed bug causing Jupyter workspace timeout after 5ms

September 2021

Google SSO

We’ve implemented Google SSO and OAuth for all users! With this update, users can conveniently and securely sign up and log in to Spell via their existing Google accounts. Additionally, users belonging to an org can take advantage of Google authentication features such as enabling 2-factor authentication for their org members. Existing users can choose to link their accounts to Google SSO or use our existing Spell authentication system.

Multiple models in model servers

We've added direct support for multi-model model servers as well as 0-model model servers. Relevant changes include updating spell server serve [model] [entrypoint] to spell server serve [model1, model2 ...] [entrypoint], and adding the spell server models add and spell server models rm commands.

JupyterLab 3.0 Extension Upgrade

Spell has upgraded from JupyterLab major version 2 to JupyterLab 3.0, released in January 2021. Among the many improvements, users can now directly pip install $JUPYTERLAB_EXTENSION_PACKAGE_NAME, and the extension will work the next time they launch JupyterLab.

Other improvements and fixes

  • [Runs, Workspaces, Model Servers] Added full support for requirements.txt specifications handling, including comments, multi-line requirements, and options such as --extra-index-url, --index-url, and --find-links
  • [Runs] Upgraded default AMI used for orchestration to include latest changes in Conda and Jupyter
  • [Runs] Improved error message when improper file is mounted
  • [UX] Various style updates to web console sidebar for improved readability
  • [UX] Fixed display bugs on empty-state model server pages when the user has no cluster set up
  • [UX] Consolidated user and billing information tabs for more efficient navigation
  • [UX] Graphical updates to web console tabs
  • [UX] On account creation, link directly to web login once the user validates the account by email
  • [UX] Redesigned billing module for developer accounts to fix inconsistent styling and display bugs
  • [UX] Display raw logs button in run page even when logs display module has not fully loaded
  • [UX] Fixed bug in accounts page that constantly displayed “missing token”
  • [Model servers] Add warnings for newly available kube cluster versions when running spell kube-cluster command
  • [Model servers] Rearranged model serving cURL params for easier editing
  • [Model servers] Changed model server web console headers to breadcrumb-style for improved readability
  • [Docs] Updated installation instructions on self-serve trial signup
  • [Docs] Refreshes and updates to documentation for Workspaces, Resources, Tensorboard / WandB integrations pages
  • [Docs] Small fixes to section headers and links for Quickstart and Workflows documentation
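
The requirements.txt handling mentioned at the top of the list above (comments, multi-line requirements, and options) can be illustrated with a small Python sketch. It is a simplified parser written for this example, not Spell's or pip's implementation, and the sample file contents and index URL are hypothetical.

```python
import re

# A comment starts with '#' at the beginning of a line or after whitespace,
# so a fragment such as '#egg=SomeProject' inside a URL is left alone.
COMMENT = re.compile(r"(^|\s+)#.*$")

def parse_requirements(text):
    """Split requirements text into (options, requirements) lists."""
    # Join physical lines that end with a backslash into one logical line.
    logical_lines, buffer = [], ""
    for line in text.splitlines():
        if line.endswith("\\"):
            buffer += line[:-1]
            continue
        logical_lines.append(buffer + line)
        buffer = ""
    if buffer:
        logical_lines.append(buffer)

    options, requirements = [], []
    for line in logical_lines:
        # Strip comments, then collapse runs of whitespace.
        line = " ".join(COMMENT.sub("", line).split())
        if not line:
            continue
        (options if line.startswith("-") else requirements).append(line)
    return options, requirements

sample = """\
# hypothetical requirements file
--extra-index-url https://example.com/simple
requests>=0.2,<4.1,!=3.1.2  # inline comment
numpy \\
    >=1.19
"""
options, requirements = parse_requirements(sample)
print(options)       # ['--extra-index-url https://example.com/simple']
print(requirements)  # ['requests>=0.2,<4.1,!=3.1.2', 'numpy >=1.19']
```

Continuation lines are joined before comments are stripped, so a comment on the last physical line of a multi-line requirement is still removed.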