Quickstart for Ascend on Databricks
Introduction

In this quickstart, you will set up Ascend on Databricks.
To complete this quickstart you need:
- An Ascend Instance
- A Databricks workspace with Unity Catalog enabled
- A terminal with:
  - The Databricks CLI installed (`brew tap databricks/tap && brew install databricks` using Homebrew)
  - `jq` installed (`brew install jq` using Homebrew)
You can use the GUIs for the setup below, though we recommend using the CLIs for automation and repeatability.
If you want to use separate Databricks workspaces per Ascend Environment, you will need to adjust the commands below. This guide uses one Databricks workspace across all Environments for simplicity while maintaining separation of Databricks Unity Catalog data and Databricks compute resources via separate Databricks service principals.
Ensure the output of each command is as expected before proceeding to the next step.
Set up the Databricks CLI
If you followed an Ascend how-to guide for Databricks setup, you already have the Databricks CLI set up and can skip this section. Ensure you have the default profile set up for the Databricks workspace you want to use.
Check if the Databricks CLI is set up:
```shell
DATABRICKS_WORKSPACE_URL=$(databricks auth env | jq -r '.env.DATABRICKS_HOST')
if [[ -n "$DATABRICKS_WORKSPACE_URL" && "$DATABRICKS_WORKSPACE_URL" != "null" ]]; then
  printf 'Using Databricks workspace:\n%s\n' "$DATABRICKS_WORKSPACE_URL"
fi
```
If the Databricks CLI is not set up, set the Databricks workspace URL:
```shell
DATABRICKS_WORKSPACE_URL=<your-databricks-workspace-url>
```
Open the Databricks workspace:
```shell
open $DATABRICKS_WORKSPACE_URL
```
In the Databricks UI, create a Personal Access Token (PAT) to configure the CLI:
```shell
databricks configure --host $DATABRICKS_WORKSPACE_URL
```
The Databricks CLI uses profiles to manage working with multiple Databricks workspaces.
Check for a Databricks Unity Catalog metastore

Check that the Databricks workspace has a Databricks Unity Catalog metastore assigned:

```shell
databricks metastores current
```
If a Databricks Unity Catalog metastore is not set on your Databricks workspace, follow one of our Databricks how-to setup guides before proceeding.
Create Databricks service principals for Ascend
Create one Databricks service principal for the Ascend Instance and one for each Ascend Environment.
Ascend Instance

Create a Databricks service principal for the Ascend Instance:

```shell
INSTANCE_SP_APP_ID=$(databricks service-principals create \
  --display-name "ascend-instance-sp" \
  | jq -r '.applicationId')
echo $INSTANCE_SP_APP_ID
```
Ascend Dev Environment

Create a Databricks service principal for the Ascend Dev Environment:

```shell
ENV_DEV_SP_APP_ID=$(databricks service-principals create \
  --display-name "ascend-env-dev-sp" \
  | jq -r '.applicationId')
echo $ENV_DEV_SP_APP_ID
```
Ascend Staging Environment

Create a Databricks service principal for the Ascend Staging Environment:

```shell
ENV_STAGING_SP_APP_ID=$(databricks service-principals create \
  --display-name "ascend-env-staging-sp" \
  | jq -r '.applicationId')
echo $ENV_STAGING_SP_APP_ID
```
Ascend Prod Environment

Create a Databricks service principal for the Ascend Prod Environment:

```shell
ENV_PROD_SP_APP_ID=$(databricks service-principals create \
  --display-name "ascend-env-prod-sp" \
  | jq -r '.applicationId')
echo $ENV_PROD_SP_APP_ID
```
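To confirm all four service principals exist, you can list them and filter on the shared `ascend-` name prefix. This is an optional check (a sketch; it assumes `jq` from the prerequisites and JSON output from the CLI):

```shell
# List service principals and keep only the Ascend ones created above,
# printing each display name with its application ID.
databricks service-principals list --output json \
  | jq -r '.[] | select(.displayName | startswith("ascend-")) | "\(.displayName)\t\(.applicationId)"'
```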
Create Databricks compute for the Ascend Instance

Create a Databricks warehouse for the Ascend Instance Store:

```shell
WH_ID_INSTANCE=$(databricks warehouses create \
  --cluster-size "2X-Small" \
  --auto-stop-mins 5 \
  --min-num-clusters 1 \
  --max-num-clusters 1 \
  --enable-photon \
  --enable-serverless-compute \
  --no-wait \
  --name "ascend-instance" \
  | jq -r '.id')
echo $WH_ID_INSTANCE
```
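Because the warehouse is created with `--no-wait`, it may still be provisioning when the command returns. You can check its state at any time (a quick sketch):

```shell
# Inspect the warehouse; state is e.g. STARTING, RUNNING, or STOPPED.
databricks warehouses get $WH_ID_INSTANCE --output json | jq -r '.state'
```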
Set up Databricks catalogs and schemas for Ascend

In this step, you will run SQL commands to create a catalog and schema for the Ascend Instance and each Ascend Data Plane, and grant permissions to the corresponding service principals.

Ensure you see "SUCCEEDED" as the `status.state` after running each SQL command below.
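This check can be scripted. A minimal sketch, assuming `WH_ID_INSTANCE` from the previous step; note the statement API is asynchronous, so a long-running statement may still report `PENDING` or `RUNNING` in the initial response, in which case you can poll it again:

```shell
# Run a trivial statement and extract status.state from the response.
SQL="select 1"
RESPONSE=$(databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}')
STATE=$(echo "$RESPONSE" | jq -r '.status.state')
echo "Statement state: $STATE"
[[ "$STATE" == "SUCCEEDED" ]] || echo "Statement has not succeeded yet" >&2
```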
If you prefer, you can run these SQL commands in a query editor or notebook in the Databricks UI. Using the CLI avoids copying and pasting Databricks service principal application IDs and reduces the risk of error.
First, verify the current metastore is set up correctly:

```shell
SQL="select current_metastore()"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
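The statement-execution pattern above repeats throughout this guide. If you prefer, you can wrap it in a small helper function; this is an optional convenience (the rest of this guide keeps the explicit form), and it uses `jq` to build the JSON payload so quoting inside the SQL cannot break the request:

```shell
# run_sql STATEMENT - execute a SQL statement on the instance warehouse.
# Assumes WH_ID_INSTANCE is set (see the compute step above).
run_sql() {
  local payload
  payload=$(jq -n --arg wid "$WH_ID_INSTANCE" --arg stmt "$1" \
    '{warehouse_id: $wid, statement: $stmt}')
  databricks api post /api/2.0/sql/statements --json "$payload"
}
```

For example, `run_sql "select current_metastore()"` is equivalent to the command above.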
Set up Ascend Instance catalog and schema

Set the variables for the Databricks Unity Catalog catalog and schema the Ascend Instance Store will use:

```shell
ASCEND_INSTANCE_CATALOG="ascend_instance_data"
ASCEND_INSTANCE_SCHEMA="instance_data"
```
Create a Databricks Unity Catalog catalog for the Ascend Instance Store to use:
```shell
SQL="CREATE CATALOG IF NOT EXISTS $ASCEND_INSTANCE_CATALOG"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
And a schema:
```shell
SQL="CREATE SCHEMA IF NOT EXISTS $ASCEND_INSTANCE_CATALOG.$ASCEND_INSTANCE_SCHEMA"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
Give the Ascend Instance's corresponding Databricks service principal access to the Databricks Unity Catalog catalog:
```shell
SQL="GRANT USE CATALOG ON CATALOG $ASCEND_INSTANCE_CATALOG TO \`$INSTANCE_SP_APP_ID\`"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
And to the schema:

```shell
SQL="GRANT ALL PRIVILEGES ON SCHEMA $ASCEND_INSTANCE_CATALOG.$ASCEND_INSTANCE_SCHEMA TO \`$INSTANCE_SP_APP_ID\`"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
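You can optionally verify the grants took effect. A sketch using the same statement API; the response should list the Instance service principal with its privileges:

```shell
# Show grants on the instance schema.
SQL="SHOW GRANTS ON SCHEMA $ASCEND_INSTANCE_CATALOG.$ASCEND_INSTANCE_SCHEMA"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```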
Set up Ascend Dev Data Plane catalog and schema

Set the variables for the Databricks Unity Catalog catalog and schema the Ascend Dev Data Plane will use:

```shell
ASCEND_DEV_CATALOG="ascend_data_plane_dev"
ASCEND_DEV_SCHEMA="default"
```
Create a Databricks Unity Catalog catalog for the Ascend Dev Data Plane to use:
```shell
SQL="CREATE CATALOG IF NOT EXISTS $ASCEND_DEV_CATALOG"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
And a schema:
```shell
SQL="CREATE SCHEMA IF NOT EXISTS $ASCEND_DEV_CATALOG.$ASCEND_DEV_SCHEMA"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
Give the Ascend Dev Environment's corresponding Databricks service principal access to the Databricks Unity Catalog catalog:
```shell
SQL="GRANT USE CATALOG ON CATALOG $ASCEND_DEV_CATALOG TO \`$ENV_DEV_SP_APP_ID\`"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
And to the schema:

```shell
SQL="GRANT ALL PRIVILEGES ON SCHEMA $ASCEND_DEV_CATALOG.$ASCEND_DEV_SCHEMA TO \`$ENV_DEV_SP_APP_ID\`"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
Set up Ascend Staging Data Plane catalog and schema

Set the variables for the Databricks Unity Catalog catalog and schema the Ascend Staging Data Plane will use:

```shell
ASCEND_STAGING_CATALOG="ascend_data_plane_staging"
ASCEND_STAGING_SCHEMA="default"
```
Create a Databricks Unity Catalog catalog for the Ascend Staging Data Plane to use:

```shell
SQL="CREATE CATALOG IF NOT EXISTS $ASCEND_STAGING_CATALOG"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
And a schema:
```shell
SQL="CREATE SCHEMA IF NOT EXISTS $ASCEND_STAGING_CATALOG.$ASCEND_STAGING_SCHEMA"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
Give the Ascend Staging Environment's corresponding Databricks service principal access to the Databricks Unity Catalog catalog:
```shell
SQL="GRANT USE CATALOG ON CATALOG $ASCEND_STAGING_CATALOG TO \`$ENV_STAGING_SP_APP_ID\`"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
And to the schema:

```shell
SQL="GRANT ALL PRIVILEGES ON SCHEMA $ASCEND_STAGING_CATALOG.$ASCEND_STAGING_SCHEMA TO \`$ENV_STAGING_SP_APP_ID\`"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
Set up Ascend Prod Data Plane catalog and schema

Set the variables for the Databricks Unity Catalog catalog and schema the Ascend Prod Data Plane will use:

```shell
ASCEND_PROD_CATALOG="ascend_data_plane_prod"
ASCEND_PROD_SCHEMA="default"
```
Create a Databricks Unity Catalog catalog for the Ascend Prod Data Plane to use:
```shell
SQL="CREATE CATALOG IF NOT EXISTS $ASCEND_PROD_CATALOG"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
And a schema:
```shell
SQL="CREATE SCHEMA IF NOT EXISTS $ASCEND_PROD_CATALOG.$ASCEND_PROD_SCHEMA"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
Give the Ascend Prod Environment's corresponding Databricks service principal access to the Databricks Unity Catalog catalog:
```shell
SQL="GRANT USE CATALOG ON CATALOG $ASCEND_PROD_CATALOG TO \`$ENV_PROD_SP_APP_ID\`"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
And to the schema:

```shell
SQL="GRANT ALL PRIVILEGES ON SCHEMA $ASCEND_PROD_CATALOG.$ASCEND_PROD_SCHEMA TO \`$ENV_PROD_SP_APP_ID\`"
databricks api post /api/2.0/sql/statements --json \
  '{"warehouse_id": "'"$WH_ID_INSTANCE"'", "statement": "'"$SQL"'"}'
```
Create Databricks compute for Ascend Data Planes

Create a Databricks cluster for each Ascend Data Plane.

We recommend creating the all-purpose cluster in the Databricks UI so you can customize it, then copying the JSON from the UI for reuse in the CLI or other automation.
Choose the cluster node type ID

The Databricks node type ID corresponds to cloud-specific compute types. Set `NODE_TYPE_ID` for your cloud provider:

AWS:

```shell
NODE_TYPE_ID="m5.large"
```

Azure:

```shell
NODE_TYPE_ID="Standard_D4ds_v5"
```

GCP:

```shell
NODE_TYPE_ID="n1-standard-4"
```
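With `NODE_TYPE_ID` set, a minimal CLI sketch of creating a cluster for the Dev Data Plane follows. The cluster name, Spark version, and sizing here are illustrative placeholders, not recommendations; adjust them to your workload, and list valid runtime versions with `databricks clusters spark-versions`:

```shell
# Create a small all-purpose cluster for the Dev Data Plane.
# spark_version and num_workers are example values only.
databricks clusters create --json '{
  "cluster_name": "ascend-data-plane-dev",
  "spark_version": "15.4.x-scala2.12",
  "node_type_id": "'"$NODE_TYPE_ID"'",
  "num_workers": 1,
  "autotermination_minutes": 30
}'
```

Repeat for the Staging and Prod Data Planes with their own cluster names.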