Version: 3.0.0

Profiles

Profiles customize how Projects and Flows behave across different Environments (e.g., Development, Staging, Production). They manage settings that influence data processing, connection handling, and resource allocation.

  • Environment Customization: Adjust settings for different environments, ensuring consistent behavior without altering the codebase.
  • Operational Flexibility: Maintain a single codebase while adapting parameters to fit specific stages or conditions, reducing duplication and enhancing efficiency.

Benefits

  • Consistency: Centralized settings ensure predictable workflow behavior across environments.
  • Flexibility: Adjust workflows quickly without redeploying or modifying core components.
  • Simplified Management: Consolidate configuration details, making maintenance and updates easier.

How Profiles Work

There are two main types of settings defined in your Profile:

  1. Default Values: Profiles can specify default settings for each Flow in a Project. For example, you can specify the default Connection to use as the Data Plane for all Flows in a Project. To use a different Data Plane Connection for specific Flows, match those Flows by name with a regular expression. Each default configuration has a kind (the kind of resource being configured), a name (used to match the resources to configure), and a spec (the configuration applied to the matched resources).

  2. Runtime Parameters: These are used to adapt the runtime behavior of your Flow. Parameters are defined as key-value pairs within your Profile, under the parameters field.

Here is an example of a Profile with both Parameters and Defaults. In this example, Snowflake Connection parameters are set for a development Snowflake environment and the default Data Plane for all Flows in this Profile is set to the snowflake Connection.

dev-snowflake.yaml
profile:
  parameters:
    vault_name: dev-vault
    snowflake_database: dev-db
    snowflake_user: dev-user
    snowflake_schema: dev-schema
    snowflake_warehouse: dev-warehouse

  defaults:
    - kind: Flow
      name:
        regex: .*
      spec:
        data_plane:
          connection_name: snowflake
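
The name field does not have to match every Flow. The sketch below shows one way the regular-expression matching described earlier might be used to give two groups of Flows different Data Planes. The Flow name patterns (ingest_.* and report_.*) and the second Connection name (snowflake_reporting) are hypothetical and are not part of the example Project.

```yaml
# Sketch only: the patterns and snowflake_reporting are illustrative assumptions.
profile:
  defaults:
    # Flows whose names match ingest_.* use the snowflake Connection.
    - kind: Flow
      name:
        regex: ingest_.*
      spec:
        data_plane:
          connection_name: snowflake
    # Flows whose names match report_.* use a second, hypothetical Connection.
    - kind: Flow
      name:
        regex: report_.*
      spec:
        data_plane:
          connection_name: snowflake_reporting
```

The two patterns here do not overlap. How entries with overlapping patterns are resolved is not covered on this page, so check that behavior before combining a catch-all .* entry with more specific ones.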

Use Cases

CI/CD with Profiles

Because you can create a separate Profile for each stage of the Software Development Lifecycle (SDLC), Profiles let you promote Flows from your Ascend Development Environment to Staging or Production without changing the Flows themselves. The most common scenario is using a Profile to parameterize a Connection, so that Flows read from development data sources in Development and from production data sources in Production.

Workspaces or Deployments run in a single Ascend Environment and are bound by the security limitations of that Environment. For example, a Workspace running in Development cannot access resources in Production. A Profile determines which resources are used. The Development Profile may, for instance, name the Secrets Vault from which secrets are retrieved or the S3 bucket to read from.

Profiles and Environments

It is important to remember that while Profiles can determine what resources your Flows are connecting to, access to those resources is still constrained by the permissions granted to the Environment where your Workspace or Deployment is running.

Profiles at Runtime

Let's walk through how Profiles work in a simple example where we have a Project with a Snowflake Connection, and a Development and Production Profile.

my-project
├── connections
│   └── snowflake.yaml
├── flows
├── profiles
│   ├── dev-snowflake.yaml
│   └── prod-snowflake.yaml
└── vaults
    └── vault.yaml

Within this Project, there is a snowflake.yaml Connection, where the values for the Connection are parameterized.

snowflake.yaml
connection:
  snowflake:
    account: snowflakeaccount
    database: ${parameters.snowflake_database}
    user: ${parameters.snowflake_user}
    schema: ${parameters.snowflake_schema}
    password: ${secret.vault.snowflake_password}
    warehouse: ${parameters.snowflake_warehouse}

There is also an Azure Key Vault defined in vault.yaml, where the vault name is parameterized. This means that the same vault definition can be used for both development and production vaults.

vault.yaml
vault:
  azure_key_vault:
    vault_name: ${parameters.vault_name}

Finally, there are two Profiles, dev-snowflake.yaml and prod-snowflake.yaml. For both Profiles, Parameters have been set for the Snowflake Connection and the vault name. The default Data Plane has also been set to use the snowflake.yaml Connection in both Profiles.

dev-snowflake.yaml
profile:
  parameters:
    vault_name: dev-vault
    snowflake_database: dev-db
    snowflake_user: dev-user
    snowflake_schema: dev-schema
    snowflake_warehouse: dev-warehouse

  defaults:
    - kind: Flow
      name:
        regex: .*
      spec:
        data_plane:
          connection_name: snowflake

prod-snowflake.yaml
profile:
  parameters:
    vault_name: prod-vault
    snowflake_database: prod-db
    snowflake_user: prod-user
    snowflake_schema: prod-schema
    snowflake_warehouse: prod-warehouse

  defaults:
    - kind: Flow
      name:
        regex: .*
      spec:
        data_plane:
          connection_name: snowflake

During the Development phase, a Workspace is created that uses the dev-snowflake Profile. When a Flow is run in this Workspace, the following will happen:

  1. The parameter references in the snowflake.yaml Connection are replaced with the values from the dev-snowflake Profile. The Snowflake database and schema will therefore be dev-db and dev-schema, the Snowflake user will be dev-user, and the Snowflake warehouse will be dev-warehouse.
  2. The password for the Snowflake user will be retrieved from the dev-vault vault, as that is the value of the vault_name parameter in the dev-snowflake Profile.
  3. The Flow is then run using the snowflake Connection as the Data Plane.
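
To make the substitution in these steps concrete, the block below shows what snowflake.yaml effectively looks like once the dev-snowflake values have been applied. It is an illustration of the result derived from the files above, not a file that Ascend is known to write anywhere.

```yaml
# Illustration only: the effective Connection under the dev-snowflake Profile.
connection:
  snowflake:
    account: snowflakeaccount      # not parameterized, same in every Profile
    database: dev-db               # from snowflake_database
    user: dev-user                 # from snowflake_user
    schema: dev-schema             # from snowflake_schema
    # Still a secret reference: the value is looked up in the vault named
    # dev-vault (the vault_name parameter) rather than stored in the file.
    password: ${secret.vault.snowflake_password}
    warehouse: dev-warehouse       # from snowflake_warehouse
```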

When changes have been made, they can be pushed to the Production Deployment. This Deployment will use the prod-snowflake Profile, and the Flow will run with the parameters defined in that Profile.

Isolated Workspaces for Developers

When you have multiple developers working on the same Project in Ascend using the same Data Plane, it is important to ensure that they are not writing over each other's data when running their Flows. Using Profiles, you can parameterize the Data Plane Connection for each developer, so that, when they run a Flow, the data is written to a location that is isolated from other developers. In a Snowflake Data Plane, for example, this can be separate schemas for each user.

A typical profile in this scenario might look like this:

dev-snowflake-jane.yaml
profile:
  parameters:
    vault_name: dev-vault
    snowflake_database: dev-db
    snowflake_user: dev-user
    snowflake_schema: jane-schema
    snowflake_warehouse: dev-warehouse

  defaults:
    - kind: Flow
      name:
        regex: .*
      spec:
        data_plane:
          connection_name: snowflake

In this example, when Jane runs a Flow in her Workspace using the dev-snowflake-jane Profile, the data is written to jane-schema, a schema that no other developer's Profile points to.
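
A second developer would get a Profile that differs only in the schema parameter. The sketch below assumes a hypothetical developer named John, a hypothetical file name dev-snowflake-john.yaml, and a hypothetical schema john-schema. Everything else is copied from Jane's Profile.

```yaml
# Hypothetical file: dev-snowflake-john.yaml
profile:
  parameters:
    vault_name: dev-vault
    snowflake_database: dev-db
    snowflake_user: dev-user
    snowflake_schema: john-schema   # the only value that differs from Jane's Profile
    snowflake_warehouse: dev-warehouse

  defaults:
    - kind: Flow
      name:
        regex: .*
      spec:
        data_plane:
          connection_name: snowflake
```

Because both Profiles share the same database, user, and warehouse, the developers use the same Data Plane while their Flow outputs land in separate schemas.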

Parameterized Components

While the above example shows how to parameterize Connections, you can also parameterize Components themselves. Any field in the YAML definition of a component can be parameterized. For example, you can limit the amount of data a Flow ingests by setting a parameter for the glob pattern used to ingest data from S3, like so:

s3-read-component.yaml
component:
  read:
    connection: lake_on_s3
    s3:
      path: listing/binary/
      include:
        - glob: ${parameters.s3_glob_pattern}

In your Profile, you can set the s3_glob_pattern to **/*.csv to ingest all CSV files in the listing/binary/ directory, or set it to **/year=2024/month=01/**/*.csv to only ingest a subset of the CSV files.
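
As a sketch, the two settings could be written as follows. Which Profile carries which pattern is an assumption made for illustration. The quotes are needed because YAML reads an unquoted value that starts with * as an alias.

```yaml
# Assumed: a Profile that ingests every CSV file under the path.
profile:
  parameters:
    s3_glob_pattern: "**/*.csv"
---
# Assumed: a Profile that ingests only the January 2024 partition.
profile:
  parameters:
    s3_glob_pattern: "**/year=2024/month=01/**/*.csv"
```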

Conclusion

Profiles let you parameterize your Connections, Flows, and Components, and customize how Flows behave across different Environments. With them, a single Project can move from Development to Production, and several developers can work on it in isolation, without changes to the Flows themselves.