Profiles
Profiles customize how Projects and Flows behave across different Environments (e.g., Development, Staging, Production). They manage settings that influence data processing, connection handling, and resource allocation.
- Environment Customization: Adjust settings for different environments, ensuring consistent behavior without altering the codebase.
- Operational Flexibility: Maintain a single codebase while adapting parameters to fit specific stages or conditions, reducing duplication and enhancing efficiency.
Benefits
- Consistency: Centralized settings ensure predictable workflow behavior across environments.
- Flexibility: Adjust workflows quickly without redeploying or modifying core components.
- Simplified Management: Consolidate configuration details, making maintenance and updates easier.
How Profiles Work
There are two main types of settings defined in your Profile:
- Default Values: Profiles can specify default settings for each Flow in a project. For example, you can specify the default Connection to use as your Data Plane for all flows in a project. If you'd like to use specific Data Plane Connections for specific Flows, you can do so using regular expressions. Each default configuration has a kind (which determines the kind of resource being configured), a name (which is used to match the resource to be configured), and a spec (which contains the configuration for the resource).
- Runtime Parameters: These are used to adapt the runtime behavior of your Flow. Parameters are defined as key-value pairs within your Profile, under the parameters field.
Here is an example of a Profile with both Parameters and Defaults. In this example, Snowflake Connection parameters are set for a development Snowflake environment, and the default Data Plane for all Flows in this Profile is set to the snowflake Connection.
profile:
  parameters:
    vault_name: dev-vault
    snowflake_database: dev-db
    snowflake_user: dev-user
    snowflake_schema: dev-schema
    snowflake_warehouse: dev-warehouse
  defaults:
    - kind: Flow
      name:
        regex: .*
      spec:
        data_plane:
          connection_name: snowflake
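How a default is matched to a Flow by kind and name can be pictured with a short Python sketch. This is an illustration only, not Ascend's implementation: the function name matching_specs, the second default (the sales_.* regex and the snowflake_sales Connection), and the choice to require the regex to match the whole Flow name are all assumptions. The sketch also does not model how overlapping defaults are prioritized.

```python
import re

# Defaults as they might look after loading the Profile YAML.
# The first entry mirrors the example Profile; the second is hypothetical.
defaults = [
    {
        "kind": "Flow",
        "name": {"regex": ".*"},
        "spec": {"data_plane": {"connection_name": "snowflake"}},
    },
    {
        "kind": "Flow",
        "name": {"regex": "sales_.*"},
        "spec": {"data_plane": {"connection_name": "snowflake_sales"}},
    },
]


def matching_specs(defaults, kind, name):
    """Return the spec of every default whose kind and name regex match.

    Assumption: the regex must match the entire resource name.
    """
    return [
        d["spec"]
        for d in defaults
        if d["kind"] == kind and re.fullmatch(d["name"]["regex"], name)
    ]


def matched_connections(flow_name):
    """List the Data Plane Connection names of all defaults matching a Flow."""
    specs = matching_specs(defaults, "Flow", flow_name)
    return [s["data_plane"]["connection_name"] for s in specs]


if __name__ == "__main__":
    for flow in ("orders", "sales_daily"):
        print(flow, matched_connections(flow))
```

In this sketch, a Flow named orders matches only the catch-all default, while sales_daily matches both entries. Which of several matching defaults takes effect is left open here.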
Use Cases
CI/CD with Profiles
Profiles let you promote your Flows from your Ascend Development Environment to your Staging or Production Environment, because you can create a separate Profile for each stage in the Software Development Lifecycle (SDLC). The most common scenario is using a Profile to parameterize a Connection, so that you read from development data sources in Development and from production data sources in Production.
Workspaces or Deployments run in a single Ascend Environment and are bound by the security limitations of that Environment. For example, a Workspace running in Development cannot access resources in Production. A Profile can be used to determine what those resources are. For example, the Profile for Development may specify the name of the Secrets Vault to use to get secrets, or it may specify the specific S3 bucket to read from.
It is important to remember that while Profiles can determine what resources your Flows are connecting to, access to those resources is still constrained by the permissions granted to the Environment where your Workspace or Deployment is running.
Profiles at Runtime
Let's walk through how Profiles work in a simple example where we have a Project with a Snowflake Connection, and a Development and Production Profile.
my-project
├── connections
│   └── snowflake.yaml
├── flows
├── profiles
│   ├── dev-snowflake.yaml
│   └── prod-snowflake.yaml
└── vaults
    └── vault.yaml
Within this Project, there is a snowflake.yaml Connection, where the values for the Connection are parameterized.
connection:
  snowflake:
    account: snowflakeaccount
    database: ${parameters.snowflake_database}
    user: ${parameters.snowflake_user}
    schema: ${parameters.snowflake_schema}
    password: ${secret.vault.snowflake_password}
    warehouse: ${parameters.snowflake_warehouse}
There is also an Azure Key Vault defined in vault.yaml, where the vault name is parameterized. This means that the same vault definition can be used for both development and production vaults.
vault:
  azure_key_vault:
    vault_name: ${parameters.vault_name}
Finally, there are two Profiles, dev-snowflake.yaml and prod-snowflake.yaml. For both Profiles, Parameters have been set for the Snowflake Connection and the vault name. The default Data Plane has also been set to use the snowflake.yaml Connection in both Profiles.
# dev-snowflake.yaml
profile:
  parameters:
    vault_name: dev-vault
    snowflake_database: dev-db
    snowflake_user: dev-user
    snowflake_schema: dev-schema
    snowflake_warehouse: dev-warehouse
  defaults:
    - kind: Flow
      name:
        regex: .*
      spec:
        data_plane:
          connection_name: snowflake

# prod-snowflake.yaml
profile:
  parameters:
    vault_name: prod-vault
    snowflake_database: prod-db
    snowflake_user: prod-user
    snowflake_schema: prod-schema
    snowflake_warehouse: prod-warehouse
  defaults:
    - kind: Flow
      name:
        regex: .*
      spec:
        data_plane:
          connection_name: snowflake
During the Development phase, a Workspace is created that uses the dev-snowflake Profile. When a Flow is run in this Workspace, the following happens:
- The parameter placeholders in the snowflake.yaml Connection are replaced with the values from the dev-snowflake Profile. The Flow therefore uses the Snowflake database dev-db, the schema dev-schema, the user dev-user, and the warehouse dev-warehouse.
- The password for the Snowflake user is retrieved from the dev-vault vault, as that is the value of the vault_name parameter in the dev-snowflake Profile.
- The Flow is then run using the snowflake Connection as the Data Plane.
When changes have been made, they can be pushed to the Production Deployment. This Deployment will use the prod-snowflake Profile, and the Flow will run with the parameters defined in that Profile.
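The substitution step in this walkthrough can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Ascend's actual resolver: the resolve function and the PLACEHOLDER pattern are hypothetical names, the dictionaries stand in for the parsed YAML files, and secret placeholders are deliberately left untouched because this sketch does not model how secrets are fetched from a vault.

```python
import re

# Matches placeholders of the form ${parameters.<name>}.
PLACEHOLDER = re.compile(r"\$\{parameters\.([A-Za-z0-9_]+)\}")


def resolve(value, parameters):
    """Recursively replace ${parameters.<name>} placeholders in strings.

    Raises KeyError if a placeholder names a parameter the Profile lacks.
    """
    if isinstance(value, dict):
        return {k: resolve(v, parameters) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve(v, parameters) for v in value]
    if isinstance(value, str):
        return PLACEHOLDER.sub(lambda m: str(parameters[m.group(1)]), value)
    return value


# Stand-in for the parsed snowflake.yaml Connection.
connection = {
    "connection": {
        "snowflake": {
            "account": "snowflakeaccount",
            "database": "${parameters.snowflake_database}",
            "user": "${parameters.snowflake_user}",
            "schema": "${parameters.snowflake_schema}",
            "password": "${secret.vault.snowflake_password}",
            "warehouse": "${parameters.snowflake_warehouse}",
        }
    }
}

# Parameter sections of the two Profiles.
dev_parameters = {
    "vault_name": "dev-vault",
    "snowflake_database": "dev-db",
    "snowflake_user": "dev-user",
    "snowflake_schema": "dev-schema",
    "snowflake_warehouse": "dev-warehouse",
}
prod_parameters = {
    "vault_name": "prod-vault",
    "snowflake_database": "prod-db",
    "snowflake_user": "prod-user",
    "snowflake_schema": "prod-schema",
    "snowflake_warehouse": "prod-warehouse",
}

# The same Connection definition resolves differently per Profile.
dev = resolve(connection, dev_parameters)
prod = resolve(connection, prod_parameters)

if __name__ == "__main__":
    print(dev["connection"]["snowflake"])
    print(prod["connection"]["snowflake"])
```

The point of the sketch is that the Connection definition never changes. Only the parameter set supplied by the active Profile differs between the Workspace and the Deployment.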
Isolated Workspaces for Developers
When multiple developers work on the same Project in Ascend using the same Data Plane, it is important to ensure that they do not overwrite each other's data when running their Flows. Using Profiles, you can parameterize the Data Plane Connection for each developer, so that when they run a Flow, the data is written to a location that is isolated from other developers. In a Snowflake Data Plane, for example, this can be a separate schema for each user.
A typical profile in this scenario might look like this:
profile:
  parameters:
    vault_name: dev-vault
    snowflake_database: dev-db
    snowflake_user: dev-user
    snowflake_schema: jane-schema
    snowflake_warehouse: dev-warehouse
  defaults:
    - kind: Flow
      name:
        regex: .*
      spec:
        data_plane:
          connection_name: snowflake
In this example, when Jane runs a Flow in her Workspace using the dev-snowflake-jane Profile, the data is written to jane-schema, a Snowflake schema that is isolated from those of other developers.
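Because per-developer Profiles differ from the shared development Profile only in the schema parameter, they can be derived mechanically. The Python sketch below shows one way to do that. It is a hypothetical helper, not an Ascend feature: the developer_profile function and the <developer>-schema naming convention (inferred from jane-schema above) are assumptions, and the dictionary stands in for the parsed Profile YAML.

```python
import copy

# Stand-in for the parsed shared development Profile.
base_profile = {
    "profile": {
        "parameters": {
            "vault_name": "dev-vault",
            "snowflake_database": "dev-db",
            "snowflake_user": "dev-user",
            "snowflake_schema": "dev-schema",
            "snowflake_warehouse": "dev-warehouse",
        },
        "defaults": [
            {
                "kind": "Flow",
                "name": {"regex": ".*"},
                "spec": {"data_plane": {"connection_name": "snowflake"}},
            }
        ],
    }
}


def developer_profile(base, developer):
    """Copy the base Profile and point it at a developer-specific schema.

    Assumption: schemas are named "<developer>-schema".
    """
    profile = copy.deepcopy(base)  # leave the shared Profile untouched
    profile["profile"]["parameters"]["snowflake_schema"] = f"{developer}-schema"
    return profile


jane = developer_profile(base_profile, "jane")

if __name__ == "__main__":
    print(jane["profile"]["parameters"])
```

Every other parameter, including the database, warehouse, and vault, stays shared, so developers are isolated only at the schema level.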
Parameterized Components
While the above example shows how to parameterize Connections, you can also parameterize Components themselves. Any field in the YAML definition of a Component can be parameterized. For example, you can limit the amount of data a Flow ingests by setting a parameter for the glob pattern used to ingest data from S3, like so:
component:
  read:
    connection: lake_on_s3
    s3:
      path: listing/binary/
      include:
        - glob: ${parameter.s3_glob_pattern}
In your Profile, you can set the s3_glob_pattern parameter to **/*.csv to ingest all CSV files in the listing/binary/ directory, or set it to **/year=2024/month=01/**/*.csv to only ingest a subset of the CSV files.
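To see how the two glob patterns narrow the set of ingested files, the following Python sketch applies them to a handful of made-up object paths relative to listing/binary/. It is only an approximation: the glob_to_regex translator is a hypothetical helper that treats **/ as "zero or more directories" and * as "anything except a slash", and the exact glob semantics Ascend applies may differ. The file names are invented for illustration.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob into a compiled regex.

    Assumed semantics: "**/" spans zero or more directories, "*" matches
    within one path segment, "?" matches a single non-slash character.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:.*/)?")
            i += 3
        elif pattern.startswith("**", i):
            parts.append(r".*")
            i += 2
        elif pattern[i] == "*":
            parts.append(r"[^/]*")
            i += 1
        elif pattern[i] == "?":
            parts.append(r"[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def select(paths, pattern):
    """Return the paths that the glob pattern matches in full."""
    regex = glob_to_regex(pattern)
    return [p for p in paths if regex.fullmatch(p)]


# Hypothetical object keys under listing/binary/.
files = [
    "year=2024/month=01/day=01/a.csv",
    "year=2024/month=02/day=01/b.csv",
    "year=2023/month=01/c.csv",
    "notes/readme.txt",
]

all_csv = select(files, "**/*.csv")
january_2024 = select(files, "**/year=2024/month=01/**/*.csv")

if __name__ == "__main__":
    print(all_csv)
    print(january_2024)
```

With these sample paths, the broad pattern selects all three CSV files, while the narrower pattern selects only the file under year=2024/month=01.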
Conclusion
Profiles are a powerful feature in Ascend that allow you to parameterize your Flows and Components, and to customize how your Flows behave across different Environments. They are a key part of the Ascend development experience, and are used in a variety of ways to streamline the development process.