Version: 3.0.0

Concepts

Ascend is a cloud-based, unified data engineering platform that simplifies the development, deployment, and management of data pipelines through data flows and automations. By providing a comprehensive suite of tools, Ascend enables users to build scalable, reliable, secure, and efficient data flows that serve data-driven business operations, insights, and innovation.

To learn the core concepts of Ascend, it helps to:

  • learn the vocabulary that describes how you work in Ascend;
  • explore the platform in a logical sequence, starting with the foundational building blocks you will use every day; and
  • proceed to more advanced topics.
tip

We recommend that you go through the Quickstart before diving into the conceptual guide. This will provide practical context that will make it easier to understand the concepts discussed here.

Foundational Concepts

Before diving into Ascend-specific concepts, it's helpful to understand some foundational data engineering principles:

Language of the Data Engineer

tip

What are the basic concepts/words you already use every day when you work with data pipelines?

When you work with data, do you:

  • create projects for your data pipelines to live in?
  • connect to systems to read and write data?
  • transform your data using SQL and/or Python to produce valuable outcomes for your organization?
  • accomplish a task by running arbitrary code in a sequence or graph?
  • connect things together into a sequence or flow of events that must happen in a certain order?

If any of the above sound similar to what you do every day, you'll be right at home with Ascend. Our language is derived from the basic concepts you already know and use.

Data Pipeline and Data Flow

A data pipeline puts together all the pieces needed to ensure that data flows through the systems that connect people, processes, software, and machines.

As you know, data pipelines often include much more than just data, such as business process/workflow-like logic, complex rules for orchestration/automation, scheduling, event-driven behaviors, and more.

tip

You will often see the phrases "data pipeline" and "data flow" used interchangeably in our industry. At Ascend, we distinguish between "data pipelines" and "data flows" to emphasize the most important concept: flow.

A data pipeline describes the thing you assemble from all the machinery you put together. It is correct to say you build data pipelines with Ascend. It is even more powerful to say that you enable data and process flow in (and between) your organizations with Ascend.

Flow is the most important and ultimate function of data pipelines.

Declarative Programming

Ascend is a declarative programming platform that allows users to combine simple no-code or low-code configurations defined in YAML files with code written in Python, SQL, or other languages to create complex data processing flows.

Declarative programming relies on simple building blocks, which we often refer to as primitives in Ascend. As you read on, you will see that these primitives form a simple language that you can use to define your data flows and automations quickly and easily. With Ascend, you entrust the complex orchestration, automation, and execution tasks to the platform and focus your effort on the tasks of highest value: your business logic, process flow, and transformation code.

This is the ultimate goal of declarative programming, and of Ascend in general: describe what you want the system to do in simple terms, write only the code you need to write, and let the platform's automation handle the rest.

Flex Code

Ascend uses the term Flex Code to describe writing "just the right amount" of SQL or Python code to achieve the desired functionality for your flow and/or automation.

Ascend is not a visual drag-and-drop builder tool, by design. Ascend is designed for analytics, data and AI/ML/LLM engineers, data scientists and other technical users who know how to perform basic coding tasks in SQL and Python.

Basic Ascend Resources

Ascend's core primitives are referred to as resources, and are the building blocks that you will use to build your flows every day.

Project

A Project organizes all related artifacts for a specific data engineering initiative. It acts as a container for connections, flows, components, code, configurations, and more, providing a structured environment for development, testing, and deployment.

Connection

A Connection holds the connection details and credentials required to interact with an external data source or destination. Connections are used to access, explore, read from, and write to external systems such as databases, cloud storage, APIs, and more.

Flow

A Flow is a single unit of execution: a sequence of operations that reads, transforms, and/or writes data, or executes arbitrary code as a task. Flows are composed of Components, which represent individual processing steps.

A flow is represented as a directed acyclic graph (DAG) that defines the dependencies between components and the order in which they execute. In other words, some steps may run in parallel, some in sequence, and some only after other steps have completed. Data and control flow in a typical data pipeline never loops back on itself, which is why a DAG representation is used.
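To make the DAG idea concrete, here is a minimal Python sketch that groups components into execution stages using the standard library's `graphlib`. It is an illustration only, not how Ascend schedules work internally; the component names and the `execution_stages` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical components; each key maps to the components it depends on.
dependencies = {
    "read_orders": set(),
    "read_customers": set(),
    "join_orders_customers": {"read_orders", "read_customers"},
    "write_report": {"join_orders_customers"},
}

def execution_stages(deps):
    """Group components into stages; components in one stage can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph contains a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        stages.append(ready)
        sorter.done(*ready)
    return stages

stages = execution_stages(dependencies)
for number, stage in enumerate(stages, start=1):
    print(f"stage {number}: {', '.join(stage)}")
```

The two read components land in the same stage because neither depends on the other, while the join and the write each wait for their upstream steps. A graph with a cycle is rejected with `CycleError`, which is the acyclic requirement in practice.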

Component

A Component is a reusable building block that encapsulates a specific data processing operation, such as reading data from a source, transforming data, or writing data to a destination. Components can be used within Flows to define the processing logic and dependencies of a data pipeline.

There are several types and subtypes of components in Ascend. The basic component types include:

  • Read: Read data from a source
  • Transform: Transform data
  • Write: Write data to a destination
  • Task: Execute arbitrary code
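As a rough mental model, the four basic component types can be pictured as four kinds of functions chained together. The sketch below is plain Python with hypothetical names and in-memory stand-ins for the source and destination; it does not use Ascend's component API.

```python
import csv
import io

# Hypothetical in-memory stand-ins for an external source and destination.
SOURCE_CSV = "order_id,amount\n1,19.99\n2,5.00\n3,42.50\n"
DESTINATION = []

def read_orders():
    """Read: load rows from a source."""
    return list(csv.DictReader(io.StringIO(SOURCE_CSV)))

def transform_orders(rows):
    """Transform: keep orders of at least 10.00 and convert field types."""
    return [
        {"order_id": int(row["order_id"]), "amount": float(row["amount"])}
        for row in rows
        if float(row["amount"]) >= 10.0
    ]

def write_orders(rows):
    """Write: deliver rows to a destination and report how many were written."""
    DESTINATION.extend(rows)
    return len(rows)

def notify(count):
    """Task: arbitrary code that is not itself a read, transform, or write."""
    return f"wrote {count} rows"

message = notify(write_orders(transform_orders(read_orders())))
print(message)
```

Each function has one responsibility, which is what makes the steps reusable and lets their dependencies be expressed as a graph.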

In addition, there are multiple advanced component types that you can use to extend the functionality of your flows and increase reusability.

Data Plane

A Data Plane is the underlying infrastructure where data storage, processing, and computation occur. It's the execution environment for the component logic of your Flows. Ascend supports various Data Planes (Snowflake, Databricks, BigQuery, etc.), allowing you to choose the best fit for your needs.

tip

We strongly recommend that you set up a separate Data Plane for each of your Development, Staging, and Production environments, and keep these isolated from the Instance Store.

Read Infrastructure Concepts for more information on how Ascend manages infrastructure resources.

Advanced Ascend Resources

Automation

An Automation streamlines your data lifecycle by triggering Flow runs based on events (e.g., Flow completion, schedules) or conditions.
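The event-and-condition idea can be sketched in a few lines of Python. Everything here is hypothetical (the `Event` and `Automation` classes, the event kind names, and the flow names); it only illustrates matching an event against trigger rules and is not Ascend's automation configuration format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str               # hypothetical kinds: "flow_completed", "schedule_tick"
    flow: str = ""          # flow that emitted the event, if any
    succeeded: bool = True

@dataclass(frozen=True)
class Automation:
    name: str
    on_kind: str            # event kind that triggers this automation
    run_flow: str           # flow to run when triggered
    on_flow: str = ""       # restrict to events from this flow; "" matches any
    require_success: bool = True  # condition: only react to successful events

    def matches(self, event: Event) -> bool:
        if event.kind != self.on_kind:
            return False
        if self.on_flow and event.flow != self.on_flow:
            return False
        return event.succeeded or not self.require_success

def flows_to_run(automations, event):
    """Return the flows that should be started in response to one event."""
    return [a.run_flow for a in automations if a.matches(event)]

automations = [
    Automation("nightly", on_kind="schedule_tick", run_flow="ingest"),
    Automation("chain", on_kind="flow_completed", on_flow="ingest", run_flow="reporting"),
]

print(flows_to_run(automations, Event("schedule_tick")))
print(flows_to_run(automations, Event("flow_completed", flow="ingest")))
print(flows_to_run(automations, Event("flow_completed", flow="ingest", succeeded=False)))
```

In this sketch a schedule tick starts the hypothetical `ingest` flow, a successful completion of `ingest` starts `reporting`, and a failed completion starts nothing.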

Profile

A Profile defines runtime parameters and default configurations for your Flows. Profiles enable you to reuse Flows across different Workspaces and Deployments by utilizing customizable parameters for connections, components, and vaults.
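One way to picture how a profile lets the same Flow run in different places is as a layer of parameter overrides on top of defaults. The Python sketch below assumes that model; the parameter names, profile names, and `resolve_parameters` helper are hypothetical, and Ascend's actual resolution rules are not described here.

```python
# Hypothetical default parameters shared by every run of a flow.
DEFAULTS = {"warehouse": "dev_wh", "schema": "analytics", "row_limit": 1000}

# Hypothetical profiles: each lists only the values it overrides.
PROFILES = {
    "workspace_alice": {"schema": "alice_dev"},
    "deployment_prod": {"warehouse": "prod_wh", "row_limit": None},
}

def resolve_parameters(profile_name):
    """Return the defaults overlaid with the named profile's values."""
    return {**DEFAULTS, **PROFILES[profile_name]}

for name in PROFILES:
    print(name, resolve_parameters(name))
```

The flow definition stays the same in both cases; only the resolved parameters differ between the developer's Workspace and the production Deployment.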

Infrastructure Concepts

Infrastructure resources are those that support the execution of your Flows. These resources are managed by Ascend, and can be configured as part of your Project.

Instance

An Instance is a managed resource that is dedicated to you and your team. Your instance can be configured to either run in Ascend Cloud, or run on your own infrastructure for greater control.

Instance Store

An Instance Store is where all of your Ascend metadata and history are stored, including builds, runs, and more. Ascend supports various Instance Stores, including Snowflake, BigQuery, and more.

Git Repository

Ascend supports integration with Git for version control, branching, and collaboration. A single Git repository can have one or more Ascend projects associated with it.

Environment

An Environment isolates resources for different stages of the software development lifecycle (e.g., development, staging, production). Environments provide separate security boundaries and allow you to manage resources specific to each stage.

Workspace

A Workspace is your isolated, containerized development environment within Ascend. It provides an IDE-like interface for building and running Projects and Flows, integrated with version control and CI/CD. A Workspace runs within a single Environment and is bound by the security limitations of that Environment. Workspaces are designed to be used by a single developer at a time: each is configured with a single project and branch at a time, and is the physical resource that executes code interactively.

Deployment

A Deployment is a runtime environment for running Flows in a production-like setting. Deployments support automations and scheduled Flows. Every deployment runs within a single environment, and is bound by the security limitations of that environment.

Vault

A Vault securely stores sensitive information (passwords, tokens, keys) used by your Connections and Components. Each Environment you create has an Ascend-managed Vault associated with it. In addition, Ascend supports external vault implementations, including AWS Secrets Manager, Azure Key Vault, and Google Cloud Secret Manager. You can grant a specific Environment access to your external vault via your cloud provider's IAM roles, which allows the Workspaces and Deployments running within that Environment to access the vault.