Version: 3.0.0

Ascend Essentials

Course Introduction

Welcome to Ascend Essentials—the foundation of the Ascend Certification Program. In this short, practical course, you’ll build a simple, end-to-end data pipeline and get familiar with the key building blocks of Ascend: ingesting, transforming, and delivering data.

You’ll come away with a complete pipeline and a clear understanding of how to accomplish basic data engineering work within Ascend.

What You'll Learn

In this course, you'll gain practical experience with:

  • Core platform concepts - Understand Ascend's architecture and declarative approach
  • End-to-end pipeline building - Create a complete data pipeline from scratch
  • Data connections - Configure connections to various data sources
  • Transformations - Write SQL transformations to manipulate data
  • Plus, preview advanced capabilities including AI, smart data processing, and more!

This course provides the essential skills needed to begin working with Ascend immediately. The concepts and techniques learned here form the foundation for the more advanced courses in the certification program.

Completion Time: 1 hour

Prerequisites

Before we get started, let's make sure you have all the course requirements for your data warehousing solution:

  • Browser: Google Chrome recommended for optimal experience (other modern browsers will work)
  • Email: An email account for registration and pipeline building sections
  • Google Account: Required for OAuth authentication
  • Snowflake Account: For computational processing and data storage

Once you've met these prerequisites, you'll be ready to complete all parts of the course.

What is Ascend?

Ascend is a unified data engineering platform that lets you build, automate, observe, and optimize all your data pipelines. With Ascend, you get an all-in-one solution for your data engineering needs.

At its core, Ascend eliminates manual workflow management by handling pipeline orchestration and scaling automatically. The platform intelligently tracks data lineage, detects changes, and only reprocesses what's necessary — reducing redundant computation and lowering cloud costs.

Unlike traditional ETL tools, Ascend uses a primarily declarative approach: you specify what your data should look like, and the platform determines the most efficient processing path. When needed, you can also use imperative approaches to customize configurations.

Ascend also enables data mesh architecture, moving data seamlessly across domains. By integrating with data warehousing solutions like Snowflake, Databricks, and Google BigQuery, Ascend creates an environment where data teams can collaborate effectively and scale their data architecture efficiently.

Setting Up

Ascend sits on top of your existing data stack. You'll set up the Data Plane for your pipelines on your data warehousing solution.

info

A Data Plane is the cloud environment where your data pipeline workloads run. In Ascend, you need at least one Data Plane but can use multiple to implement data mesh architecture.

By operating on top of your Data Plane, Ascend orchestrates storage and computation resources. For example, when using Snowflake as your Data Plane, any SQL transform in Ascend executes in Snowflake using your configured resources.

To learn more about Data Planes, visit: https://docs.ascend.io/concepts/data-plane

Using your Snowflake account from the Prerequisites section, you'll set up:

  1. Instance store
  2. Project
  3. Deployment
  4. Flow

To begin setup, navigate to the Settings menu by:

  1. Clicking your profile in the top right
  2. Selecting Settings

Follow the Snowflake quickstart.

danger

Pipeline Building: You must complete the quickstart guide to proceed with the pipeline building sections of this course.

Getting Ready to Climb

Let's put theory into practice! In this exercise, you'll be working with an outdoor adventure company, where your guide Otto is prepping for an upcoming expedition. Some routes have been closed unexpectedly, and Otto needs help figuring out which ones to avoid.

image

Say hello to Otto!

While preparing for your expedition, Otto has discovered some route closures in the area. For safety reasons, you need to identify which routes to avoid during your journey.

The Data

Otto, quick on their hooves, has already compiled information about route closures to help you investigate. Here's a sample of the data:

route_closures.csv
Route ID   Start date   End date     Reason
OR-ROC     2026-01-01   2026-12-31   Construction
OHT_ROC    2026-01-01   2026-02-20   All staff will be on vacation
GML-ALP    2026-02-01   2026-02-28   Renovations

Otto suspects these closures might relate to recent weather conditions and wants to investigate possible correlations. Luckily, Otto maintains daily weather data that we can use:

weather.csv
timestamp             location       temperature   precipitation   wind_speed   year   month   day
2025-03-10 00:00:00   New Joseph     86.1789       2.5993          62.8702      2025   3       10
2025-03-10 00:03:00   Nielsenville   84.8883       64.5352         41.4268      2025   3       10
2025-03-10 00:04:30   North Tina     62.7205       91.1594         69.9382      2025   3       10
2025-03-10 00:06:00   West Joytown   0.8077        52.2875         61.6306      2025   3       10
2025-03-10 00:07:30   Crystalville   52.7327       23.0598         0.5536       2025   3       10

The first table, route_closures.csv, shows us each route's ID and closure dates, but as beginner climbers, we can't easily identify these routes by ID alone. On the other hand, weather.csv shows when and where weather events occurred, but doesn't indicate which climbing routes might be affected.

Data Analysis

Our task is to combine these datasets so Otto can analyze potential correlations between weather patterns and route closures. We'll need to:

  1. Ingest both data files:
    • route_closures.csv
    • weather.csv
  2. Join the datasets
  3. Export the combined dataset for Otto's analysis

Creating Connections

To work with our data, we first need to bring both datasets into the same cloud environment. Currently, they exist in separate systems:

  • route_closures.csv is in an Amazon S3 bucket
  • weather.csv is in a Google Cloud Storage (GCS) bucket

To unify them, we'll create a Connection to each source and ingest the data into your Data Plane, so we can work with both datasets in the same place.

info

In Ascend, a Connection establishes a link between the platform and external data sources or destinations such as databases, data lakes, warehouses, or APIs.

Connections provide the configuration details needed to read from or write to these external systems, enabling seamless data movement across different storage locations.

To learn more about Connections, visit: https://docs.ascend.io/concepts/Connection

Build Activity

Let's start by creating a Connection to S3. First, let's familiarize ourselves with the interface:

image

Figure 1: The Workspace UI with the Build activity panel open

We're currently in the Build activity panel (see box 1 in the image above), which is the default view when entering a Workspace. There are four key areas to note:

  1. Build activity: Provides a high-level overview of your workspace components, including flows, automations, and Connections.
  2. Files activity: Displays a file system view of all workspace files, offering a more detailed view of your project structure.
  3. Source Control activity: Shows a historical log of changes recorded in your branch's source control.
  4. Connections: Lists all available Connections that your workspace can use to interact with external systems.
tip

The Build activity panel is your primary tool for running and viewing flows. After building a project, this panel displays all flows with summary information about successful builds. Use it when navigating through flows or running and testing pipelines.

Click the + sign next to the Connections section (box 4 in the image above) to open the form for creating a new Connection.

Create a Connection using Forms

You should now see a form like this:

image

Figure 2: A new Connection form

tip

Forms provide a user-friendly way to configure components without manually formatting YAML or Python files. They offer quick configuration with standardized fields, although with less customization than direct code editing.

Let's complete the form:

  1. Enter a name for the Connection: s3_public_read. This name will be used to reference the Connection throughout your project.

  2. Add a description by clicking Add a Connection description and entering: Reading in public route closure data.

  3. Configure the Connection to point to the S3 bucket containing our route closure data:

    • Connection Type: Select S3 from the dropdown menu.
    • Root: Enter s3://ascend-io-sample-data-read/ to specify which bucket to access.

Your completed form should look like this:

image

Figure 3: The completed S3 Connection form

Before proceeding, click Test Read Connection at the bottom of the form to verify that your Connection works properly. A successful test will display:

image

Figure 4: A successful S3 Connection test

warning

If your test fails, verify that you've entered the Connection Name correctly and selected the proper Connection Type and Root. For this particular bucket, the other fields (Region, Access Key ID, etc.) aren't required. Failed tests will prevent you from successfully ingesting data later.

🎉 Congratulations! You've created your first Connection.

Note that we've only established a Connection to S3, not to GCS, and we haven't yet ingested any data—we've simply created the capability to do so. Let's continue building our pipeline. ⬇️

Files Activity

In the previous section, we created an S3 Connection using the Build activity panel and forms. Now, let's create a GCS Connection using a more direct code approach.

Navigate to the Files activity by clicking the Files tab at the top of the left sidebar (see figure 1, box 2). You'll see a view like this:

image

Figure 5: The Workspace UI with the Files activity open

tip

Like the Build activity panel, the Files activity panel shows all flows in your project, but in greater detail. This panel allows you to edit, manage, and organize files more efficiently, making it the preferred interface during active development.

Now that we're in the Files activity (note that "Files" is highlighted instead of "Build" in the top left), let's identify the key elements:

  1. New Folder: Creates a new folder within your currently selected directory (Box 1)
  2. New File: Creates a new file within your currently selected directory (Box 2)
  3. Sync Files: Refreshes the panel to show the latest state of files in your source control repository (Box 3)
  4. Connections: Displays all Connection files available to your workspace (Box 4)
    • Note that Connections is highlighted, indicating it's your current working directory

Click on the Connections folder to expand it (if it's not yet expanded). You'll know the folder is selected when it's highlighted.

Create a Connection using Code

Click New File (see figure 5, box 2). When a prompt appears asking for a file name, enter gcs_public_read.yaml, following the same naming convention as our S3 Connection. You should see:

image

warning

When creating a file directly, you must include the .yaml extension in the filename. This differs from forms, where the extension is added automatically.

Press Enter or click elsewhere on the screen to create the file. The Code Editor will open automatically:

image

Figure 6: The Code Editor view for gcs_public_read.yaml

Notice in the top left that you're editing the file you just created. Since it's currently empty, copy and paste the following YAML code into the editor:

connection:
  gcs:
    root: gs://ascend-io-sample-data/

After pasting, your editor should look like this:

image

info

Ascend is a Flex-code platform, supporting both low-code and code-first approaches. While forms provide an accessible interface, coding gives you maximum flexibility and control over component configuration.

Click the Save button in the top right corner (or use the Cmd/Ctrl + S keyboard shortcut). You've now configured your second Connection!

Let's test this Connection as well, just like we did with the S3 one. Click the Configure tab in the top right to access the form view of your Connection, which has been populated automatically from your code. Click Test Read Connection and verify that all tests pass.

tip

Regardless of how you initially create a component — through forms or code — you can always use both the no-code and code-first views later.

For example, you can edit your S3 Connection using the Code Editor, even though you created it with a form.
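For instance, if you open s3_public_read in the Code Editor, you should find YAML following the same pattern as the gcs_public_read file above. Here's a minimal sketch of what it would look like, assuming the s3 connection block mirrors the gcs one:

connection:
  s3:
    root: s3://ascend-io-sample-data-read/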

Ingesting Data

Now that our Connections are established, we're ready to bring the data into our Data Plane for analysis. While Connections enable access to external systems, they don't automatically import data. To ingest, transform, and export data, we need to create Components within our pipeline.

info

In Ascend, Components are the fundamental building blocks of data pipelines that define the sequence and logic of data flow. Depending on their type, Components are configured using either YAML, Python, or SQL files. These Components handle data ingestion, transformation, and output throughout the Ascend platform.

To learn more about Components and their types, visit: https://docs.ascend.io/concepts/component

Let's set up our flow structure to organize our Components.

Create Flow Structure

Picking up where we left off in the Files activity, let's shift focus from the Connections folder to the Flows folder. A typical flow structure includes:

image

Figure 7: The file view of a Flow

  1. Flows folder: The top-level container for all files related to the flow
  2. Components folder: Container for all component files used in the flow
  3. Components: Individual files that each correspond to a single Component
  4. Flow definition file: Defines essential flow properties including name, version number, and runtime parameters (required for every flow)

Now, let's create our flow structure:

  1. Navigate back to the Build activity, the original view we began with, using the left sidebar.

  2. Click the + symbol to the right of Flows. This will pop open the New Flow Form.

  3. The only field we need to fill out for this flow is the flow name. We'll call this flow essentials. The other fields in the form will keep their default values.

  4. Save the new flow by clicking the Save button at the bottom of the form.

  5. After saving, this should open the Flow Definition file for our newly created essentials flow. When the code editor opens, replace the provided code with the following code:

flow:
  name: essentials
  version: null

tip

Note that we're providing a name for the flow in the code. This internal name is how Ascend will reference the flow in its operations. While the folder name and definition file name are for user organization, this internal name is what matters to the system. In practice, these names are usually kept identical for clarity.

Hit Save, then close the code editor to finish creating our essentials flow. If you take a look at the Build activity now, you'll see a new flow, essentials, with 0 components (which makes sense, since we just created it). Navigate back to the Files view and expand the Flows directory, and you'll also see a new folder called essentials. When expanded, you should now have a structure like this:

image

Figure 8: The basic structure of a Flow

Create an S3 Read Component

With our flow structure in place, we can create our first Read Component to ingest data from S3.

info

A Read Component in Ascend imports data from external sources into your data plane. It intelligently handles schema changes, tracks data updates, and efficiently ingests only new or modified records to minimize processing overhead. This ensures that downstream operations always work with current data without manual intervention.

To learn more about Read Components, visit: https://docs.ascend.io/reference/resource/component/read/

Right-click on the components folder and select Create New File. Name it s3_read.yaml and paste the following code into the editor:

component:
  name: route_closures
  read:
    connection: s3_public_read
    s3:
      path: /essentials/
      include:
        - glob: 'route_closures.csv'
      parser: auto

Save the file after pasting the code. Let's examine each part of this configuration:

  1. name: Defines the component's identifier. It is used to reference this component in downstream components
  2. read: Specifies that this is a Read Component (as opposed to transform, write, or other component types)
  3. Connection: References the specific Connection to use (in this case, the S3 Connection we just created)
tip

Specifying the Connection becomes particularly important as you add more Connections of the same type (e.g., multiple S3 Connections) to your Ascend instance.

  4. s3: Indicates the connection type, which determines what parameters Ascend expects
  5. path: Specifies which directory to search for files, relative to the Connection's root. The full path here would be s3://ascend-io-sample-data-read/essentials/

tip

Using path in combination with glob patterns helps narrow your search scope to specific directories, making it easier to target particular files or tables that might share naming conventions with files in other locations.

  6. include: Tells Ascend to ingest only the specified files (as opposed to exclude, which would ingest everything except the specified files)
  7. glob: Defines the pattern for matching files, in this case, route_closures.csv
  8. parser: Determines how files are parsed; 'auto' automatically selects the appropriate parser based on file type, but you can explicitly specify formats like CSV, JSON, etc.

tip

For more details on these configuration options, refer to our reference documentation: https://docs.ascend.io/reference/resource/

For S3 Read Components specifically, see: https://docs.ascend.io/reference/resource/component/read/read-s3

You just created your first Read Component! In simple terms, this component will ingest the route_closures.csv file from the essentials folder in the ascend-io-sample-data-read bucket.
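Purely for reference (not needed for this course), the include list accepts glob patterns, so a component that ingested every CSV in the same folder might look roughly like the hypothetical variant below (the component name all_essentials_files is made up for illustration):

component:
  name: all_essentials_files
  read:
    connection: s3_public_read
    s3:
      path: /essentials/
      include:
        - glob: '*.csv'
      parser: auto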

Create a GCS Read Component

Now, we want to create a similar component to ingest data from GCS. Right-click on the Components folder again and select Create a New File. Name it gcs_read.yaml and paste the following code:

component:
  name: weather
  read:
    connection: gcs_public_read
    gcs:
      path: /essentials/
      include:
        - glob: 'weather.csv'
      parser: auto

Although this configuration is similar to our S3 component, note these key differences:

  1. Connection: References our gcs_public_read Connection instead of s3_public_read
  2. GCS vs. S3: Specifies that this is a GCS type component rather than S3
warning

It's essential to use the correct Connection type and parameters. Using an S3 configuration with a GCS Connection would cause errors since the systems expect different parameters.

Save this file.

🎉 Congratulations – you've successfully set up components to ingest data from both S3 and GCS within the same flow!

Transforming Data

With our data now imported into the Data Plane, we can transform it for our analysis. Transformations in Ascend can range from simple joins to complex data preparation for machine learning models. All these operations are handled through Transform Components.

info

In Ascend, a Transform Component processes and manipulates data within a flow. You can use SQL or Python to apply business logic, cleanse, reshape, and aggregate your data. Transforms intelligently track dependencies and only recompute when upstream data changes, ensuring both efficiency and accuracy. The platform offers various implementation strategies to accommodate different transformation requirements.

To learn more about Transform Components, visit: https://docs.ascend.io/concepts/transform

Our current task is straightforward: we need to combine our two datasets to identify potential correlations between route closures and weather conditions.

To create this Transform, right-click on the Components folder and create a new file named basic_join.sql.jinja.

warning

Make sure to include the complete .sql.jinja extension. Using just .sql or .jinja will prevent the system from properly recognizing the file.

For this task, we can leverage Ascend's AI assistant, Otto. Look for the assistant icon in the top header bar:

image

Figure 9: Otto, the helpful AI Assistant

Click the icon to open the chat sidebar. Otto can help generate SQL code based on your requirements. Try entering the following prompt:

Hey Otto! For the basic_join.sql.jinja transform component, I'd like to join the route_closures Read Component and the weather Read Component on the id column. I want this to be an inner join. Can you write the code and add it to the file please?

After sending your message, Otto should respond with SQL code similar to:

SELECT
    rc.*,
    w.*
FROM {{ ref('route_closures') }} AS rc
JOIN {{ ref('weather') }} AS w ON rc.id = w.id

warning

The exact code might vary slightly. Review Otto's response to check that it meets your requirements; if it doesn't, feel free to use the code provided above instead.

Copy the SQL code into your basic_join.sql.jinja file and click Save.
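One thing to note: because rc.* and w.* both include the join key, the combined table will contain two id columns. If Otto prefers a tidier output, an illustrative variant (assuming the ingested column names match the CSV headers shown earlier) could list the weather columns explicitly:

SELECT
    rc.*,
    w.location,
    w.temperature,
    w.precipitation,
    w.wind_speed
FROM {{ ref('route_closures') }} AS rc
JOIN {{ ref('weather') }} AS w ON rc.id = w.id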

🎉 Congratulations! You've created your first Transform Component and established the foundation of your ETL pipeline in Ascend.

Writing Data Out

Our final step is to export the joined dataset to an external system where Otto can perform further analysis. In Ascend, this is accomplished using a Write Component.

info

A Write Component exports processed data from your flow to external systems or storage locations. These components support various destinations including cloud storage, databases, and data warehouses, ensuring that downstream systems receive your transformed data reliably and in the appropriate format.

To learn more about Write Components, visit: https://docs.ascend.io/concepts/write

Right-click on the Components folder, select Create a New File, and name it snowflake_write.yaml.

tip

Both Read and Write Components (referred to as Connection type components) use YAML configuration files. This consistent approach makes them straightforward to create and maintain.

Paste the following YAML configuration to set up a Write Component that exports our data to a Snowflake table:

component:
  name: snowflake_write
  write:
    connection: snowflake_data_plane
    snowflake:
      table:
        name: bad_routes
        schema: ascend_data

Let's highlight the key differences from our Read Components:

  1. Write vs. read: Line 3 specifies that this is a Write Component rather than a Read Component
  2. Connection: We're using our Data Plane's Snowflake Connection to write data to a table named bad_routes

Save your file, and you've successfully created a Write Component.

🎉 Congratulations – you've completed the development of your first end-to-end pipeline in Ascend!

Running a Flow

We've created all the component files for our flow, but we still need to execute it and see it in action. Let's return to the Build Summary Panel by clicking the Build tab in the top left of the interface.

image

Figure 10: The Build Summary Panel

You'll see a button labeled Run Flow. Clicking this button performs two actions:

  1. Build the project: This formalizes all changes made since your last build and updates the UI to reflect these changes.
  2. Run the pipeline: This executes your flow, allowing you to observe each component as it runs and completes in the UI.
info

In Ascend, a Flow Run represents a single execution of your flow. Each run processes data according to your defined components and their dependencies.

To learn more about building and running flows, visit: https://docs.ascend.io/concepts/flow-run
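Once the run completes successfully, the joined dataset lands in the bad_routes table configured in your Write Component. As a quick sanity check, Otto could query it directly in Snowflake; the snippet below is illustrative only, and the exact database/schema qualification depends on how your Snowflake Data Plane is set up:

-- Peek at the exported results (adjust qualification for your account)
SELECT *
FROM ascend_data.bad_routes
LIMIT 10;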

Course Wrap-Up 🎉

You’ve just built and run your first complete pipeline in Ascend—from connecting to external data sources to transforming and exporting your results. Along the way, you learned how to:

  • Create and manage Connections to external systems like S3 and GCS
  • Ingest data using Read Components
  • Write SQL transforms with help from Otto, Ascend’s AI assistant
  • Export data using Write Components
  • Structure and run a Flow in your Workspace

These foundational skills will help you confidently build more advanced pipelines—and set you up for the next stage of the Ascend Certification Program.