Ascend Essentials
Course Introduction
Welcome to Ascend Essentials—the foundation of the Ascend Certification Program. In this short, practical course, you’ll build a simple, end-to-end data pipeline and get familiar with the key building blocks of Ascend: ingesting, transforming, and delivering data.
You’ll come away with a complete pipeline and a clear understanding of how to accomplish basic data engineering work within Ascend.
What You'll Learn
In this course, you'll gain practical experience with:
- Core platform concepts - Understand Ascend's architecture and declarative approach
- End-to-end pipeline building - Create a complete data pipeline from scratch
- Data connections - Configure connections to various data sources
- Transformations - Write SQL transformations to manipulate data
- Plus, preview advanced capabilities including AI, smart data processing, and more!
This course provides the essential skills needed to begin working with Ascend immediately. The concepts and techniques learned here form the foundation for the more advanced courses in the certification program.
Completion Time: 1 hour
Prerequisites
Before we get started, let's make sure you have all the course requirements based on your data warehousing solution:
- Snowflake
- Browser: Google Chrome recommended for optimal experience (other modern browsers will work)
- Email: An email account for registration and pipeline building sections
- Google Account: Required for OAuth authentication
- Snowflake Account: For computational processing and data storage
Once you've met these prerequisites, you'll be ready to complete all parts of the course.
What is Ascend?
Ascend is a unified data engineering platform that lets you build, automate, observe, and optimize all your data pipelines. With Ascend, you get an all-in-one solution for your data engineering needs.
At its core, Ascend eliminates manual workflow management by handling pipeline orchestration and scaling automatically. The platform intelligently tracks data lineage, detects changes, and only reprocesses what's necessary — reducing redundant computation and lowering cloud costs.
Unlike traditional ETL tools, Ascend uses a primarily declarative approach: you specify what your data should look like, and the platform determines the most efficient processing path. When needed, you can also use imperative approaches to customize configurations.
Ascend also enables data mesh architecture, moving data seamlessly across domains. By integrating with data warehousing solutions like Snowflake, Databricks, and Google BigQuery, Ascend creates an environment where data teams can collaborate effectively and scale their data architecture efficiently.
Setting Up
Ascend sits on top of your existing data stack. Select your data warehousing solution below to set up the Data Plane for your pipelines.
A Data Plane is the cloud environment where your data pipeline workloads run. In Ascend, you need at least one Data Plane but can use multiple to implement data mesh architecture.
By operating on top of your Data Plane, Ascend orchestrates storage and computation resources. For example, when using Snowflake as your Data Plane, any SQL transform in Ascend executes in Snowflake using your configured resources.
To learn more about Data Planes, visit: https://docs.ascend.io/concepts/data-plane
- Snowflake
Using your Snowflake account from the Prerequisites section, you'll set up:
- Instance store
- Project
- Deployment
- Flow
To begin setup, navigate to the Settings menu by:
- Clicking your profile in the top right
- Selecting Settings
Follow the Snowflake quickstart.
Pipeline Building: You must complete the quickstart guide to proceed with the pipeline building sections of this course.
Getting Ready to Climb
Let's put theory into practice! In this exercise, you'll be working with an outdoor adventure company, where your guide Otto is prepping for an upcoming expedition. Otto has discovered that some routes in the area have closed unexpectedly, and for safety reasons you need to identify which routes to avoid during your journey.
The Data
Otto, quick on their hooves, has already compiled information about route closures to help you investigate. Here's a sample of the data:
| Route ID | Start date | End date | Reason |
|---|---|---|---|
| OR-ROC | 2026-01-01 | 2026-12-31 | Construction |
| OHT_ROC | 2026-01-01 | 2026-02-20 | All staff will be on vacation |
| GML-ALP | 2026-02-01 | 2026-02-28 | Renovations |
Otto suspects these closures might relate to recent weather conditions and wants to investigate possible correlations. Luckily, Otto maintains daily weather data that we can use:
| timestamp | location | temperature | precipitation | wind_speed | year | month | day |
|---|---|---|---|---|---|---|---|
| 2025-03-10 00:00:00 | New Joseph | 86.1789 | 2.5993 | 62.8702 | 2025 | 3 | 10 |
| 2025-03-10 00:03:00 | Nielsenville | 84.8883 | 64.5352 | 41.4268 | 2025 | 3 | 10 |
| 2025-03-10 00:04:30 | North Tina | 62.7205 | 91.1594 | 69.9382 | 2025 | 3 | 10 |
| 2025-03-10 00:06:00 | West Joytown | 0.8077 | 52.2875 | 61.6306 | 2025 | 3 | 10 |
| 2025-03-10 00:07:30 | Crystalville | 52.7327 | 23.0598 | 0.5536 | 2025 | 3 | 10 |
The first table, route_closures.csv, shows us each route's ID and closure dates, but as beginner climbers, we can't easily identify these routes by ID alone. On the other hand, weather.csv shows when and where weather events occurred, but doesn't indicate which climbing routes might be affected.
Data Analysis
Our task is to combine these datasets so Otto can analyze potential correlations between weather patterns and route closures. We'll need to:
- Ingest both data files:
- route_closures.csv
- weather.csv
- Join the datasets
- Export the combined dataset for Otto's analysis
Creating Connections
To work with our data, we first need to bring both datasets into the same cloud environment. Currently, they exist in separate systems:
- route_closures.csv is in an Amazon S3 bucket
- weather.csv is in a Google Cloud Storage (GCS) bucket
To unify them, we'll create a Connection to each source and ingest the data into your Data Plane, so we can work with both datasets in the same place.
In Ascend, a Connection establishes a link between the platform and external data sources or destinations such as databases, data lakes, warehouses, or APIs.
Connections provide the configuration details needed to read from or write to these external systems, enabling seamless data movement across different storage locations.
To learn more about Connections, visit: https://docs.ascend.io/concepts/Connection
Build Activity
Let's start by creating a Connection to S3. First, let's familiarize ourselves with the interface:
We're currently in the Build activity panel (see box 1 in the image above), which is the default view when entering a Workspace. There are four key areas to note:
- Build activity: Provides a high-level overview of your workspace components, including flows, automations, and Connections.
- Files activity: Displays a file system view of all workspace files, offering a more detailed view of your project structure.
- Source Control activity: Shows a historical log of changes recorded in your branch's source control.
- Connections: Lists all available Connections that your workspace can use to interact with external systems.
The Build activity panel is your primary tool for running and viewing flows. After building a project, this panel displays all flows with summary information about successful builds. Use it when navigating through flows or running and testing pipelines.
Click the + sign next to the Connections section (box 4 in the image above) to open the form for creating a new Connection.
Create a Connection using Forms
You should now see a form like this:
Forms provide a user-friendly way to configure components without manually formatting YAML or Python files. They offer quick configuration with standardized fields, although with less customization than direct code editing.
Let's complete the form:
- Enter a name for the Connection: `s3_public_read`. This name will be used to reference the Connection throughout your project.
- Add a description by clicking Add a Connection description and entering: `Reading in public route closure data`.
- Configure the Connection to point to the S3 bucket containing our route closure data:
  - Connection Type: Select S3 from the dropdown menu.
  - Root: Enter `s3://ascend-io-sample-data-read/` to specify which bucket to access.
Your completed form should look like this:
Before proceeding, click Test Read Connection at the bottom of the form to verify that your Connection works properly. A successful test will display:
If your test fails, verify that you've entered the Connection Name correctly and selected the proper Connection Type and Root. For this particular bucket, the other fields (Region, Access Key ID, etc.) aren't required. Failed tests will prevent you from successfully ingesting data later.
🎉 Congratulations! You've created your first Connection.
Note that we've only established a Connection to S3, not to GCS, and we haven't yet ingested any data—we've simply created the capability to do so. Let's continue building our pipeline. ⬇️
Files Activity
In the previous section, we created an S3 Connection using the Build activity panel and forms. Now, let's create a GCS Connection using a more direct code approach.
Navigate to the Files activity by clicking the Files tab at the top of the left sidebar (see figure 1, box 2). You'll see a view like this:
Like the Build activity panel, the Files activity panel shows all flows in your project, but in greater detail. This panel allows you to edit, manage, and organize files more efficiently, making it the preferred interface during active development.
Now that we're in the Files activity (note that "Files" is highlighted instead of "Build" in the top left), let's identify the key elements:
- New Folder: Creates a new folder within your currently selected directory (Box 1)
- New File: Creates a new file within your currently selected directory (Box 2)
- Sync Files: Refreshes the panel to show the latest state of files in your source control repository (Box 3)
- Connections: Displays all Connection files available to your workspace (Box 4)
- Note that Connections is highlighted, indicating it's your current working directory
Click on the Connections folder to expand it (if it's not yet expanded). You'll know the folder is selected when it's highlighted.
Create a Connection using Code
Click New File (see figure 5, box 2). When a prompt appears asking for a file name, enter `gcs_public_read.yaml`, following the same naming convention as our S3 Connection. You should see:
When creating a file directly, you must include the `.yaml` extension in the filename. This differs from forms, where the extension is added automatically.
Press Enter or click elsewhere on the screen to create the file. The Code Editor will open automatically:
Notice in the top left that you're editing the file you just created. Since it's currently empty, copy and paste the following YAML code into the editor:
```yaml
connection:
  gcs:
    root: gs://ascend-io-sample-data/
```
After pasting, your editor should look like this:
Ascend is a Flex-code platform, supporting both low-code and code-first approaches. While forms provide an accessible interface, coding gives you maximum flexibility and control over component configuration.
Click the Save button in the top right corner (or use the `Cmd/Ctrl + S` keyboard shortcut). You've now configured your second Connection!
Let's test this Connection as well, just as we did with the S3 Connection. Click the Configure tab in the top right to access the form view of your Connection, which has been automatically populated based on your code. Click Test Read Connection and verify that all tests pass.
Regardless of how you initially create a component — through forms or code — you can always use both the no-code and code-first views later.
For example, you can edit your S3 Connection using the Code Editor, even though you created it with a form.
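If you open it in the Code Editor, the generated YAML should look roughly like the sketch below. This is an assumption based on the GCS example above, so treat the file Ascend actually generated as the source of truth:

```yaml
# Hypothetical sketch of the form-generated S3 Connection (mirrors the GCS example above)
connection:
  s3:
    root: s3://ascend-io-sample-data-read/
```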
Ingesting Data
Now that our Connections are established, we're ready to bring the data into our Data Plane for analysis. While Connections enable access to external systems, they don't automatically import data. To ingest, transform, and export data, we need to create Components within our pipeline.
In Ascend, Components are the fundamental building blocks of data pipelines that define the sequence and logic of data flow. Depending on their type, Components are configured using either YAML, Python, or SQL files. These Components handle data ingestion, transformation, and output throughout the Ascend platform.
To learn more about Components and their types, visit: https://docs.ascend.io/concepts/component
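As a rough preview of the shape a Component definition can take, here is a hypothetical YAML sketch of a read Component that would ingest route_closures.csv through the s3_public_read Connection. The field names below are illustrative assumptions rather than the exact Ascend schema; refer to the Components documentation linked above for the real structure:

```yaml
# Illustrative sketch only: field names are hypothetical, not the official Ascend schema
component:
  read:
    connection: s3_public_read   # the Connection created earlier in this course
    path: route_closures.csv     # hypothetical field: the file to ingest from the bucket
```

We'll configure the real ingest Components step by step in the upcoming sections.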
Let's set up our flow structure to organize our Components.