
DuckDB Data Plane with DuckLake Catalog

This guide shows you how to set up DuckDB in Ascend using DuckLake, an open lakehouse format that stores data in Parquet files and manages all metadata in a SQL database.

Preview feature

We're still working on this one! Expect changes and note that standard SLAs don't apply, so don't rely on this for production workloads yet. Refer to our release stages to learn more.

Prerequisites

Overview

In Ascend, DuckLake runs ephemeral DuckDB processing directly on Flow runners while all data and metadata are stored remotely. This gives you the benefits of in-process compute with reliable, centralized data and metadata management. The setup relies on three Connections:

  1. PostgreSQL Connection → Stores metadata and schema information
  2. Object store Connection → Stores actual data files (Parquet format)
  3. DuckDB Connection with DuckLake catalog → Coordinates processing between metadata and data

This guide assumes your PostgreSQL and object store Connections are already configured. Name them consistently:

  • PostgreSQL Connection: data_plane_ducklake_metadata.yaml (or similar)
  • Object store Connection: data_plane_ducklake_data_gcs.yaml (or similar; adjust suffix for your provider)

Configure Project and Profile parameters

DuckDB and DuckLake require specific parameters to coordinate between your data and metadata connections.

Project configuration

Add these parameters to your Ascend Project file:

ascend_project.yaml
...
parameters:
  data_planes:
    ducklake:
      data_connection_name: data_plane_ducklake_data_gcs
      metadata_connection_name: data_plane_ducklake_metadata
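
If your object store is not GCS, only the data Connection name should need to change here. The fragment below is a minimal sketch, assuming a hypothetical S3-backed Connection file named data_plane_ducklake_data_s3.yaml. That name is an illustration, not something this guide defines.

```yaml
# ascend_project.yaml (other Project settings omitted)
parameters:
  data_planes:
    ducklake:
      # Hypothetical name of an S3-backed object store Connection
      data_connection_name: data_plane_ducklake_data_s3
      # Unchanged: the PostgreSQL metadata Connection used throughout this guide
      metadata_connection_name: data_plane_ducklake_metadata
```

Each value is the Connection file name without the .yaml extension, matching the naming used in the rest of this guide.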

Profile configuration

Each user's Profile configuration should specify a dedicated metadata_schema:

profiles/workspace_otto.yaml
profile:
  parameters:
    ducklake:
      $<: $parameters.data_planes.ducklake
      metadata_schema: workspace_otto
  ...
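
To illustrate what "dedicated" means in practice, here is a sketch of a second user's Profile. The user name alice and the file name profiles/workspace_alice.yaml are hypothetical. The sketch also assumes that `$<:` pulls in the keys defined under data_planes.ducklake in the Project file, so that only metadata_schema differs between Profiles.

```yaml
# profiles/workspace_alice.yaml (hypothetical second user)
profile:
  parameters:
    ducklake:
      # Assumed to bring in data_connection_name and metadata_connection_name
      # from the Project-level parameters
      $<: $parameters.data_planes.ducklake
      # A schema name that no other Profile uses
      metadata_schema: workspace_alice
```

With one schema per Profile, each workspace keeps its DuckLake metadata separate while sharing the same PostgreSQL and object store Connections.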

Create the DuckDB Connection with a DuckLake catalog

Now that your Project and Profile parameters are configured, create the DuckDB Connection that integrates a DuckLake catalog:

Create a DuckDB Connection

From your workspace Super Graph view, follow these steps:

  1. Create a Connection by either:
    • Clicking the + button next to Connections in the left Build panel
    • Right-clicking in the Super Graph and selecting Create Connection
  2. Enter a descriptive name like `data_plane_ducklake`
  3. Select DuckDB from the available options
  4. Fill in the required fields (and any optional fields as needed)
  5. Click Save at the bottom to create your Connection

For complete configuration options, see our Connection reference guide.

Your DuckDB Connection file should look like this:

connections/data_plane_ducklake.yaml
connection:
  duckdb:
    ducklake:
      data_connection_name: ${parameters.data_planes.ducklake.data_connection_name}
      metadata_connection_name: ${parameters.data_planes.ducklake.metadata_connection_name}

This Connection configuration references the parameters you defined earlier and automatically coordinates between your PostgreSQL metadata store and object storage.
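
For reference, this is a sketch of what the Connection would look like once the `${...}` references are substituted with the Project parameter values shown earlier in this guide. It assumes plain value substitution and is shown only to make the indirection concrete. Keep the parameterized version in your actual file.

```yaml
# connections/data_plane_ducklake.yaml, with parameter references resolved
connection:
  duckdb:
    ducklake:
      # From parameters.data_planes.ducklake.data_connection_name
      data_connection_name: data_plane_ducklake_data_gcs
      # From parameters.data_planes.ducklake.metadata_connection_name
      metadata_connection_name: data_plane_ducklake_metadata
```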

Verify your setup

Test all three Connections to verify they're configured correctly:

  1. PostgreSQL Connection (data_plane_ducklake_metadata.yaml)
  2. Object store Connection (e.g. data_plane_ducklake_data_gcs.yaml)
  3. DuckLake catalog Connection (data_plane_ducklake.yaml)

🎉 Congratulations, you just set up a DuckDB Connection with a DuckLake catalog in Ascend!