Self-hosted DuckLake
Set up DuckDB in Ascend with your own DuckLake infrastructure. You manage the PostgreSQL instance and object storage while Ascend coordinates the processing.
Prerequisites
- An Ascend Project
- An Ascend Workspace
- A PostgreSQL Connection to a PostgreSQL instance
- A Connection to your preferred cloud object storage platform
Overview
In Ascend, DuckLake runs ephemeral DuckDB processing directly on Flow runners while all data and metadata are stored remotely. You get the benefits of in-process compute together with reliable, centralized data and metadata management. The setup involves three Connections:
- PostgreSQL Connection → Stores metadata and schema information
- Object store Connection → Stores actual data files (Parquet format)
- DuckDB Connection with DuckLake catalog → Coordinates processing between metadata and data
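For orientation, the same split between metadata and data appears in DuckLake's own `ATTACH` statement when DuckDB is used outside Ascend. The sketch below only builds that statement as a string. The helper name `ducklake_attach_sql`, the database name, the host, and the bucket path are all hypothetical. Ascend derives the equivalent settings from your Connections, so you would not normally write this yourself.

```python
def ducklake_attach_sql(pg_dbname: str, pg_host: str, data_path: str, alias: str = "lake") -> str:
    """Build a DuckLake ATTACH statement (hypothetical helper, for illustration only).

    Metadata lives in the PostgreSQL database; Parquet data files live under data_path.
    Values are interpolated without escaping, so pass trusted inputs only.
    """
    return (
        f"ATTACH 'ducklake:postgres:dbname={pg_dbname} host={pg_host}' "
        f"AS {alias} (DATA_PATH '{data_path}');"
    )


# Placeholder values; substitute your own database, host, and bucket.
print(ducklake_attach_sql("ducklake_catalog", "localhost", "gs://example-bucket/ducklake/"))
```

The PostgreSQL part of the statement corresponds to the metadata Connection, and `DATA_PATH` corresponds to the object store Connection.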
This guide assumes your PostgreSQL and object store Connections are already configured. Name them consistently:
- PostgreSQL Connection: data_plane_ducklake_metadata.yaml (or similar)
- Object store Connection: data_plane_ducklake_data_gcs.yaml (or similar; adjust suffix for your provider)
Configure Project parameters
DuckDB and DuckLake require specific parameters to coordinate between your data and metadata Connections. Add these parameters to your Ascend Project file:
...
parameters:
  data_planes:
    ducklake:
      data_connection_name: data_plane_ducklake_data_gcs
      metadata_connection_name: data_plane_ducklake_metadata
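A mismatch between these parameter values and your Connection file names is an easy mistake to make. The sketch below is a hypothetical local check, not part of Ascend. It assumes that a Connection's name is its file name without the `.yaml` extension, as the examples in this guide suggest. The function name `unmatched_connection_names` is made up for illustration.

```python
from pathlib import PurePath


def unmatched_connection_names(parameters: dict, connection_files: list[str]) -> list[str]:
    """Return DuckLake parameter values that have no matching Connection file.

    Assumption: Connection name == file name without its extension.
    """
    stems = {PurePath(name).stem for name in connection_files}
    ducklake = parameters["data_planes"]["ducklake"]
    return [value for value in ducklake.values() if value not in stems]


parameters = {
    "data_planes": {
        "ducklake": {
            "data_connection_name": "data_plane_ducklake_data_gcs",
            "metadata_connection_name": "data_plane_ducklake_metadata",
        }
    }
}

# An empty list means every referenced Connection has a matching file.
print(unmatched_connection_names(
    parameters,
    ["data_plane_ducklake_metadata.yaml", "data_plane_ducklake_data_gcs.yaml"],
))
```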
Now that your Project parameters are configured, the next step is to create the DuckDB Connection that integrates a DuckLake catalog.
Create a DuckDB Connection
From your Workspace Super Graph view, create the Connection using either the form or the Files panel.

Using the form:
- Create a Connection by either:
  - Clicking the + button next to Connections in the left Build panel
  - Right-clicking in the Super Graph and selecting Create Connection
- Enter a descriptive name like data_plane_duckdb
- Select DuckDB from the available options
- Fill in the required fields (and any optional fields as needed)
- Click Save at the bottom to create your Connection

Using the Files panel:
- Open the Files panel in the top left
- Right-click the Connections directory and select New File
- Give your file a descriptive name like data_plane_duckdb.yaml
For complete configuration options, see our Connection reference guide.
Your DuckDB Connection file should look like this:
connection:
  duckdb:
    ducklake:
      data_connection_name: ${parameters.data_planes.ducklake.data_connection_name}
      metadata_connection_name: ${parameters.data_planes.ducklake.metadata_connection_name}
This Connection configuration references the parameters you defined earlier and automatically coordinates between your PostgreSQL metadata store and object storage.
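To make the parameter references concrete, here is a minimal sketch of how a `${...}` placeholder can be resolved against a nested parameters mapping. It illustrates the idea only and is not Ascend's actual implementation. The `resolve` function and the `PLACEHOLDER` pattern are hypothetical names.

```python
import re

# Matches ${dotted.path} placeholders (hypothetical pattern, for illustration only).
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(text: str, context: dict) -> str:
    """Replace each ${a.b.c} in text with context["a"]["b"]["c"]."""
    def lookup(match: re.Match) -> str:
        value = context
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError if the path is not defined
        return str(value)

    return PLACEHOLDER.sub(lookup, text)


context = {
    "parameters": {
        "data_planes": {
            "ducklake": {
                "data_connection_name": "data_plane_ducklake_data_gcs",
                "metadata_connection_name": "data_plane_ducklake_metadata",
            }
        }
    }
}

print(resolve("${parameters.data_planes.ducklake.data_connection_name}", context))
# prints: data_plane_ducklake_data_gcs
```

Because the Connection file contains only references, switching to a different object store or metadata Connection means editing the Project parameters, not the Connection file.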
Verify your setup
Test all three Connections to verify they're configured correctly:
- PostgreSQL Connection (data_plane_ducklake_metadata.yaml)
- Object store Connection (e.g. data_plane_ducklake_data_gcs.yaml)
- DuckDB Connection with the DuckLake catalog (data_plane_duckdb.yaml)
Optimize performance with concurrency
For optimal performance with DuckLake, configure concurrency settings to balance throughput with reliability.
🎉 Congratulations, you just set up a DuckDB Connection with a DuckLake catalog in Ascend!
Now that your DuckDB Connection is set up:
- 💡 Read data - Connect to your data sources
- 🔄 Transform data - Process and transform your data
- 📤 Write data - Output your transformed data