Concurrency configuration
Proper concurrency configuration is essential for maximizing performance while maintaining reliability, especially for the DuckDB and DuckLake Data Plane. Ascend provides concurrency controls at two levels:
- Connection level: Controls database query execution and retry behavior
- Flow level: Controls how many Components run in parallel within a Flow
This guide covers the key parameters and provides configuration recommendations for optimal performance.
Ascend does not enforce upper limits on the concurrency parameters described in this guide. Setting values that exceed your data warehouse's capabilities will result in errors from the warehouse itself.
Connection level​
max_concurrent_queries
Controls the maximum number of database queries that can execute simultaneously against a single Connection.
This parameter affects database connection pool utilization, query throughput, and resource contention on the database, and it should be coordinated with the Flow-level component_concurrency setting.
Database-specific behavior:
- DuckDB: Forced to 1 in local mode to avoid concurrency issues
- Other Data Planes: No hard maximum enforced by Ascend
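As a sketch, a Connection-level override can be set in a Profile. The nesting below mirrors the recommended configuration later in this guide; the Data Plane key (here `duckdb`) depends on your setup:

```yaml
profile:
  defaults:
    parameters:
      duckdb:
        # Cap simultaneous queries against this Connection.
        # In DuckDB local mode this is forced to 1 regardless of the value set here.
        max_concurrent_queries: 5
```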
max_query_length
Controls the maximum length of SQL queries in characters that can be executed against the database. User-configured values are subject to the Ascend-enforced limitations defined below as well as the respective Data Plane limitations:
- BigQuery: Maximum of 1,000,000 characters. See the BigQuery documentation for additional context.
- Databricks: Maximum of 16,777,216 characters (16 * 1024 * 1024). See the Databricks documentation for additional context.
- Snowflake: No hard limit enforced on the Ascend side. See the Snowflake documentation for additional context.
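A hedged illustration of lowering this limit for a BigQuery Connection; the `bigquery` parameter key is an assumption for illustration, by analogy with the `duckdb` block shown later in this guide:

```yaml
profile:
  defaults:
    parameters:
      bigquery:  # Data Plane key is an assumption for illustration
        # Must stay at or below BigQuery's 1,000,000-character limit.
        max_query_length: 500000
```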
max_combined_sql_statements
Controls the maximum number of individual SQL statements that can be combined using UNION ALL in a single batch when Ascend processes partitioned Components.
Data Plane defaults:
- BigQuery: Default of 100
- Databricks: Default of 250
- Snowflake: Default of 250
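For example, a Profile might reduce the batch size below a Data Plane's default when the individual combined statements are very large. The `snowflake` parameter key is an assumption for illustration:

```yaml
profile:
  defaults:
    parameters:
      snowflake:  # Data Plane key is an assumption for illustration
        # Combine at most 100 statements per UNION ALL batch
        # (Snowflake's default is 250).
        max_combined_sql_statements: 100
```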
ducklake_max_retry_count
Controls retry behavior for internal DuckLake operations, which is critical for concurrent workloads. Defaults to 100.
Retries are essential for concurrency because concurrent write operations can cause primary key conflicts in DuckLake's metadata catalog. These conflicts are expected behavior, not errors, and DuckLake automatically retries operations with updated primary key values.
If DuckLake exhausts all retries, the primary key violation error bubbles up to Ascend, resulting in a Flow run error.
Flow level​
component_concurrency
This parameter is typically configured in a specific Profile and controls the maximum number of Components that can execute simultaneously within a single Flow.
This parameter affects query execution and concurrent Flow threads. Higher values increase parallelism but may overwhelm downstream systems, so balance them with Connection-level settings.
Recommended configuration​
While optimal values depend on your specific data volume and resources, the following configuration in your Profile (e.g., workspace-otto.yaml or workspace-<your-name>.yaml) provides a reliable starting point that balances performance and stability:
```yaml
profile:
  defaults:
    ...
    parameters:
      duckdb:
        # Flow configuration
        component_concurrency: 16
        # DuckDB/DuckLake Connection configuration
        max_concurrent_queries: 5
        ducklake_max_retry_count: 100
```
Tuning​
If your use case requires frequent concurrent Flow execution, consider configuring different Flows to write to different DuckLake catalogs, or increase ducklake_max_retry_count up to 10,000.
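Raising the retry count is a one-line change to the recommended configuration above, for example:

```yaml
profile:
  defaults:
    parameters:
      duckdb:
        # Raise retries for workloads with frequent concurrent Flow runs;
        # this guide suggests values up to 10,000.
        ducklake_max_retry_count: 10000
```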
Summary​
| Parameter | Level | Description | Default | Ascend-enforced maximum |
|---|---|---|---|---|
| max_concurrent_queries | Connection | Maximum number of simultaneous database queries per Connection | Most Data Planes: 100; most DuckLake configurations: 5; DuckDB local mode: 1 | DuckDB local mode: 1; other DuckDB configurations and Data Planes: no hard limit enforced by Ascend |
| max_query_length | Connection | Maximum length of SQL queries in characters | BigQuery: 1,000,000 characters; Databricks: 16,777,216 characters; Snowflake: no hard limit enforced by Ascend | |
| max_combined_sql_statements | Connection | Maximum number of individual SQL statements that can be combined using UNION ALL in a single batch | BigQuery: 100; Databricks: 250; Snowflake: 250 | No hard limit enforced by Ascend |
| ducklake_max_retry_count | Connection | Retry count for internal DuckLake operations, critical for concurrent workloads | 100 | No hard limit enforced by Ascend |
| component_concurrency | Flow | Maximum number of Components that can execute simultaneously within a single Flow | 16 | No hard limit enforced by Ascend |
Next steps​
🦆 Learn how to configure your DuckDB with DuckLake Connection with recommended concurrency settings