Concurrency configuration
Proper concurrency configuration is essential for maximizing performance while maintaining reliability, especially for the DuckDB and DuckLake Data Plane. Ascend provides concurrency controls at two levels:
- Connection level: Controls database query execution and retry behavior
- Flow level: Controls how many Components run in parallel within a Flow
This guide covers the key parameters and provides configuration recommendations for optimal performance.
Ascend does not enforce upper limits on the concurrency parameters described in this guide. Setting values that exceed your data warehouse's capabilities will result in errors from the warehouse itself.
Connection level​
max_concurrent_queries
Controls the maximum number of database queries that can execute simultaneously against a single Connection.
This parameter affects database connection pool utilization, query throughput, and resource contention on the database, and it should be coordinated with the Flow-level component_concurrency setting.
Database-specific behavior:
- DuckDB: Forced to 1 in local mode to avoid concurrency issues
- Other Data Planes: No hard maximum enforced by Ascend
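As a sketch, a Connection-level override can be set in a Profile. The nesting below mirrors the recommended configuration later in this guide; the Data Plane key (here `duckdb`) depends on your setup:

```yaml
profile:
  defaults:
    parameters:
      duckdb:
        # Cap simultaneous queries against this Connection.
        # In DuckDB local mode this is forced to 1 regardless of the value set here.
        max_concurrent_queries: 5
```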
max_query_length
Controls the maximum length of SQL queries in characters that can be executed against the database. User-configured values are subject to the Ascend-enforced limitations defined below as well as the respective Data Plane limitations:
- BigQuery: Maximum of 1,000,000 characters. See the BigQuery documentation for additional context.
- Databricks: Maximum of 16,777,216 characters (16 * 1024 * 1024). See the Databricks documentation for additional context.
- Snowflake: No hard limit enforced on the Ascend side. See the Snowflake documentation for additional context.
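A hedged illustration of lowering this limit for a BigQuery Connection; the `bigquery` parameter key is an assumption for illustration, by analogy with the `duckdb` block shown later in this guide:

```yaml
profile:
  defaults:
    parameters:
      bigquery:  # Data Plane key is an assumption for illustration
        # Must stay at or below BigQuery's 1,000,000-character limit.
        max_query_length: 500000
```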
max_combined_sql_statements
Controls the maximum number of individual SQL statements that can be combined using UNION ALL in a single batch when Ascend processes partitioned Components.
Data Plane defaults:
- BigQuery: Default of 100
- Databricks: Default of 250
- Snowflake: Default of 250
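For example, a Profile might reduce the batch size below a Data Plane's default when the individual combined statements are very large. The `snowflake` parameter key is an assumption for illustration:

```yaml
profile:
  defaults:
    parameters:
      snowflake:  # Data Plane key is an assumption for illustration
        # Combine at most 100 statements per UNION ALL batch
        # (Snowflake's default is 250).
        max_combined_sql_statements: 100
```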
ducklake_max_retry_count
Controls retry behavior for internal DuckLake operations, which is critical for concurrent workloads. Defaults to 100.
Retries are essential for concurrency because concurrent write operations can cause primary key conflicts in DuckLake's metadata catalog. These conflicts are expected behavior, not errors, and DuckLake automatically retries operations with updated primary key values.
If DuckLake exhausts all retries, the primary key violation error bubbles up to Ascend, resulting in a Flow run error.
Flow level​
component_concurrency
This parameter is typically configured in a specific Profile and controls the maximum number of Components that can execute simultaneously within a single Flow.
This parameter affects query execution and concurrent Flow threads. Higher values increase parallelism but may overwhelm downstream systems, so balance them with Connection-level settings.
Recommended configuration​
While optimal values depend on your specific data volume and resources, the following configuration in your Profile (e.g., workspace-otto.yaml or workspace-<your-name>.yaml) provides a reliable starting point that balances performance and stability:
```yaml
profile:
  defaults:
    ...
    parameters:
      duckdb:
        # Flow configuration
        component_concurrency: 16
        # DuckDB/DuckLake Connection configuration
        max_concurrent_queries: 5
        ducklake_max_retry_count: 100
```
Tuning​
If your use case requires frequent concurrent Flow execution, consider configuring different Flows to write to different DuckLake catalogs, or increase ducklake_max_retry_count up to 10,000.
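Raising the retry count is a one-line change to the recommended configuration above, for example:

```yaml
profile:
  defaults:
    parameters:
      duckdb:
        # Raise retries for workloads with frequent concurrent Flow runs;
        # this guide suggests values up to 10,000.
        ducklake_max_retry_count: 10000
```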
Summary​
| Parameter | Level | Description | Default | Ascend-enforced maximum |
|---|---|---|---|---|
| max_concurrent_queries | Connection | Maximum number of simultaneous database queries per Connection | Most Data Planes: 100; most DuckLake configurations: 5; DuckDB local mode: 1 | DuckDB local mode: 1; other DuckDB configurations and Data Planes: no hard limit enforced by Ascend |
| max_query_length | Connection | Maximum length of SQL queries in characters | BigQuery: 1,000,000 characters; Databricks: 16,777,216 characters; Snowflake: no hard limit enforced by Ascend | |
| max_combined_sql_statements | Connection | Maximum number of individual SQL statements that can be combined using UNION ALL in a single batch | BigQuery: 100; Databricks: 250; Snowflake: 250 | No hard limit enforced by Ascend |
| ducklake_max_retry_count | Connection | Retry count for internal DuckLake operations, critical for concurrent workloads | 100 | No hard limit enforced by Ascend |
| component_concurrency | Flow | Maximum number of Components that can execute simultaneously within a single Flow | 16 | No hard limit enforced by Ascend |
Next steps​
🦆 Learn how to configure your DuckDB with DuckLake Connection with recommended concurrency settings