DuckDB Connection
Connection to a DuckDB database.
Examples
- duckdb_connection_simple.yaml
- duckdb_connection_multiuser.yaml
- duckdb_high_performance.yaml
# duckdb_connection_simple.yaml
connection:
  duckdb:
    path: /path/to/your/duckdb/file

# duckdb_connection_multiuser.yaml
connection:
  name: UniqueDuckDBConnection
  description: DuckDB connection optimized for multi-user environments with up to 10 concurrent queries.
  duckdb:
    max_concurrent_queries: 10
    path: /path/to/your/duckdb/database/file.duckdb

# duckdb_high_performance.yaml
connection:
  name: HighPerformanceDuckDBConnection
  description: DuckDB connection configured for high performance with specific limits.
  duckdb:
    path: /path/to/your/duckdb/database/file.duckdb
    memory_limit: 2048
    max_query_length: 10000
    max_concurrent_queries: 5
DuckDBConnection
DuckDBConnection is defined beneath the following ancestor node in the YAML structure: Connection.
Below are the properties for the DuckDBConnection. Each property links to the specific details section further down in this page.
Property | Default | Type | Required | Description |
---|---|---|---|---|
description | | string | No | Brief description of what the model does.
metadata | | ResourceMetadata | No | Meta information of a resource. In most cases, it doesn't affect the system behavior but may be helpful to analyze Project resources.
name | | string | Yes | The name of the model.
duckdb | | DuckDBConnectionOptions | Yes |
Property Details
Connection
Data source/sink Connection.
Property | Default | Type | Required | Description |
---|---|---|---|---|
connection | | One of: S3Connection, GcsConnection, AbfsConnection, LocalFileConnection, SnowflakeConnection, BigQueryConnection, MSSQLConnection, MySQLConnection, OracleConnection, PostgresConnection, HttpConnection, DuckDBConnection, SFTPConnection, DatabricksConnection | Yes | Data system Connection.
DuckDBConnectionOptions
DuckDB Connection options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
max_query_length | | integer | No | Maximum combined query length permitted during query batching.
max_concurrent_queries | | integer | No | Maximum number of concurrent queries permitted.
max_combined_sql_statements | | integer | No | Maximum number of combined SQL statements permitted during query batching.
path | | string | No | Path to the DuckDB database file. Use ':memory:' for in-memory databases. This setting is ignored when DuckLake configuration is present.
memory_limit | | integer | No | Memory limit to use for the DuckDB Connection.
ducklake | | DuckLakeOptions | No | Configuration for using DuckLake with this DuckDB Connection.
schema | | string | No | Schema to use for the DuckDB Connection. If left empty, the Flow name will be used as the schema name at runtime.
init_sql | | string | No | SQL to run when the DuckLake Connection is initialized.
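
As a rough sketch of how the options above might combine, the following hypothetical Connection uses an in-memory database with an explicit schema and start-up SQL. The Connection name, schema name, SQL statement, and numeric value are illustrative assumptions rather than recommended settings; the nesting follows the examples at the top of this page.

```yaml
connection:
  name: InMemoryDuckDBConnection      # hypothetical name
  description: In-memory DuckDB connection for quick experiments.
  duckdb:
    path: ":memory:"                  # in-memory database, per the path description
    schema: scratch                   # hypothetical; if omitted, the Flow name is used at runtime
    init_sql: "SET threads = 4;"      # hypothetical; SQL to run at initialization
    max_combined_sql_statements: 20   # illustrative value only
```

The path value is quoted because it begins with a colon, which avoids any ambiguity in YAML parsers.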
DuckLakeOptions
DuckLake configuration options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
metadata_connection_name | | string | No | Name of the Ascend Connection for DuckLake metadata storage (Postgres Connections only).
data_connection_name | | string | No | Name of the Ascend Connection for DuckLake data storage (Local File, GCS, S3, or ABFS Connections supported).
metadata_connection | | Any of: InlinePostgresConnection, ASCEND_MANAGED | No | Inline metadata Connection configuration (Postgres Connections only).
data_connection | | Any of: InlineLocalFileConnection, InlineS3Connection, InlineGcsConnection, InlineAbfsConnection, ASCEND_MANAGED | No | Inline data Connection configuration (Local File, GCS, S3, or ABFS Connections supported).
catalog | | string | No | Name of the DuckLake catalog to use. If not provided, the catalog name will be inferred from the profile name.
metadata_schema | default | string | No | Schema name within the Postgres database to use for DuckLake metadata storage. If left as the default value of 'default', the schema name will be replaced with the catalog name at runtime. |
data_path | ascendlake/data | string | No | Path within the data Connection root where DuckLake data files will be stored. The catalog name will always be appended to the path provided. |
local_mode | False | boolean | No | If set to True, sets up the DuckLake Connection with local storage, bypassing the metadata_connection(_name), data_connection(_name), and max_concurrent_queries settings. This is useful for rapid testing and development. |
ducklake_max_retry_count | 100 | integer | No | The value to set for the 'ducklake_max_retry_count' DuckLake configuration setting. Defaults to 100. |
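
A minimal sketch of a DuckLake-backed Connection that refers to existing Connections by name is shown below. It assumes that the ducklake block nests under duckdb, as the DuckDBConnectionOptions table suggests; the Connection names and the catalog name are hypothetical.

```yaml
connection:
  name: DuckLakeConnection                        # hypothetical name
  duckdb:
    # No path here: it is ignored when DuckLake configuration is present.
    ducklake:
      metadata_connection_name: lake_metadata_pg  # hypothetical Postgres Connection
      data_connection_name: lake_data_s3          # hypothetical S3 Connection
      catalog: analytics                          # hypothetical; otherwise inferred from the profile name
      data_path: ascendlake/data                  # the documented default, written out explicitly
```

For rapid local testing, the local_mode description above indicates that setting local_mode to True bypasses both Connection references, so they could be dropped in that case.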
InlineAbfsConnection
Property | Default | Type | Required | Description |
---|---|---|---|---|
abfs | | AbfsConnectionOptions | Yes | ABFS Connection options.
InlineGcsConnection
Property | Default | Type | Required | Description |
---|---|---|---|---|
gcs | | GcsConnectionOptions | Yes | GCS Connection options.
GcsConnectionOptions
Google Cloud Storage Connection options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
root | | string | Yes | GCS URL for the root prefix.
key | | string | No | GCP service account credentials to use for the GCS Connection.
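
As an illustration of the inline form, a GCS data Connection for DuckLake might look like the fragment below. This is a sketch under assumptions: the bucket URL is hypothetical, and the fragment is assumed to sit under the duckdb key of a Connection.

```yaml
ducklake:
  data_connection:
    gcs:
      root: gs://example-bucket/lake/   # hypothetical bucket and prefix
      # key is optional; supply service account credentials here if required
```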
InlineLocalFileConnection
Property | Default | Type | Required | Description |
---|---|---|---|---|
local_file | | LocalFileConnectionOptions | Yes |
InlinePostgresConnection
Property | Default | Type | Required | Description |
---|---|---|---|---|
postgres | | PostgresConnectionOptions | Yes |
InlineS3Connection
Property | Default | Type | Required | Description |
---|---|---|---|---|
s3 | | S3ConnectionOptions | Yes | S3 Connection options.
LocalFileConnectionOptions
Local File Connection options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
root | | string | Yes | Root directory for the Local File Connection.
PostgresConnectionOptions
PostgreSQL Connection options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
user | | string | Yes | User to use for the DB Connection.
password | | string | No | Password to use for the DB Connection.
aws_iam_auth | | AwsRoleBasedAuthentication | No | AWS IAM authentication to use for the DB Connection.
host | | string | Yes | PostgreSQL host.
database | | string | Yes | PostgreSQL database.
schema | public | string | No | PostgreSQL schema.
port | | integer | No | PostgreSQL port.
AwsRoleBasedAuthentication
Property | Default | Type | Required | Description |
---|---|---|---|---|
aws_access_key_id | | string | No | AWS access key ID to use for the DB Connection. If not provided, the default credential chain will be used.
aws_secret_access_key | | string | No | AWS secret access key to use for the DB Connection. If not provided, the default credential chain will be used.
region | | string | No | AWS region to use for the DB Connection. If not provided, the default credential chain will be used.
role_arn | | string | Yes | AWS Role ARN to use for the DB Connection. Required for AWS IAM authentication.
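
To show how PostgresConnectionOptions and AwsRoleBasedAuthentication fit together, here is a hedged sketch of an inline DuckLake metadata Connection that authenticates through an IAM role instead of a password. The host, database, user, and role ARN are all hypothetical, and the fragment is assumed to sit under the duckdb key of a Connection.

```yaml
ducklake:
  metadata_connection:
    postgres:
      host: pg.example.internal          # hypothetical host
      port: 5432
      database: ducklake_meta            # hypothetical database
      user: ducklake_user                # hypothetical user
      aws_iam_auth:
        # role_arn is the only required field; the other fields fall back
        # to the default credential chain when omitted.
        role_arn: "arn:aws:iam::123456789012:role/example-ducklake-role"  # hypothetical
        region: us-east-1
```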
S3ConnectionOptions
Amazon S3 Connection options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
region | | string | No | AWS region to connect to.
root | | string | Yes | S3 URL for the root prefix.
aws_access_key_id | | string | No | Access key ID for the S3 Connection.
aws_secret_access_key | | string | No | Secret access key for the S3 Connection.
enable_default_credential_chain | False | boolean | No | If True, uses the default credential chain for S3 authentication if no explicit credentials are provided.
role_arn | | string | No | Role ARN to assume when reading from S3.
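
A minimal sketch of an inline S3 data Connection that relies on the default credential chain rather than explicit keys follows. The bucket and region are hypothetical, and the fragment is assumed to sit under the duckdb key of a Connection.

```yaml
ducklake:
  data_connection:
    s3:
      root: s3://example-bucket/lake/           # hypothetical bucket and prefix
      region: us-west-2                         # hypothetical region
      enable_default_credential_chain: true     # no explicit keys are given, so use the default chain
```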
AbfsConnectionOptions
Azure Blob File System Connection options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
account | | string | No | Azure Blob File System account name to connect to.
root | | string | Yes | abfs[s] URL for the root prefix.
shared_key | | string | No | Azure Blob File System shared key to use for the ABFS Connection.
service_principal | | AbfsServicePrincipal | No | Azure Blob File System service principal in JSON to use for the ABFS Connection. The JSON should include a key named 'client_id' for the client ID, a key named 'client_secret' for the client secret, and a key named 'tenant_id' for the tenant ID.
enable_default_credential | False | boolean | No | If True, enables use of the default credential for the ABFS Connection if no other credentials are provided.
AbfsServicePrincipal
Property | Default | Type | Required | Description |
---|---|---|---|---|
client_id | | string | Yes | Client ID for the service principal.
client_secret | | string | Yes | Client secret for the service principal.
tenant_id | | string | Yes | Tenant ID for the service principal.
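
The fragment below sketches an inline ABFS data Connection that uses a service principal. It assumes the service principal can be written as a nested mapping with the three required keys; the account, container, and placeholder credentials are hypothetical, and real secrets should come from a secret store rather than being written inline.

```yaml
ducklake:
  data_connection:
    abfs:
      root: abfss://container@exampleaccount.dfs.core.windows.net/lake/  # hypothetical
      account: exampleaccount            # hypothetical account name
      service_principal:
        client_id: "<client-id>"         # placeholder
        client_secret: "<client-secret>" # placeholder; do not commit real secrets
        tenant_id: "<tenant-id>"         # placeholder
```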
ResourceMetadata
Meta information of a resource. In most cases, it doesn't affect the system behavior but may be helpful to analyze Project resources.
Property | Default | Type | Required | Description |
---|---|---|---|---|
source | | ResourceLocation | No | The origin or source information for the resource.
source_event_uuid | | string | No | Event UUID associated with creation of this resource.
ResourceLocation
The origin or source information for the resource.
Property | Default | Type | Required | Description |
---|---|---|---|---|
path | | string | Yes | Path within repository files where the resource is defined.
first_line_number | | integer | No | First line number within path file where the resource is defined.