DuckDB Connection
Connection to a DuckDB database.
Examples
duckdb_connection_simple.yaml

connection:
  duckdb:
    path: /path/to/your/duckdb/file

duckdb_connection_multiuser.yaml

connection:
  name: UniqueDuckDBConnection
  description: DuckDB connection optimized for multi-user environments with up to 10 concurrent queries.
  duckdb:
    max_concurrent_queries: 10
    path: /path/to/your/duckdb/database/file.duckdb

duckdb_high_performance.yaml

connection:
  name: HighPerformanceDuckDBConnection
  description: DuckDB connection configured for high performance with specific limits.
  duckdb:
    path: /path/to/your/duckdb/database/file.duckdb
    memory_limit: 2048
    max_query_length: 10000
    max_concurrent_queries: 5
DuckDBConnection
DuckDBConnection is defined beneath the following ancestor nodes in the YAML structure: Connection.
Below are the properties for the DuckDBConnection. Each property links to the specific details section further down in this page.
Property | Default | Type | Required | Description |
---|---|---|---|---|
description | | string | No | A brief description of what the Connection does.
metadata | | | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources.
name | | string | Yes | The name of the Connection.
duckdb | | | Yes | |
Property Details
Connection
Data source/sink Connection.
Property | Default | Type | Required | Description |
---|---|---|---|---|
connection | | One of: S3Connection GcsConnection AbfsConnection LocalFileConnection SnowflakeConnection BigQueryConnection MSSQLConnection MySQLConnection OracleConnection PostgresConnection HttpConnection DuckDBConnection SFTPConnection DatabricksConnection | Yes | Data system Connection.
DuckDBConnectionOptions
DuckDB Connection options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
max_query_length | | integer | No | Maximum query length to allow.
max_concurrent_queries | | integer | No | Maximum number of concurrent queries to allow.
max_combined_sql_statements | | integer | No | Maximum number of combined SQL statements to allow.
path | | string | No | The path to the DuckDB database file. Use ':memory:' for in-memory databases. This setting is ignored when DuckLake configuration is present.
memory_limit | | integer | No | The memory limit to use for the DuckDB Connection.
ducklake | | | No | Configuration for using DuckLake with this DuckDB Connection.
schema | | string | No | Schema to use for the DuckDB Connection. If left empty, the Flow name will be used as the schema name at runtime.
init_sql | | string | No | SQL to run when the DuckLake Connection is initialized.
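As a rough illustration of how several of these options can be combined, the sketch below configures an in-memory database with an explicit schema and a start-up statement. The Connection name, the schema name, the statement limit, and the `SET threads` statement are illustrative assumptions, not values taken from this reference.

```yaml
connection:
  name: InMemoryDuckDBConnection     # hypothetical name
  duckdb:
    path: ":memory:"                 # in-memory database; quoted so YAML reads it as a string
    schema: analytics                # hypothetical; if omitted, the Flow name is used at runtime
    max_combined_sql_statements: 20  # hypothetical limit
    init_sql: "SET threads = 4;"     # assumed example of SQL to run at initialization
```

Because `path` is ignored whenever a `ducklake` block is present, a configuration like this only makes sense without DuckLake settings.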
DuckLakeOptions
DuckLake configuration options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
metadata_connection_name | | string | No | Name of the Ascend Connection to use for DuckLake metadata storage (Postgres Connections only).
data_connection_name | | string | No | Name of the Ascend Connection to use for DuckLake data storage (Local Files, GCS, S3, or ABFS Connections supported).
metadata_connection | | Any of: InlinePostgresConnection ASCEND_MANAGED | No | Inline metadata Connection configuration (Postgres Connections only).
data_connection | | Any of: One of: InlineLocalFileConnection InlineS3Connection InlineGcsConnection InlineAbfsConnection ASCEND_MANAGED | No | Inline data Connection configuration (Local Files, GCS, S3, or ABFS Connections supported).
catalog | | string | No | The name of the DuckLake catalog to use. If not provided, the catalog name will be inferred from the profile name.
metadata_schema | default | string | No | The schema name within the Postgres database to use for DuckLake metadata storage. If left as the default value of 'default', the schema name will be replaced with the catalog name at runtime. |
data_path | ascendlake/data | string | No | Path within the data Connection root where DuckLake data files will be stored. The catalog name will always be appended to the path provided. |
local_mode | False | boolean | No | If set to True, sets up the DuckLake Connection with local storage, bypassing the metadata_connection(_name), data_connection(_name), and max_concurrent_queries settings. This is useful for rapid testing and development. |
ducklake_max_retry_count | 100 | integer | No | The value to set for the 'ducklake_max_retry_count' DuckLake configuration setting. Defaults to 100. |
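A minimal sketch of a DuckLake setup that refers to existing Connections by name might look like the following. The Connection names and the catalog name are hypothetical; the sketch assumes a Postgres Connection and an S3 Connection with those names are already defined elsewhere in the project.

```yaml
connection:
  name: DuckLakeConnection                                  # hypothetical name
  duckdb:
    ducklake:
      metadata_connection_name: ducklake_metadata_postgres  # hypothetical Postgres Connection
      data_connection_name: ducklake_data_s3                # hypothetical S3 Connection
      catalog: analytics_lake                               # hypothetical; inferred from the profile name if omitted
      data_path: ascendlake/data                            # the documented default, shown for clarity
```

With these values, data files would be expected under `ascendlake/data/analytics_lake` within the data Connection root, since the catalog name is always appended to `data_path`.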
InlineAbfsConnection
Property | Default | Type | Required | Description |
---|---|---|---|---|
abfs | | | Yes | ABFS Connection options.
InlineGcsConnection
Property | Default | Type | Required | Description |
---|---|---|---|---|
gcs | | | Yes | GCS Connection options.
GcsConnectionOptions
Google Cloud Storage Connection options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
root | | string | Yes | GCS URL for the root prefix.
key | | string | No | GCP service account credentials to use for the GCS Connection.
InlineLocalFileConnection
Property | Default | Type | Required | Description |
---|---|---|---|---|
local_file | | | Yes | Local File Connection options.
InlinePostgresConnection
Property | Default | Type | Required | Description |
---|---|---|---|---|
postgres | | | Yes | |
InlineS3Connection
Property | Default | Type | Required | Description |
---|---|---|---|---|
s3 | | | Yes | S3 Connection options.
LocalFileConnectionOptions
Local file Connection options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
root | | string | Yes | Root directory for the Local File Connection.
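For development setups, an inline Local File data Connection could be nested under `ducklake.data_connection` roughly as sketched here. The Connection name and the directory are placeholders, and other DuckLake settings (such as the metadata Connection) are omitted.

```yaml
connection:
  name: LocalDuckLakeConnection        # hypothetical name
  duckdb:
    ducklake:
      data_connection:
        local_file:
          root: /path/to/ducklake/root # placeholder directory for DuckLake data files
```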
PostgresConnectionOptions
PostgreSQL Connection options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
host | | string | Yes | PostgreSQL host to connect to.
user | | string | Yes | PostgreSQL user to connect as.
password | | string | Yes | PostgreSQL password to use for the Connection.
database | | string | Yes | PostgreSQL database to connect to.
schema | public | string | No | PostgreSQL schema to use.
port | | integer | No | PostgreSQL port to connect to.
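The following sketch shows how these options might appear when the DuckLake metadata store is configured inline via `ducklake.metadata_connection`. The host, user, database, and Connection name are placeholders, and the port is the conventional PostgreSQL default rather than a value stated in this reference.

```yaml
connection:
  name: DuckLakeInlineMetadata         # hypothetical name
  duckdb:
    ducklake:
      metadata_connection:
        postgres:
          host: postgres.example.com   # placeholder host
          port: 5432                   # conventional PostgreSQL port (assumption)
          user: ducklake_user          # placeholder user
          password: "<postgres-password>"  # placeholder; prefer a secret reference over a literal
          database: ducklake_metadata  # placeholder database
```

The Postgres-level `schema` option is left at its default here; the DuckLake-level `metadata_schema` setting is a separate option.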
S3ConnectionOptions
Amazon S3 Connection options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
region | | string | No | The AWS region to connect to.
root | | string | Yes | The S3 URL for the root prefix.
aws_access_key_id | | string | No | Access key ID to use for the S3 Connection.
aws_secret_access_key | | string | No | Secret access key to use for the S3 Connection.
enable_default_credential_chain | False | boolean | No | If True, enables use of the default credential chain for the S3 Connection if no other credentials are provided.
role_arn | | string | No | Role ARN to assume when reading from S3.
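As a sketch of an inline S3 data Connection for DuckLake, the fragment below relies on the default credential chain instead of explicit keys. The bucket, region, and Connection name are hypothetical, and other DuckLake settings are omitted.

```yaml
connection:
  name: DuckLakeS3Data                 # hypothetical name
  duckdb:
    ducklake:
      data_connection:
        s3:
          root: s3://example-bucket/ducklake     # hypothetical bucket and prefix
          region: us-east-1                      # hypothetical region
          enable_default_credential_chain: true  # used only if no other credentials are provided
```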
AbfsConnectionOptions
Azure Blob File System Connection options.
Property | Default | Type | Required | Description |
---|---|---|---|---|
account | | string | No | Azure Blob File System account name to connect to.
root | | string | Yes | abfs[s] URL for the root prefix.
shared_key | | string | No | Azure Blob File System shared key to use for the ABFS Connection.
service_principal | | | No | Azure Blob File System service principal in JSON to use for the ABFS Connection. The JSON should include a key named 'client_id' for the client ID, a key named 'client_secret' for the client secret, and a key named 'tenant_id' for the tenant ID.
enable_default_credential | False | boolean | No | If True, enables use of the default credential for the ABFS connection if no other credentials are provided. |
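An inline ABFS data Connection could be sketched along the same lines. The storage account, container, and Connection name below are hypothetical, and the example assumes the default Azure credential is available in the runtime environment.

```yaml
connection:
  name: DuckLakeAbfsData               # hypothetical name
  duckdb:
    ducklake:
      data_connection:
        abfs:
          account: examplestorage      # hypothetical storage account
          root: abfss://lake@examplestorage.dfs.core.windows.net/ducklake  # hypothetical container and path
          enable_default_credential: true  # used only if no other credentials are provided
```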
AbfsServicePrincipal
Property | Default | Type | Required | Description |
---|---|---|---|---|
client_id | | string | Yes | Client ID for the service principal.
client_secret | | string | Yes | Client secret for the service principal.
tenant_id | | string | Yes | Tenant ID for the service principal.
ResourceMetadata
Meta information of a resource. In most cases, it doesn't affect the system behavior but may be helpful to analyze Project resources.
Property | Default | Type | Required | Description |
---|---|---|---|---|
source | | | No | The origin or source information for the resource.
source_event_uuid | | string | No | UUID of the event that is associated with creation of this resource.
ResourceLocation
The origin or source information for the resource.
Property | Default | Type | Required | Description |
---|---|---|---|---|
path | | string | Yes | Path within repository files where the resource is defined.
first_line_number | | integer | No | First line number within path file where the resource is defined.