DuckDB Connection

Connection to a DuckDB database.

Examples

connection:
  duckdb:
    path: /path/to/your/duckdb/file

DuckDBConnection

DuckDBConnection is defined beneath the connection node in the YAML structure.

Below are the properties for the DuckDBConnection. Each property links to the specific details section further down in this page.

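| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Meta information of a resource. In most cases, it doesn't affect the system behavior but may be helpful to analyze project resources. |
| name | | string | Yes | The name of the model. |
| duckdb | | DuckDBConnectionOptions | Yes | |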

Property Details

Connection

Data source/sink Connection.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| connection | | One of: S3Connection, GcsConnection, AbfsConnection, LocalFileConnection, SnowflakeConnection, BigQueryConnection, MSSQLConnection, MySQLConnection, OracleConnection, PostgresConnection, HttpConnection, DuckDBConnection, SFTPConnection, DatabricksConnection | Yes | Data system Connection. |

DuckDBConnectionOptions

DuckDB Connection options.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| max_query_length | | integer | No | Maximum query length to allow. |
| max_concurrent_queries | | integer | No | Maximum number of concurrent queries to allow. |
| max_combined_sql_statements | | integer | No | Maximum number of combined SQL statements to allow. |
| path | | string | No | The path to the DuckDB database file. Use ':memory:' for in-memory databases. This setting is ignored when DuckLake configuration is present. |
| memory_limit | | integer | No | The memory limit to use for the DuckDB Connection. |
| ducklake | | DuckLakeOptions | No | Configuration for using DuckLake with this DuckDB Connection. |
| schema | | string | No | Schema to use for the DuckDB Connection. If left empty, the Flow name will be used as the schema name at runtime. |
| init_sql | | string | No | SQL to run when the DuckLake Connection is initialized. |
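As a rough illustration of how these options might be combined, the sketch below extends the example at the top of this page. The specific values (the concurrency limit, the schema name, and the SQL statement) are illustrative assumptions, not documented defaults or recommendations.

```yaml
connection:
  duckdb:
    # Database file on disk; ':memory:' selects an in-memory database.
    path: /path/to/your/duckdb/file
    # Illustrative limit, not a documented default.
    max_concurrent_queries: 4
    # Hypothetical schema name; if omitted, the Flow name is used at runtime.
    schema: analytics
    # Hypothetical initialization statement.
    init_sql: "SET threads = 4;"
```

memory_limit is left out of the sketch because this page does not state the unit the integer is measured in.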

DuckLakeOptions

DuckLake configuration options.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| metadata_connection_name | | string | No | Name of the Ascend Connection to use for DuckLake metadata storage (Postgres Connections only). |
| data_connection_name | | string | No | Name of the Ascend Connection to use for DuckLake data storage (Local Files, GCS, S3, or ABFS Connections supported). |
| metadata_connection | | Any of: InlinePostgresConnection, ASCEND_MANAGED | No | Inline metadata Connection configuration (Postgres Connections only). |
| data_connection | | Any of: one of (InlineLocalFileConnection, InlineS3Connection, InlineGcsConnection, InlineAbfsConnection), ASCEND_MANAGED | No | Inline data Connection configuration (Local Files, GCS, S3, or ABFS Connections supported). |
| catalog | | string | No | The name of the DuckLake catalog to use. If not provided, the catalog name will be inferred from the profile name. |
| metadata_schema | default | string | No | The schema name within the Postgres database to use for DuckLake metadata storage. If left as the default value of 'default', the schema name will be replaced with the catalog name at runtime. |
| data_path | ascendlake/data | string | No | Path within the data Connection root where DuckLake data files will be stored. The catalog name will always be appended to the path provided. |
| local_mode | False | boolean | No | If set to True, sets up the DuckLake Connection with local storage, bypassing the metadata_connection(_name), data_connection(_name), and max_concurrent_queries settings. This is useful for rapid testing and development. |
| ducklake_max_retry_count | 100 | integer | No | The value to set for the 'ducklake_max_retry_count' DuckLake configuration setting. Defaults to 100. |
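A minimal sketch of a DuckLake setup that refers to existing Connections by name is shown below. The Connection names and the catalog name are hypothetical, and placing ducklake under connection.duckdb is read from the tables on this page rather than taken from a verified configuration.

```yaml
connection:
  duckdb:
    ducklake:
      # Hypothetical names of Ascend Connections defined elsewhere.
      metadata_connection_name: my_postgres_metadata  # must be a Postgres Connection
      data_connection_name: my_s3_data                # Local Files, GCS, S3, or ABFS
      # Hypothetical catalog name; inferred from the profile name if omitted.
      catalog: my_catalog
      # Documented default, written out explicitly; the catalog name is appended.
      data_path: ascendlake/data
```

No path is set here because path is ignored when DuckLake configuration is present. For quick local testing, setting local_mode to true bypasses the metadata and data Connection settings altogether.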

InlineAbfsConnection

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| abfs | | AbfsConnectionOptions | Yes | ABFS Connection options. |

InlineGcsConnection

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| gcs | | GcsConnectionOptions | Yes | GCS Connection options. |

GcsConnectionOptions

Google Cloud Storage Connection options.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| root | | string | Yes | GCS URL for the root prefix. |
| key | | string | No | GCP service account credentials to use for the GCS Connection. |
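For illustration, an inline GCS data Connection for DuckLake might look like the sketch below. The bucket name and prefix are hypothetical, the key value is a placeholder, and the nesting under ducklake.data_connection.gcs is inferred from the InlineGcsConnection and DuckLakeOptions tables on this page.

```yaml
connection:
  duckdb:
    ducklake:
      data_connection:
        gcs:
          # Hypothetical bucket and prefix.
          root: gs://my-bucket/ducklake
          # Placeholder; supply GCP service account credentials here.
          key: "REPLACE_WITH_SERVICE_ACCOUNT_CREDENTIALS"
```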

InlineLocalFileConnection

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| local_file | | LocalFileConnectionOptions | Yes | Local File Connection options. |

InlinePostgresConnection

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| postgres | | PostgresConnectionOptions | Yes | |

InlineS3Connection

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| s3 | | S3ConnectionOptions | Yes | S3 Connection options. |

LocalFileConnectionOptions

Local file Connection options.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| root | | string | Yes | Root directory for the Local File Connection. |
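A small sketch of an inline Local File data Connection for DuckLake follows. The directory is a hypothetical path, and the nesting under ducklake.data_connection.local_file is inferred from the InlineLocalFileConnection and DuckLakeOptions tables on this page.

```yaml
connection:
  duckdb:
    ducklake:
      data_connection:
        local_file:
          # Hypothetical directory; DuckLake data files are stored beneath it.
          root: /path/to/ducklake/data
```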

PostgresConnectionOptions

PostgreSQL Connection options.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| host | | string | Yes | PostgreSQL host to connect to. |
| user | | string | Yes | PostgreSQL user to connect as. |
| password | | string | Yes | PostgreSQL password to use for the Connection. |
| database | | string | Yes | PostgreSQL database to connect to. |
| schema | public | string | No | PostgreSQL schema to use. |
| port | | integer | No | PostgreSQL port to connect to. |
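As a sketch of how these options could appear inline, the fragment below configures DuckLake metadata storage through an InlinePostgresConnection. The host, user, and database names are hypothetical, the password is a placeholder, and the nesting under ducklake.metadata_connection.postgres is inferred from the tables on this page.

```yaml
connection:
  duckdb:
    ducklake:
      metadata_connection:
        postgres:
          host: db.example.com             # hypothetical host
          # 5432 is the conventional PostgreSQL port; this page lists no default.
          port: 5432
          user: ducklake_user              # hypothetical user
          password: "REPLACE_WITH_PASSWORD"  # placeholder
          database: ducklake_metadata      # hypothetical database
          schema: public                   # documented default, written out
```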

S3ConnectionOptions

Amazon S3 Connection options.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| region | | string | No | The AWS region to connect to. |
| root | | string | Yes | The S3 URL for the root prefix. |
| aws_access_key_id | | string | No | Access key ID to use for the S3 Connection. |
| aws_secret_access_key | | string | No | Secret access key to use for the S3 Connection. |
| enable_default_credential_chain | False | boolean | No | If True, enables use of the default credential chain for the S3 Connection if no other credentials are provided. |
| role_arn | | string | No | Role ARN to assume when reading from S3. |
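The sketch below shows one way an inline S3 data Connection for DuckLake might be written, relying on the default credential chain instead of explicit keys. The bucket, prefix, and region are hypothetical, and the nesting under ducklake.data_connection.s3 is inferred from the InlineS3Connection and DuckLakeOptions tables on this page.

```yaml
connection:
  duckdb:
    ducklake:
      data_connection:
        s3:
          # Hypothetical bucket and prefix.
          root: s3://my-bucket/ducklake
          # Hypothetical region.
          region: us-east-1
          # Use the default credential chain, since no keys are given here.
          enable_default_credential_chain: true
```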

AbfsConnectionOptions

Azure Blob File System Connection options.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| account | | string | No | Azure Blob File System account name to connect to. |
| root | | string | Yes | abfs[s] URL for the root prefix. |
| shared_key | | string | No | Azure Blob File System shared key to use for the ABFS Connection. |
| service_principal | | AbfsServicePrincipal | No | Azure Blob File System service principal in JSON to use for the ABFS Connection. The JSON should include a key named 'client_id' for the client ID, a key named 'client_secret' for the client secret, and a key named 'tenant_id' for the tenant ID. |
| enable_default_credential | False | boolean | No | If True, enables use of the default credential for the ABFS Connection if no other credentials are provided. |

AbfsServicePrincipal

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| client_id | | string | Yes | Client ID for the service principal. |
| client_secret | | string | Yes | Client secret for the service principal. |
| tenant_id | | string | Yes | Tenant ID for the service principal. |
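To show where the three service principal keys fit, here is a hedged sketch of an inline ABFS data Connection for DuckLake. The account and container names are hypothetical and the credential values are placeholders. Writing service_principal as a nested mapping is an assumption; the option is described as JSON, so a JSON string may be what is expected.

```yaml
connection:
  duckdb:
    ducklake:
      data_connection:
        abfs:
          # Hypothetical storage account and container.
          account: mystorageaccount
          root: abfss://my-container@mystorageaccount.dfs.core.windows.net/ducklake
          # Assumed mapping form; verify whether a JSON string is required.
          service_principal:
            client_id: "REPLACE_WITH_CLIENT_ID"
            client_secret: "REPLACE_WITH_CLIENT_SECRET"
            tenant_id: "REPLACE_WITH_TENANT_ID"
```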

ResourceMetadata

Meta information of a resource. In most cases, it doesn't affect the system behavior but may be helpful to analyze Project resources.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| source | | ResourceLocation | No | The origin or source information for the resource. |
| source_event_uuid | | string | No | UUID of the event that is associated with creation of this resource. |

ResourceLocation

The origin or source information for the resource.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| path | | string | Yes | Path within repository files where the resource is defined. |
| first_line_number | | integer | No | First line number within path file where the resource is defined. |