Full Write Strategy
Container for specifying the full write strategy used in Write Components.
FullWriteStrategy
FullWriteStrategy is defined beneath the following ancestor nodes in the YAML structure:
Below are the properties for the FullWriteStrategy. Each property links to the specific details section further down in this page.
Property | Default | Type | Required | Description |
---|---|---|---|---|
full | | | Yes | Options for handling the output table during a full write operation to a Write Component. |
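As a rough illustration of where this container sits, the fragment below places `full` under a Write Component's `strategy` key, using the `drop_and_recreate` mode that the tables on this page list as the default. It is a minimal sketch, not a complete Component definition.

```yaml
# Minimal sketch: a full write strategy nested under a Write Component's `strategy` key.
strategy:
  full:
    mode: drop_and_recreate  # drop the output table, then recreate it
```

Because the tables below list this value as the default for `strategy`, spelling it out is presumably optional and mainly serves to make the behavior explicit.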
Property Details
Component
A Component is a fundamental building block of a data Flow. Supported Component types include: Read, Transform, Task, Test, and more.
Property | Default | Type | Required | Description |
---|---|---|---|---|
component | | One of: CustomPythonReadComponent, ApplicationComponent, AliasedTableComponent, ExternalTableComponent, FivetranComponent | Yes | Component configuration options. |
WriteComponent
Property | Default | Type | Required | Description |
---|---|---|---|---|
skip | | boolean | No | Boolean flag indicating whether to skip processing for the Component. |
retry_strategy | | | No | Retry strategy configuration options for the Component if any exceptions are encountered. |
description | | string | No | Brief description of what the model does. |
metadata | | | No | Meta information of a resource. In most cases it does not affect system behavior, but it can be helpful when analyzing project resources. |
name | | string | Yes | The name of the model. |
flow_name | | string | No | Name of the Flow that the Component belongs to. |
write | | One of: BigQueryWriteComponent, SnowflakeWriteComponent, S3WriteComponent, SFTPWriteComponent, GcsWriteComponent, AbfsWriteComponent, MySQLWriteComponent, OracleWriteComponent, PostgresWriteComponent | Yes | |
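The nesting described by the Component and WriteComponent tables can be sketched as follows. The Component name, description, and Connection name are hypothetical, and the `input` and `snowflake` keys are left empty because their nested fields are not listed on this page; both are required and would need to be filled in from their own reference sections.

```yaml
# Sketch of a Write Component skeleton; all values are illustrative assumptions.
component:
  name: orders_export                  # hypothetical Component name
  description: Writes the orders table to the warehouse.
  skip: false
  write:
    connection: warehouse_connection   # hypothetical Connection name
    input:                             # required; shape not documented in this section
    strategy:
      full:
        mode: drop_and_recreate
    snowflake:                         # required; target options not documented in this section
```

Swapping `snowflake` for another target key (`bigquery`, `s3`, `postgres`, and so on) selects a different Write Component type from the "One of" list above.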
AbfsWriteComponent
Component for writing files to an ABFS container.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, PartitionedWriteStrategy | No | Options to use when writing data to file-based Components. When the snapshot strategy is used without a name, the Flow run id is used as the snapshot name by default. |
read_record_chunk_size | 100000 | integer | No | Number of rows to read from the source. If not set, defaults to 100,000 rows. |
target_file_size | 104857600 | integer | No | Target size in bytes of the file to write. If not set, defaults to 100 * (2**20) bytes (100 MiB). |
target_records_per_file | | integer | No | Maximum number of rows to write to each part file. If not set, only the target file size determines the number of rows written to each part file. This setting applies only when writing files in partitions: for the snapshot write strategy, it is used only if the path ends with a '/'; for the partitioned write strategy, it is always applied. |
abfs | | | Yes | |
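The file-sizing properties shared by the file-based Write Components (ABFS, GCS, S3, SFTP) might be set as in the fragment below. The first two values are the documented defaults; the `target_records_per_file` value is a hypothetical choice for illustration only.

```yaml
# Sketch of file-sizing options on a file-based Write Component.
read_record_chunk_size: 100000    # documented default: 100,000 rows
target_file_size: 104857600       # documented default: 100 * 2**20 bytes
target_records_per_file: 500000   # hypothetical cap; only applies when writing files in partitions
```

Leaving `target_records_per_file` unset means the part-file size is governed by `target_file_size` alone.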
BigQueryWriteComponent
Component that writes data to a BigQuery table.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, IncrementalWriteStrategyWithSchemaChange, PartitionedWriteStrategyWithSchemaChange | No | Resource for write strategy. |
pre_sql | | Any of: string, array[string] | No | SQL statements to execute before the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
post_sql | | Any of: string, array[string] | No | SQL statements to execute after the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
bigquery | | | Yes | |
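The two accepted shapes for `pre_sql` and `post_sql` (a single string or a list of strings) can be sketched as below. The table names and statements are hypothetical and only illustrate the shapes; they are not part of the product.

```yaml
# Sketch: pre_sql as a single statement, post_sql as a list of statements.
pre_sql: "INSERT INTO etl_audit (event) VALUES ('write_started')"   # hypothetical table
post_sql:
  - "INSERT INTO etl_audit (event) VALUES ('write_finished')"       # hypothetical table
  - "DELETE FROM etl_scratch WHERE true"                            # hypothetical table
```

The same two properties appear on the other table-based Write Components on this page (MySQL, Oracle, Postgres, Snowflake), so the same shapes should apply there, with the SQL adjusted to the target dialect.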
GcsWriteComponent
Component for writing files to a GCS bucket.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, PartitionedWriteStrategy | No | Options to use when writing data to file-based Components. When the snapshot strategy is used without a name, the Flow run id is used as the snapshot name by default. |
read_record_chunk_size | 100000 | integer | No | Number of rows to read from the source. If not set, defaults to 100,000 rows. |
target_file_size | 104857600 | integer | No | Target size in bytes of the file to write. If not set, defaults to 100 * (2**20) bytes (100 MiB). |
target_records_per_file | | integer | No | Maximum number of rows to write to each part file. If not set, only the target file size determines the number of rows written to each part file. This setting applies only when writing files in partitions: for the snapshot write strategy, it is used only if the path ends with a '/'; for the partitioned write strategy, it is always applied. |
gcs | | | Yes | |
MySQLWriteComponent
Component that writes data to a MySQL table.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, IncrementalWriteStrategyWithSchemaChange, PartitionedWriteStrategyWithSchemaChange | No | Resource for write strategy. |
pre_sql | | Any of: string, array[string] | No | SQL statements to execute before the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
post_sql | | Any of: string, array[string] | No | SQL statements to execute after the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
mysql | | | Yes | |
OracleWriteComponent
Component that writes data to an Oracle table.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, IncrementalWriteStrategyWithSchemaChange, PartitionedWriteStrategyWithSchemaChange | No | Resource for write strategy. |
pre_sql | | Any of: string, array[string] | No | SQL statements to execute before the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
post_sql | | Any of: string, array[string] | No | SQL statements to execute after the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
oracle | | | Yes | |
PostgresWriteComponent
Component that writes data to a Postgres table.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, IncrementalWriteStrategyWithSchemaChange, PartitionedWriteStrategyWithSchemaChange | No | Resource for write strategy. |
pre_sql | | Any of: string, array[string] | No | SQL statements to execute before the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
post_sql | | Any of: string, array[string] | No | SQL statements to execute after the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
postgres | | | Yes | |
S3WriteComponent
Component for writing files to an S3 bucket.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, PartitionedWriteStrategy | No | Options to use when writing data to file-based Components. When the snapshot strategy is used without a name, the Flow run id is used as the snapshot name by default. |
read_record_chunk_size | 100000 | integer | No | Number of rows to read from the source. If not set, defaults to 100,000 rows. |
target_file_size | 104857600 | integer | No | Target size in bytes of the file to write. If not set, defaults to 100 * (2**20) bytes (100 MiB). |
target_records_per_file | | integer | No | Maximum number of rows to write to each part file. If not set, only the target file size determines the number of rows written to each part file. This setting applies only when writing files in partitions: for the snapshot write strategy, it is used only if the path ends with a '/'; for the partitioned write strategy, it is always applied. |
s3 | | | Yes | |
SFTPWriteComponent
Component for writing files to an SFTP server.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, PartitionedWriteStrategy | No | Options to use when writing data to file-based Components. When the snapshot strategy is used without a name, the Flow run id is used as the snapshot name by default. |
read_record_chunk_size | 100000 | integer | No | Number of rows to read from the source. If not set, defaults to 100,000 rows. |
target_file_size | 104857600 | integer | No | Target size in bytes of the file to write. If not set, defaults to 100 * (2**20) bytes (100 MiB). |
target_records_per_file | | integer | No | Maximum number of rows to write to each part file. If not set, only the target file size determines the number of rows written to each part file. This setting applies only when writing files in partitions: for the snapshot write strategy, it is used only if the path ends with a '/'; for the partitioned write strategy, it is always applied. |
sftp | | | Yes | |
SnowflakeWriteComponent
Component that writes data to a Snowflake table.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, IncrementalWriteStrategyWithSchemaChange, PartitionedWriteStrategyWithSchemaChange | No | Resource for write strategy. |
pre_sql | | Any of: string, array[string] | No | SQL statements to execute before the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
post_sql | | Any of: string, array[string] | No | SQL statements to execute after the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
snowflake | | | Yes | |
FullWriteStrategyOptions
Resource options for full writes, including mode selection.
Property | Default | Type | Required | Description |
---|---|---|---|---|
mode | | | Yes | Strategy for handling the output table during a full write operation. 'drop_and_recreate' drops the output table and recreates it. |
FullWriteModeEnum
No properties defined.