Incremental Write Strategy with Schema Change
Incremental write strategy that defines how new data is merged with existing data, along with the policy for handling schema changes.
IncrementalWriteStrategyWithSchemaChange
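IncrementalWriteStrategyWithSchemaChange is defined beneath ancestor nodes in the YAML structure; it is one of the types accepted by the strategy property of the write components described under Property Details.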
Below are the properties for IncrementalWriteStrategyWithSchemaChange. Each property links to the specific details section further down in this page.
Property | Default | Type | Required | Description |
---|---|---|---|---|
incremental | | Any of: append, MergeStrategy | Yes | Options to use when incrementally writing data to a Write component. |
incremental_column | | string | Yes | Name of the column to use for tracking incremental updates to the data. |
on_schema_change | | string ("ignore", "fail", "append_new_columns", "sync_all_columns") | No | Policy to apply when schema changes are detected. Defaults to 'fail' if not provided. |
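As a rough illustration of how these three properties fit together, the fragment below sketches a strategy block in append mode. The nesting under strategy follows the strategy property documented for the write components on this page. The column name updated_at is a hypothetical placeholder, and the exact layout should be verified against a working project.

```yaml
# Sketch only: the column name is hypothetical, not taken from this reference.
strategy:
  incremental: append                    # alternatively, a MergeStrategy mapping
  incremental_column: updated_at         # hypothetical column that tracks incremental updates
  on_schema_change: append_new_columns   # one of: ignore, fail, append_new_columns, sync_all_columns
```

If on_schema_change is left out, the table above states that the policy defaults to fail.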
Property Details
Component
A Component is a fundamental building block of a data Flow. Supported Component types include: Read, Transform, Task, Test, and more.
Property | Default | Type | Required | Description |
---|---|---|---|---|
component | | One of: CustomPythonReadComponent, ApplicationComponent, AliasedTableComponent, ExternalTableComponent, FivetranComponent | Yes | Component configuration options. |
WriteComponent
Property | Default | Type | Required | Description |
---|---|---|---|---|
skip | | boolean | No | Boolean flag indicating whether to skip processing for the Component or not. |
retry_strategy | | | No | Retry strategy configuration options for the Component if any exceptions are encountered. |
description | | string | No | Brief description of what the model does. |
metadata | | | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
name | | string | Yes | The name of the model |
flow_name | | string | No | Name of the Flow that the Component belongs to. |
write | | One of: BigQueryWriteComponent, SnowflakeWriteComponent, S3WriteComponent, SFTPWriteComponent, GcsWriteComponent, AbfsWriteComponent, MySQLWriteComponent, OracleWriteComponent, PostgresWriteComponent | Yes | |
BigQueryWriteComponent
Component that writes data to a BigQuery table.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, IncrementalWriteStrategyWithSchemaChange, PartitionedWriteStrategyWithSchemaChange | No | Resource for write strategy. |
pre_sql | | Any of: string, array[string] | No | SQL statements to execute before the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
post_sql | | Any of: string, array[string] | No | SQL statements to execute after the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
bigquery | | | Yes | |
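The pre_sql and post_sql properties accept either a single string or a list of strings. The fragment below is a minimal sketch of both forms. The SQL statements and table names are hypothetical, and required keys such as connection, input, and bigquery are left out because their shapes are not described in this section.

```yaml
# Fragment only: required keys (connection, input, bigquery) are omitted.
# The SQL statements and table names are hypothetical.
pre_sql: "DELETE FROM staging_orders WHERE load_date < '2024-01-01'"   # single statement as a string
post_sql:                                                              # multiple statements as a list
  - "UPDATE load_audit SET finished = TRUE WHERE job = 'orders'"
  - "DELETE FROM staging_orders"
```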
MySQLWriteComponent
Component that writes data to a MySQL table.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, IncrementalWriteStrategyWithSchemaChange, PartitionedWriteStrategyWithSchemaChange | No | Resource for write strategy. |
pre_sql | | Any of: string, array[string] | No | SQL statements to execute before the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
post_sql | | Any of: string, array[string] | No | SQL statements to execute after the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
mysql | | | Yes | |
OracleWriteComponent
Component that writes data to an Oracle table.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, IncrementalWriteStrategyWithSchemaChange, PartitionedWriteStrategyWithSchemaChange | No | Resource for write strategy. |
pre_sql | | Any of: string, array[string] | No | SQL statements to execute before the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
post_sql | | Any of: string, array[string] | No | SQL statements to execute after the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
oracle | | | Yes | |
PostgresWriteComponent
Component that writes data to a Postgres table.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, IncrementalWriteStrategyWithSchemaChange, PartitionedWriteStrategyWithSchemaChange | No | Resource for write strategy. |
pre_sql | | Any of: string, array[string] | No | SQL statements to execute before the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
post_sql | | Any of: string, array[string] | No | SQL statements to execute after the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
postgres | | | Yes | |
SnowflakeWriteComponent
Component that writes data to a Snowflake table.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
connection | | string | Yes | Name of the Connection to use for writing data. |
input | | | Yes | Input component name. |
normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
strategy | full: mode: drop_and_recreate | Any of: snapshot, FullWriteStrategy, IncrementalWriteStrategyWithSchemaChange, PartitionedWriteStrategyWithSchemaChange | No | Resource for write strategy. |
pre_sql | | Any of: string, array[string] | No | SQL statements to execute before the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
post_sql | | Any of: string, array[string] | No | SQL statements to execute after the main write operation. Can be a single SQL statement string or multiple statements as a list of strings. |
snowflake | | | Yes | |
MergeStrategy
Strategy that involves merging new data with existing data by updating existing records that match the unique key.
Property | Default | Type | Required | Description |
---|---|---|---|---|
merge | | | No | Options for merge strategy. |
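To make the merge variant concrete, here is a minimal sketch of a strategy block that uses merge instead of append. It assumes that the merge mapping takes the KeyOptions fields documented on this page, which this section does not state explicitly. The column names order_id and updated_at are hypothetical.

```yaml
# Sketch only: assumes merge accepts the KeyOptions fields; column names are hypothetical.
strategy:
  incremental:
    merge:
      unique_key: order_id         # hypothetical unique identifier used to match existing records
  incremental_column: updated_at   # hypothetical column that tracks incremental updates
  on_schema_change: fail           # the documented default, written out for clarity
```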
KeyOptions
Column options needed for merge and SCD Type 2 strategies, such as unique key and deletion column name.
Property | Default | Type | Required | Description |
---|---|---|---|---|
unique_key | | string | Yes | Column or comma-separated set of columns used as a unique identifier for records, aiding in the merge process. |
deletion_column | | string | No | Column name used in the upstream source for soft-deleting records. Used when replicating data from a source that supports soft-deletion. If provided, the merge strategy will be able to detect deletions and mark them as deleted in the destination. If not provided, the merge strategy will not be able to detect deletions. |
merge_update_columns | | Any of: string, array[string] | No | List of columns to include when updating values in merge. These columns are mutually exclusive with respect to the columns in merge_exclude_columns. |
merge_exclude_columns | | Any of: string, array[string] | No | List of columns to exclude when updating values in merge. These columns are mutually exclusive with respect to the columns in merge_update_columns. |
incremental_predicates | | Any of: string, array[string] | No | List of conditions to filter incremental data. |
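The fragment below sketches how several of these key options might be combined, again assuming they sit under the merge mapping. All column names are hypothetical. Only merge_update_columns is set, since the table describes it as mutually exclusive with merge_exclude_columns. The incremental_predicates option is left out because its condition syntax is not described here.

```yaml
# Sketch only: assumes these options sit under merge; all column names are hypothetical.
strategy:
  incremental:
    merge:
      unique_key: customer_id, order_id   # comma-separated set of columns
      deletion_column: is_deleted         # soft-delete flag from the upstream source
      merge_update_columns:               # do not combine with merge_exclude_columns
        - status
        - amount
  incremental_column: updated_at
```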