Merge Strategy
A strategy that merges new data into existing data by updating the existing records that match the unique key.
MergeStrategy
MergeStrategy is defined beneath the following ancestor nodes in the YAML structure:
- Component
- CustomPythonReadComponent
- CustomPythonReadOptions
- ReadComponent
- BigQueryReadComponent
- DatabricksReadComponent
- MSSQLReadComponent
- MySQLReadComponent
- OracleReadComponent
- PostgresReadComponent
- SnowflakeReadComponent
- IncrementalReadStrategy
- TransformComponent
- PySparkTransform
- PythonTransform
- SnowparkTransform
- SqlTransform
- IncrementalStrategy
- WriteComponent
- BigQueryWriteComponent
- MySQLWriteComponent
- OracleWriteComponent
- PostgresWriteComponent
- SnowflakeWriteComponent
- IncrementalWriteStrategyWithSchemaChange
Below are the properties for MergeStrategy. Each property links to its details section further down on this page.
| Property | Default | Type | Required | Description |
|---|---|---|---|---|
| merge | | | No | Options for merge strategy. |
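As a rough illustration of where `merge` sits, the fragment below nests it under an incremental strategy. This is a sketch under assumptions: the `incremental` key and the `unique_key` option are not listed in this section and are inferred from the ancestor list and the description above, so check the IncrementalStrategy and merge option details before relying on them.

```yaml
# Sketch only. `incremental` and `unique_key` are assumed names,
# not confirmed by this section.
strategy:
  incremental:
    merge:
      unique_key: id  # hypothetical column that identifies a record
```

With a configuration along these lines, an incoming record whose key matches an existing record would update that record, as the strategy description states.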
Property Details
Component
A Component is a fundamental building block of a data Flow. Supported Component types include: Read, Transform, Task, Test, and more.
| Property | Default | Type | Required | Description |
|---|---|---|---|---|
| component | | One of: CustomPythonReadComponent, ApplicationComponent, AliasedTableComponent, ExternalTableComponent, DbtNodeComponent | Yes | Component configuration options. |
CustomPythonReadComponent
Component that reads data using user-defined custom Python code.
| Property | Default | Type | Required | Description |
|---|---|---|---|---|
| data_plane | | One of: SnowflakeDataPlane, BigQueryDataPlane, DuckdbDataPlane, DatabricksDataPlane | No | Data Plane-specific configuration options for Components. |
| skip | | boolean | No | Boolean flag indicating whether to skip processing for the Component or not. |
| retry_strategy | | | No | Retry strategy configuration options for the Component if any exceptions are encountered. |
| data_maintenance | | | No | The data maintenance configuration options for the Component. |
| description | | string | No | Brief description of what the model does. |
| metadata | | | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| name | | string | Yes | The name of the model |
| flow_name | | string | No | Name of the Flow that the Component belongs to. |
| tests | | | No | Defines tests to run on this Component's data. |
| custom_python_read | | | Yes | |
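The properties above might be combined as in the following skeleton. It is a sketch, not a validated configuration: every value is a hypothetical placeholder, and the required `python` option inside `custom_python_read` is left out because its shape is not shown in this section.

```yaml
# Sketch only: all values are hypothetical placeholders.
component:
  name: orders_ingest          # required
  description: Ingests orders using custom Python code.
  flow_name: sales             # optional; name of the owning Flow
  skip: false                  # optional; true would skip processing
  custom_python_read:          # required; see CustomPythonReadOptions
    strategy: full             # the documented default
    # python: (required, omitted here; its structure is not shown in this section)
```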
CustomPythonReadOptions
Configuration options for the Custom Python Read Component.
| Property | Default | Type | Required | Description |
|---|---|---|---|---|
| dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
| event_time | | string | No | Timestamp column in the Component output used to represent Event time. |
| strategy | full | Any of: full, IncrementalStrategy, PartitionedStrategy | No | Ingest strategy. |
| python | | Any of: | Yes | Python code to execute for ingesting data. |
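A minimal sketch of these options follows. The dependency name and column name are invented for illustration, and treating `dependencies` entries as Component names is an assumption, since the element type is not specified above. The required `python` option is again omitted because its accepted forms are not listed here.

```yaml
# Sketch only: values are hypothetical, and `python` (required) is omitted.
custom_python_read:
  dependencies:
    - upstream_component     # assumed to be the name of another Component
  event_time: created_at     # hypothetical timestamp column in the output
  strategy: full             # default; IncrementalStrategy or PartitionedStrategy are the alternatives
```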
ReadComponent
Component that reads data from a system.
| Property | Default | Type | Required | Description |
|---|---|---|---|---|
| data_plane | | One of: SnowflakeDataPlane, BigQueryDataPlane, DuckdbDataPlane, DatabricksDataPlane | No | Data Plane-specific configuration options for Components. |
| skip | | boolean | No | Boolean flag indicating whether to skip processing for the Component or not. |
| retry_strategy | | | No | Retry strategy configuration options for the Component if any exceptions are encountered. |
| data_maintenance | | | No | The data maintenance configuration options for the Component. |
| description | | string | No | Brief description of what the model does. |
| metadata | | | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| name | | string | Yes | The name of the model |
| flow_name | | string | No | Name of the Flow that the Component belongs to. |
| tests | | | No | Defines tests to run on this Component's data. |
| read | | One of: GenericFileReadComponent, LocalFileReadComponent, SFTPReadComponent, S3ReadComponent, GcsReadComponent, AbfsReadComponent, HttpReadComponent, MSSQLReadComponent, MySQLReadComponent, OracleReadComponent, PostgresReadComponent, SnowflakeReadComponent, BigQueryReadComponent, DatabricksReadComponent | Yes | Read component that reads data from a system. |