Partitioned Write Strategy

Container for specifying the partitioned write strategy.

PartitionedWriteStrategy

PartitionedWriteStrategy is defined beneath the following ancestor nodes in the YAML structure:

Component
WriteComponent
AbfsWriteComponent
GcsWriteComponent
S3WriteComponent
SFTPWriteComponent

Below are the properties of PartitionedWriteStrategy. Each property links to its details section further down on this page.

Property	Default	Type	Required	Description
partitioned			Yes	Options to use when writing partitioned data to a Write Component.
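To make the nesting concrete, the ancestor list above suggests the following shape for a file-based write Component that opts into the partitioned strategy. This is a minimal sketch, not a verified configuration. The Component and Connection names are hypothetical, and S3 is picked arbitrarily from the four file-based targets. The required input and s3 blocks are left as comments because their structure is not described on this page, and partitioned is given an empty placeholder because its options are not listed here.

```yaml
# Sketch only: names and values are placeholders, not taken from a real project.
component:
  name: orders_export              # hypothetical Component name
  write:
    connection: my_s3_connection   # hypothetical Connection name
    # input: required; its structure is not described on this page
    strategy:
      partitioned: {}              # placeholder; the available options are not listed here
    # s3: required; its structure is not described on this page
```

The same layout would apply to the ABFS, GCS, and SFTP write components, with abfs, gcs, or sftp in place of s3.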

Property Details

Component

A Component is a fundamental building block of a data Flow. Supported Component types include: Read, Transform, Task, Test, and more.

Property	Default	Type	Required	Description
component		One of: CustomPythonReadComponent ApplicationComponent AliasedTableComponent ExternalTableComponent	Yes	Component configuration options.

WriteComponent

Property	Type	Required	Description
skip	boolean	No	Boolean flag indicating whether to skip processing for the Component.
retry_strategy		No	Retry strategy configuration options for the Component if any exceptions are encountered.
data_maintenance		No	The data maintenance configuration options for the Component.
description	string	No	Brief description of what the Component does.
metadata		No	Meta information about a resource. In most cases it does not affect system behavior, but it may be helpful when analyzing project resources.
name	string	Yes	The name of the Component.
flow_name	string	No	Name of the Flow that the Component belongs to.
write	One of: BigQueryWriteComponent SnowflakeWriteComponent S3WriteComponent SFTPWriteComponent GcsWriteComponent AbfsWriteComponent MySQLWriteComponent OracleWriteComponent PostgresWriteComponent	Yes

AbfsWriteComponent

Component for writing files to an ABFS container.

Property	Default	Type	Required	Description
dependencies		array[None]	No	List of dependencies that must complete before this Component runs.
connection		string	Yes	Name of the Connection to use for writing data.
input			Yes	Input component name.
normalize		boolean	No	Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing.
preserve_case		boolean	No	Boolean flag indicating if the case of the column names should be preserved when writing.
uppercase		boolean	No	Boolean flag indicating if the column names should be transformed to uppercase when writing.
strategy	full: mode: drop_and_recreate	Any of: snapshot FullWriteStrategy PartitionedWriteStrategy	No	Options to use when writing data to file-based Components. When using the snapshot strategy without a name, the Flow run ID is used as the snapshot name by default.
read_record_chunk_size	100000	integer	No	Number of rows to read from the source in each chunk. If not set, defaults to 100,000 rows.
target_file_size	104857600	integer	No	Target size, in bytes, of each file to write. If not set, defaults to 100 * (2**20) bytes (100 MiB).
target_records_per_file		integer	No	Maximum number of rows to write to each part file. If not set, only the target file size determines the number of rows written to each part file. This setting applies only when writing files in partitions: for the snapshot write strategy, it is used only if the path ends with a '/'; for the partitioned write strategy, it is always applied.
abfs			Yes

GcsWriteComponent

Component for writing files to a GCS bucket.

Property	Default	Type	Required	Description
dependencies		array[None]	No	List of dependencies that must complete before this Component runs.
connection		string	Yes	Name of the Connection to use for writing data.
input			Yes	Input component name.
normalize		boolean	No	Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing.
preserve_case		boolean	No	Boolean flag indicating if the case of the column names should be preserved when writing.
uppercase		boolean	No	Boolean flag indicating if the column names should be transformed to uppercase when writing.
strategy	full: mode: drop_and_recreate	Any of: snapshot FullWriteStrategy PartitionedWriteStrategy	No	Options to use when writing data to file-based Components. When using the snapshot strategy without a name, the Flow run ID is used as the snapshot name by default.
read_record_chunk_size	100000	integer	No	Number of rows to read from the source in each chunk. If not set, defaults to 100,000 rows.
target_file_size	104857600	integer	No	Target size, in bytes, of each file to write. If not set, defaults to 100 * (2**20) bytes (100 MiB).
target_records_per_file		integer	No	Maximum number of rows to write to each part file. If not set, only the target file size determines the number of rows written to each part file. This setting applies only when writing files in partitions: for the snapshot write strategy, it is used only if the path ends with a '/'; for the partitioned write strategy, it is always applied.
gcs			Yes

S3WriteComponent

Component for writing files to an S3 bucket.

Property	Default	Type	Required	Description
dependencies		array[None]	No	List of dependencies that must complete before this Component runs.
connection		string	Yes	Name of the Connection to use for writing data.
input			Yes	Input component name.
normalize		boolean	No	Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing.
preserve_case		boolean	No	Boolean flag indicating if the case of the column names should be preserved when writing.
uppercase		boolean	No	Boolean flag indicating if the column names should be transformed to uppercase when writing.
strategy	full: mode: drop_and_recreate	Any of: snapshot FullWriteStrategy PartitionedWriteStrategy	No	Options to use when writing data to file-based Components. When using the snapshot strategy without a name, the Flow run ID is used as the snapshot name by default.
read_record_chunk_size	100000	integer	No	Number of rows to read from the source in each chunk. If not set, defaults to 100,000 rows.
target_file_size	104857600	integer	No	Target size, in bytes, of each file to write. If not set, defaults to 100 * (2**20) bytes (100 MiB).
target_records_per_file		integer	No	Maximum number of rows to write to each part file. If not set, only the target file size determines the number of rows written to each part file. This setting applies only when writing files in partitions: for the snapshot write strategy, it is used only if the path ends with a '/'; for the partitioned write strategy, it is always applied.
s3			Yes

SFTPWriteComponent

Component for writing files to an SFTP server.

Property	Default	Type	Required	Description
dependencies		array[None]	No	List of dependencies that must complete before this Component runs.
connection		string	Yes	Name of the Connection to use for writing data.
input			Yes	Input component name.
normalize		boolean	No	Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing.
preserve_case		boolean	No	Boolean flag indicating if the case of the column names should be preserved when writing.
uppercase		boolean	No	Boolean flag indicating if the column names should be transformed to uppercase when writing.
strategy	full: mode: drop_and_recreate	Any of: snapshot FullWriteStrategy PartitionedWriteStrategy	No	Options to use when writing data to file-based Components. When using the snapshot strategy without a name, the Flow run ID is used as the snapshot name by default.
read_record_chunk_size	100000	integer	No	Number of rows to read from the source in each chunk. If not set, defaults to 100,000 rows.
target_file_size	104857600	integer	No	Target size, in bytes, of each file to write. If not set, defaults to 100 * (2**20) bytes (100 MiB).
target_records_per_file		integer	No	Maximum number of rows to write to each part file. If not set, only the target file size determines the number of rows written to each part file. This setting applies only when writing files in partitions: for the snapshot write strategy, it is used only if the path ends with a '/'; for the partitioned write strategy, it is always applied.
sftp			Yes
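
The three sizing properties shared by the ABFS, GCS, S3, and SFTP write components can be set together. The fragment below is a sketch that spells out the documented defaults for read_record_chunk_size and target_file_size. The target_records_per_file value is a hypothetical example, since no default is documented for it. The properties are assumed to sit alongside connection and strategy in the write block, as the tables above indicate.

```yaml
# Fragment of a file-based write block (abfs, gcs, s3, or sftp); sketch only.
read_record_chunk_size: 100000    # documented default
target_file_size: 104857600       # documented default: 100 * (2**20) bytes
target_records_per_file: 250000   # hypothetical value; no default is documented
```

Because target_records_per_file is described as a maximum, omitting it leaves target_file_size as the only limit on how many rows go into each part file.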