Partitioned Write Strategy

Container for specifying the partitioned write strategy.

PartitionedWriteStrategy

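PartitionedWriteStrategy is defined beneath the following ancestor nodes in the YAML structure: Component, WriteComponent, and the file-based Write Components AbfsWriteComponent, GcsWriteComponent, S3WriteComponent, and SFTPWriteComponent.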

Below are the properties for the PartitionedWriteStrategy. Each property links to the specific details section further down in this page.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| partitioned | | | Yes | Options to use when writing partitioned data to a Write Component. |

Property Details

Component

A Component is a fundamental building block of a data Flow. Supported Component types include: Read, Transform, Task, Test, and more.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| component | | One of: CustomPythonReadComponent, ApplicationComponent, AliasedTableComponent, ExternalTableComponent, FivetranComponent | Yes | Component configuration options. |

WriteComponent

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| skip | | boolean | No | Boolean flag indicating whether to skip processing for the Component or not. |
| retry_strategy | | | No | Retry strategy configuration options for the Component if any exceptions are encountered. |
| description | | string | No | Brief description of what the model does. |
| metadata | | | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| name | | string | Yes | The name of the model. |
| flow_name | | string | No | Name of the Flow that the Component belongs to. |
| write | | One of: BigQueryWriteComponent, SnowflakeWriteComponent, S3WriteComponent, SFTPWriteComponent, GcsWriteComponent, AbfsWriteComponent, MySQLWriteComponent, OracleWriteComponent, PostgresWriteComponent | Yes | |
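The nesting implied by the Component and WriteComponent tables can be sketched roughly as follows. This is an illustrative fragment only: the Component name, description, and Connection name are hypothetical, and the required `input` and storage-specific keys are left as comments because their shapes are not described on this page.

```yaml
# Nesting sketch only; names are hypothetical and some required keys are elided.
component:
  name: orders_export                   # hypothetical Component name
  description: Writes daily orders to object storage.
  skip: false
  write:
    connection: my_storage_connection   # hypothetical Connection name
    # input: ...   (required; shape not described on this page)
    # s3: ...      (required for an S3WriteComponent; shape not described on this page)
    strategy:
      partitioned:
        mode: append
```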

AbfsWriteComponent

Component for writing files to an ABFS container.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
| connection | | string | Yes | Name of the Connection to use for writing data. |
| input | | | Yes | Input component name. |
| normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
| preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
| uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
| strategy | full: { mode: drop_and_recreate } | Any of: snapshot, FullWriteStrategy, PartitionedWriteStrategy | No | Options to use when writing data to file-based Components. When using the snapshot strategy without a name, the Flow run id is used by default as the snapshot name. |
| read_record_chunk_size | 100000 | integer | No | Number of rows to read from the source. If not set, defaults to 100,000 rows. |
| target_file_size | 104857600 | integer | No | Target size in bytes of the file to write. If not set, defaults to 100 * (2**20) bytes (100MB). |
| target_records_per_file | | integer | No | Max number of rows to write to each part file. If not set, will only use the target file size to determine the number of rows to write to each part file. This setting only applies when writing files in partitions. For the snapshot write strategy, it is only used if the path ends with a '/'. For the partitioned write strategy, this setting is always applied. |
| abfs | | | Yes | |
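As a hedged illustration of how the chunking and file-sizing options above might be combined with a partitioned strategy, the fragment below uses only properties listed in this table. All values are hypothetical, and the required `connection`, `input`, and `abfs` settings are omitted because their shapes are not described on this page. The same keys appear on the GCS, S3, and SFTP Write Components.

```yaml
# Illustrative fragment of a file-based Write Component; values are hypothetical.
# These keys sit at the same level as connection and input, which are omitted here.
strategy:
  partitioned:
    mode: insert_overwrite
read_record_chunk_size: 50000       # read 50,000 rows at a time instead of the default 100,000
target_file_size: 209715200         # 200 * (2**20) bytes, double the 104857600-byte default
target_records_per_file: 1000000    # cap each part file at 1,000,000 rows
```

With the partitioned strategy, the table states that `target_records_per_file` is always applied, so the row cap in this sketch would take effect alongside the target file size.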

GcsWriteComponent

Component for writing files to a GCS bucket.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
| connection | | string | Yes | Name of the Connection to use for writing data. |
| input | | | Yes | Input component name. |
| normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
| preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
| uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
| strategy | full: { mode: drop_and_recreate } | Any of: snapshot, FullWriteStrategy, PartitionedWriteStrategy | No | Options to use when writing data to file-based Components. When using the snapshot strategy without a name, the Flow run id is used by default as the snapshot name. |
| read_record_chunk_size | 100000 | integer | No | Number of rows to read from the source. If not set, defaults to 100,000 rows. |
| target_file_size | 104857600 | integer | No | Target size in bytes of the file to write. If not set, defaults to 100 * (2**20) bytes (100MB). |
| target_records_per_file | | integer | No | Max number of rows to write to each part file. If not set, will only use the target file size to determine the number of rows to write to each part file. This setting only applies when writing files in partitions. For the snapshot write strategy, it is only used if the path ends with a '/'. For the partitioned write strategy, this setting is always applied. |
| gcs | | | Yes | |

S3WriteComponent

Component for writing files to an S3 bucket.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
| connection | | string | Yes | Name of the Connection to use for writing data. |
| input | | | Yes | Input component name. |
| normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
| preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
| uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
| strategy | full: { mode: drop_and_recreate } | Any of: snapshot, FullWriteStrategy, PartitionedWriteStrategy | No | Options to use when writing data to file-based Components. When using the snapshot strategy without a name, the Flow run id is used by default as the snapshot name. |
| read_record_chunk_size | 100000 | integer | No | Number of rows to read from the source. If not set, defaults to 100,000 rows. |
| target_file_size | 104857600 | integer | No | Target size in bytes of the file to write. If not set, defaults to 100 * (2**20) bytes (100MB). |
| target_records_per_file | | integer | No | Max number of rows to write to each part file. If not set, will only use the target file size to determine the number of rows to write to each part file. This setting only applies when writing files in partitions. For the snapshot write strategy, it is only used if the path ends with a '/'. For the partitioned write strategy, this setting is always applied. |
| s3 | | | Yes | |

SFTPWriteComponent

Component for writing files to an SFTP server.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
| connection | | string | Yes | Name of the Connection to use for writing data. |
| input | | | Yes | Input component name. |
| normalize | | boolean | No | Boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
| preserve_case | | boolean | No | Boolean flag indicating if the case of the column names should be preserved when writing. |
| uppercase | | boolean | No | Boolean flag indicating if the column names should be transformed to uppercase when writing. |
| strategy | full: { mode: drop_and_recreate } | Any of: snapshot, FullWriteStrategy, PartitionedWriteStrategy | No | Options to use when writing data to file-based Components. When using the snapshot strategy without a name, the Flow run id is used by default as the snapshot name. |
| read_record_chunk_size | 100000 | integer | No | Number of rows to read from the source. If not set, defaults to 100,000 rows. |
| target_file_size | 104857600 | integer | No | Target size in bytes of the file to write. If not set, defaults to 100 * (2**20) bytes (100MB). |
| target_records_per_file | | integer | No | Max number of rows to write to each part file. If not set, will only use the target file size to determine the number of rows to write to each part file. This setting only applies when writing files in partitions. For the snapshot write strategy, it is only used if the path ends with a '/'. For the partitioned write strategy, this setting is always applied. |
| sftp | | | Yes | |

PartitionedWriteStrategyOptions

Resource options for partitioned writes: the mode to use when writing partitions and, optionally, the column used for partitioning.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| mode | | | Yes | Specifies the mode to use when writing data in partitions: 'append' to append new or modified partitions, 'insert_overwrite' to insert new partitions and replace/overwrite modified partitions, and 'sync' to encompass both 'insert_overwrite' functionality and to delete partitions when deleted at the source. |
| partition_col | | string | No | Column name used for partitioning. Uses the internal Ascend partition identifier by default. |
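A minimal sketch of these options, assuming they are set under the `strategy` property of a file-based Write Component as the tables above indicate. The column name `event_date` is hypothetical.

```yaml
# Partitioned write strategy fragment; the column name is hypothetical.
strategy:
  partitioned:
    mode: sync                  # one of: append, insert_overwrite, sync
    partition_col: event_date   # omit to use the internal Ascend partition identifier
```

Per the mode description, `sync` would also delete partitions that were deleted at the source, while `insert_overwrite` only inserts new partitions and replaces modified ones, and `append` only appends new or modified partitions.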

PartitionedWriteModeEnum

No properties defined.