Version: 3.0.0

S3 Write Component

Component for writing files to an S3 bucket.

Examples

component:
  write:
    connection: myS3Connection
    input:
      flow: my_flow
      name: my_input_component
    s3:
      path: my/specific/path/

S3WriteComponent

S3WriteComponent is defined beneath the following ancestor nodes in the YAML structure: Component, then WriteComponent.

Below are the properties for the S3WriteComponent. Each property links to the specific details section further down in this page.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| connection | | string | Yes | The name of the connection to use for writing data. |
| input | | InputComponent | Yes | Input component name. |
| normalize | | boolean | No | A boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
| preserve_case | | boolean | No | A boolean flag indicating if the case of the column names should be preserved when writing. |
| uppercase | | boolean | No | A boolean flag indicating if the column names should be transformed to uppercase when writing. |
| strategy | partitioned: {mode: sync, partition_col: null} | Any of: PartitionedWriteStrategy, FullWriteStrategy | No | Options to use when writing data to file-based components. |
| s3 | | FileWriteOptionsBase | Yes | |
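
As an illustration of the optional column-name flags listed above, the following sketch extends the example from the top of this page with uppercase set. The connection, flow, and component names are the placeholder values from that example. How normalize, preserve_case, and uppercase interact when combined is not described on this page, so only one flag is shown.

```yaml
component:
  write:
    connection: myS3Connection
    input:
      flow: my_flow
      name: my_input_component
    # Optional flag: transform column names to uppercase when writing.
    uppercase: true
    s3:
      path: my/specific/path/
```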

Property Details

Component

A component is a fundamental building block of a data flow. Types of components that are supported include: read, transform, task, test, and more.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| component | | One of: ReadComponent, TransformComponent, TaskComponent, SingularTestComponent, CustomPythonReadComponent, WriteComponent, CompoundComponent, AliasedTableComponent, ExternalTableComponent | Yes | Configuration options for the component. |

WriteComponent

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| name | | string | No | The name of the model. |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| flow_name | | string | No | The name of the flow that the component belongs to. |
| skip | | boolean | No | A boolean flag indicating whether to skip processing for the component or not. |
| skip_for_time_series_runs | | boolean | No | A boolean flag indicating whether to skip processing for this component in time-series runs. |
| write | | One of: BigQueryWriteComponent, SnowflakeWriteComponent, S3WriteComponent, MySQLWriteComponent, OracleWriteComponent | Yes | |

FileWriteOptionsBase

Resource for formatting files and writing them to a specified path.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| path | | string | Yes | Path to the directory to write to. Path is relative to the connection's root directory, and cannot be an absolute path or traverse outside the root directory. |
| partition_template | | string | No | A template for partition names that contains variables in curly braces. Every file for a partition will be written to a subdirectory with a name derived from the template interpolated with the partition values. |
| formatter | auto | One of: string ("auto") or AutoFormatter; string ("parquet") or ParquetFormatter; string ("csv") or CsvFormatter; string ("json") or JsonFormatter | No | Formatter Resource for writing files. |
| manifest | | ManifestOptions | No | Options for writing a manifest file. If not set, no manifest file will be written. |
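
A minimal sketch of an s3 block that uses these options might look like the following. It picks the string form of the formatter ("parquet") and asks for a manifest file. The manifest file name is a hypothetical value chosen for illustration, and the connection, flow, and component names are the placeholders from the example at the top of this page. partition_template is left out because the variables it accepts are not listed here.

```yaml
component:
  write:
    connection: myS3Connection
    input:
      flow: my_flow
      name: my_input_component
    s3:
      # Relative to the connection's root directory; must not be absolute.
      path: my/specific/path/
      # String shorthand for the formatter; the default is "auto".
      formatter: parquet
      manifest:
        # Hypothetical manifest file name, for illustration only.
        name: manifest.json
```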

AutoFormatter

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| auto | | NoFormatterOptions | Yes | |

CsvFormatter

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| csv | | NoFormatterOptions | Yes | |

JsonFormatter

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| json | | NoFormatterOptions | No | |

ManifestOptions

Options for writing a manifest file.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| name | | string | Yes | Name of the manifest file. |

ParquetFormatter

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| parquet | | NoFormatterOptions | Yes | |

NoFormatterOptions

No custom formatting options exist for this formatter.

No properties defined.

PartitionedWriteStrategy

Container for specifying the partitioned write strategy.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| partitioned | | PartitionedWriteStrategyOptions | Yes | Options to use when writing data in partitions to a Write component. |

FullWriteStrategy

Container for specifying the full write strategy.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| full | | FullWriteStrategyOptions | Yes | Options to use when fully writing data to a Write component. |

FullWriteStrategyOptions

Resource options for full writes, including mode selection.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| mode | | FullWriteModeEnum | Yes | Specifies the mode to use when fully writing data: 'drop_and_recreate' to drop the output table and recreate it. |
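
To show how the full strategy nests under the write component, here is a sketch that selects the one mode named above, 'drop_and_recreate'. The connection, flow, and component names are the placeholders from the example at the top of this page.

```yaml
component:
  write:
    connection: myS3Connection
    input:
      flow: my_flow
      name: my_input_component
    strategy:
      full:
        # The only full-write mode named on this page.
        mode: drop_and_recreate
    s3:
      path: my/specific/path/
```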

FullWriteModeEnum

No properties defined.

InputComponent

Specification for input components, including how partitioning behaviors should be handled. This additional metadata is required when a component is used as an input to other components in a flow.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| flow | | string | Yes | Name of the parent flow that the input component belongs to. |
| name | | string | Yes | The input component name. |
| alias | | string | No | The alias to use for the input component. |
| partition_spec | | Any of: string ("full_reduction", "map"), RepartitionSpec | No | The type of partitioning to apply to the component's input data before processing the component's logic. Input partitioning is applied before the component's logic is executed. |
| where | | string | No | An optional filter condition to apply to the input component's data. |
| partition_binding | | Any of: string, PartitionBinding | No | An optional partition binding specification to apply to the component on a per-output-partition basis against other inputs' partitions. |
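
As a sketch of the optional input properties, the block below adds an alias and the string form of partition_spec to the input from the example at the top of this page. The alias value is hypothetical. where and partition_binding are omitted because their expression syntax is not described on this page.

```yaml
component:
  write:
    connection: myS3Connection
    input:
      flow: my_flow
      name: my_input_component
      # Hypothetical alias, for illustration only.
      alias: my_input
      # One of the two string values listed above: "full_reduction" or "map".
      partition_spec: full_reduction
    s3:
      path: my/specific/path/
```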

PartitionBinding

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| logical_operator | | string ("AND", "OR") | No | The logical operator to use to combine the partition binding predicates provided. |
| predicates | | array[string] | No | The list of partition binding predicates to apply to the input component's data. |

PartitionedWriteStrategyOptions

Resource options for partitioned writes, including mode selection and the column used for partitioning.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| mode | | PartitionedWriteModeEnum | Yes | Specifies the mode to use when writing data in partitions: 'append' to append new or modified partitions, 'insert_overwrite' to insert new partitions and replace/overwrite modified partitions, and 'sync' to encompass both 'insert_overwrite' functionality and to delete partitions when deleted at the source. |
| partition_col | | string | No | Column name used for partitioning, uses the internal Ascend partition identifier by default. |
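
A partitioned strategy that overrides the default 'sync' mode could be sketched as follows. The partition column name is hypothetical. If partition_col is omitted, the internal Ascend partition identifier is used, as described above. The other names are the placeholders from the example at the top of this page.

```yaml
component:
  write:
    connection: myS3Connection
    input:
      flow: my_flow
      name: my_input_component
    strategy:
      partitioned:
        # One of: append, insert_overwrite, sync.
        mode: insert_overwrite
        # Hypothetical column name, for illustration only.
        partition_col: event_date
    s3:
      path: my/specific/path/
```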

PartitionedWriteModeEnum

No properties defined.

RepartitionSpec

Specification for repartitioning operations on the input component's data.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| repartition | | RepartitionOptions | No | Options for repartitioning the input component's data. |

RepartitionOptions

Options for repartitioning the input component's data.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| partition_by | | string | Yes | The column to partition by. |
| granularity | | string | Yes | The granularity to use for the partitioning. |
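
The sketch below shows where a RepartitionSpec would sit when used as the partition_spec of an input. Both values are assumptions made for illustration: the column name is hypothetical, and this page does not list the accepted granularity values, so "day" is a guess to be checked against the platform's documentation.

```yaml
component:
  write:
    connection: myS3Connection
    input:
      flow: my_flow
      name: my_input_component
      partition_spec:
        repartition:
          # Hypothetical column name.
          partition_by: event_timestamp
          # Assumed value; accepted granularities are not listed on this page.
          granularity: day
    s3:
      path: my/specific/path/
```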