Write to S3
This guide shows you how to create an AWS S3 Write Component.
Prerequisites
- S3 Connection with write permissions
- An Ascend Flow with a Component that contains data
Create a new Write Component
From your Workspace Super Graph view, create your Write Component using either the form or the Files panel.

Using the form:
- Double-click the Flow where you want to create your Component
- Right-click on any Component
- Hover over Create Downstream -> Write, and select your target Connection
- Complete the form with these details:
  - Select your Flow
  - Enter a descriptive Component Name like write_s3

Using the Files panel:
- Open the Files panel in the top left corner
- Navigate to and select your desired Flow
- Right-click on the components directory and choose New file
- Name your file with a descriptive name like write_s3.yaml and press enter
Configure your S3 Write Component
Follow these steps to configure your S3 Write Component:
- Set up your Connection
  - Enter your S3 Connection name in the connection field
- Define your data source
  - Set input to the Component that contains your source data
- Configure the write destination
  - Set up the s3 write connector options
  - Specify your target path, formatter, and other required properties
- Choose a write strategy
Select the strategy that best fits your use case:
| Strategy | Description | Best for |
| --- | --- | --- |
| full (default) | Replaces the entire target table during each Flow Run | Reference tables, complete data refreshes |
| partitioned | Updates only the partitions that have changed | Time-series data, regional datasets, date-partitioned tables |
| snapshot | Creates flexible output as a single file or multiple chunks | Data exports, analytical datasets, flexible output formats |
For detailed guidance on when to use each strategy, see the write strategies guide.
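As a rough sketch, the configuration steps above map onto the fields of a Write Component file as shown here. The field names are taken from the examples in this guide; the values (write_s3, my_component, my_flow, and the path) are placeholders to replace with your own.

```yaml
component:
  write:
    # Set up your Connection: the name of your S3 Connection
    connection: write_s3
    # Define your data source: the Component that contains your source data
    input:
      name: my_component
      flow: my_flow
    # Choose a write strategy: full is used when no strategy is specified
    strategy:
      full:
        mode: drop_and_recreate
    # Configure the write destination: s3 write connector options
    s3:
      path: /some_other_dir/
      formatter: parquet
```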
Examples
Full write is the default strategy, used when no strategy is explicitly specified. The examples below cover the following strategies:
- Full write strategy
- Partitioned
- Snapshot (chunked)
- Snapshot (single file)
This example shows an S3 Write Component using a full write strategy that outputs data in chunks for optimal performance with large datasets.
component:
  write:
    connection: write_s3
    input:
      name: my_component
      flow: my_flow
    strategy:
      full:
        mode: drop_and_recreate
    s3:
      path: /some_other_dir/my_data.parquet
      formatter: parquet
Output: Multiple files like part_001.parquet, part_002.parquet, etc. in the specified directory.
This example shows an S3 Write Component that uses a partitioned write strategy. Partitioned writes now produce chunked output with multiple files per partition.
component:
  write:
    connection: write_s3
    input:
      name: my_component
      flow: my_flow
    strategy:
      partitioned:
        mode: append
    s3:
      path: /some_parquet_dir
      formatter: parquet
You can also override the default chunk size of 500K rows by setting a custom chunk size in the part_file_rows field:
component:
  write:
    connection: write_s3
    input:
      name: my_component
      flow: my_flow
    s3:
      path: /some_other_dir/
      formatter: json
      part_file_rows: 1000
Output: Multiple files like data_001.parquet, data_002.parquet, etc. in each partition directory.
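As a sketch, the partitioned strategy and a custom chunk size might be combined in a single Component. This assumes part_file_rows is honored under the partitioned strategy, which the surrounding text suggests but this guide does not show directly; the path and row count are illustrative.

```yaml
component:
  write:
    connection: write_s3
    input:
      name: my_component
      flow: my_flow
    strategy:
      partitioned:
        mode: append
    s3:
      path: /some_parquet_dir
      formatter: parquet
      # Assumption: part_file_rows also applies under the partitioned strategy
      part_file_rows: 1000
```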
This example shows an S3 Write Component using the snapshot strategy with chunked output and a custom chunk size of 1,000 rows per chunk, set with part_file_rows. The path ends with a trailing slash (/), producing multiple chunk files.
component:
  write:
    connection: write_s3
    input:
      name: my_component
      flow: my_flow
    strategy: snapshot
    s3:
      path: /snapshot_data/
      formatter: parquet
      part_file_rows: 1000
Output: Multiple files like part_001.parquet, part_002.parquet, etc. in the /snapshot_data/ directory.
This example shows an S3 Write Component using snapshot strategy with single file output. The path ends with a specific filename and extension.
component:
  write:
    connection: write_s3
    input:
      name: my_component
      flow: my_flow
    strategy: snapshot
    s3:
      path: /snapshot_data/my_snapshot.parquet
      formatter: parquet
Output: A single file named my_snapshot.parquet in the /snapshot_data/ directory.
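For comparison, the two snapshot examples differ only in the form of the path value. The fragment below places both forms side by side, using the paths from this guide; only one path should be active at a time, so the second is commented out.

```yaml
s3:
  # Trailing slash: output is written as multiple chunk files in this directory
  path: /snapshot_data/
  # Specific filename and extension: output is written as a single file
  # path: /snapshot_data/my_snapshot.parquet
  formatter: parquet
```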
🎉 Congratulations! You successfully created an S3 Write Component in Ascend.