Write to S3
This guide shows you how to create an AWS S3 Write Component.
Prerequisites
- S3 Connection with write permissions
- An Ascend Flow with a Component that contains data
Create a new Write Component
From your workspace Super Graph view, follow these steps to create your Write Component:
Using the form:
- Double-click the Flow where you want to create your Component
- Right-click on any Component
- Hover over Create Downstream -> Write, and select your target Connection
- Complete the form with these details:
  - Select your Flow
  - Enter a descriptive Component Name like write_s3

Using the Files panel:
- Open the Files panel in the top left corner
- Navigate to and select your desired Flow
- Right-click on the components directory and choose New file
- Name your file with a descriptive name like write_s3.yaml and press Enter
Configure your S3 Write Component
Follow these steps to configure your S3 Write Component:
- Set up your Connection
  - Enter your S3 Connection name in the connection field
- Define your data source
  - Set input to the Component that contains your source data
- Configure the write destination
  - Set up the s3 write connector options
  - Specify your target path, formatter, and other required properties
- Choose a write strategy
Select the strategy that best fits your use case:
| Strategy | Description | Best for |
| --- | --- | --- |
| full (default) | Replaces the entire target table during each Flow Run | Reference tables, complete data refreshes |
| partitioned | Updates only the partitions that have changed | Time-series data, regional datasets, date-partitioned tables |
| snapshot | Creates flexible output as a single file or multiple chunks | Data exports, analytical datasets, flexible output formats |
For detailed guidance on when to use each strategy, see the write strategies guide.
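The configuration steps above map onto the component file roughly as follows. This is a minimal sketch that reuses the field names from the examples in this guide. The Connection name write_s3, the Component and Flow names, and the path are placeholders, and the nesting shown is an assumption based on those examples rather than a complete reference.

```yaml
component:
  write:
    # Set up your Connection: the name of your S3 Connection (placeholder)
    connection: write_s3
    # Define your data source: the Component that holds the data to write (placeholders)
    input:
      name: my_component
      flow: my_flow
    # Configure the write destination: s3 write connector options
    s3:
      path: /snapshot_data/
      formatter: parquet
    # Choose a write strategy: omit this key to fall back to the default full strategy
    strategy: snapshot
```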
Examples
The examples below cover each write strategy. Full write is the default strategy used when no strategy is explicitly specified.
- Full write strategy
- Partitioned
- Snapshot (chunked)
- Snapshot (single file)
This example shows an S3 Write Component using a full write strategy that outputs data in chunks for optimal performance with large datasets.
component:
  write:
    connection: write_s3
    input:
      name: my_component
      flow: my_flow
    strategy:
      full:
        mode: drop_and_recreate
    s3:
      path: /some_other_dir/my_data.parquet
      formatter: parquet
Output: Multiple files like part_001.parquet, part_002.parquet, etc. in the specified directory.
This example shows an S3 Write Component that uses a partitioned write strategy. Partitioned writes now produce chunked output with multiple files per partition.
component:
  write:
    connection: write_s3
    input:
      name: my_component
      flow: my_flow
    strategy:
      partitioned:
        mode: append
    s3:
      path: /some_parquet_dir
      formatter: parquet
To override the default chunk size of 500K rows, set a custom chunk size with the part_file_rows field:
component:
  write:
    connection: write_s3
    input:
      name: my_component
      flow: my_flow
    s3:
      path: /some_other_dir/
      formatter: json
      part_file_rows: 1000
Output: Multiple files like data_001.parquet, data_002.parquet, etc. in each partition directory.
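The part_file_rows example above has no strategy block, so it uses the default full strategy. The sketch below combines the two snippets from this section, assuming part_file_rows can be set alongside the partitioned strategy as the text suggests. The chunk size of 100000 is an arbitrary illustration, and the other values are the placeholders used throughout this guide.

```yaml
component:
  write:
    connection: write_s3
    input:
      name: my_component
      flow: my_flow
    strategy:
      partitioned:
        mode: append
    s3:
      path: /some_parquet_dir
      formatter: parquet
      # Assumption: applies within each partition, starting a new file every 100,000 rows
      part_file_rows: 100000
```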
This example shows an S3 Write Component using the snapshot strategy with chunked output and a custom chunk size of 1,000 rows per chunk, set with part_file_rows. The path ends with a trailing slash (/), producing multiple chunk files.
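component:
  write:
    connection: write_s3
    input:
      name: my_component
      flow: my_flow
    strategy: snapshot
    s3:
      # Trailing slash: output is written as multiple chunk files
      path: /snapshot_data/
      formatter: parquet
      # Custom chunk size of 1,000 rows per chunk
      part_file_rows: 1000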
Output: Multiple files like part_001.parquet, part_002.parquet, etc. in the /snapshot_data/ directory.
This example shows an S3 Write Component using snapshot strategy with single file output. The path ends with a specific filename and extension.
component:
  write:
    connection: write_s3
    input:
      name: my_component
      flow: my_flow
    strategy: snapshot
    s3:
      path: /snapshot_data/my_snapshot.parquet
      formatter: parquet
Output: A single file named my_snapshot.parquet in the /snapshot_data/ directory.
🎉 Congratulations! You successfully created an S3 Write Component in Ascend.