Create an Incremental MySQL Read Component
In this guide, we will learn how to read incremental data from MySQL and explore the various materialization options available for incremental reads.
Prerequisites
- An Ascend Workspace
- An Ascend project
- An existing Flow.
- An existing MySQL Connection
Step 1: Create a new Component.
- From your Ascend Instance homepage, navigate to your Workspace.
- Within your Workspace, click on the File icon on the left to open up the file tree.
- Locate the Ascend Project folder where you would like to create your MySQL Read Component.
- Within the project folder, navigate into the
flows
folder, and then into the folder for the flow where you would like to add your component. - Within the flow folder, right-click on the
components
tab and select New file. - Name the file with a
.yaml
extension, such asmysql-incremental-read.yaml
. - An empty file will open in the build panel. This is where you will define your component's YAML.
Step 2: Configuring your Read Component
- Add the following YAML code to your file, replacing the
connection
placeholder with the file name of the MySQL connection you are using.
component:
read:
connection: mysql_db
- Add the
mysql
key to specify the type of read component this is, and then add the table name you would like to read from. In this example, we are reading from theaircraft
table.
component:
read:
connection: mysql_db
mysql:
table:
name: aircraft
- Add the
replication
key to specify the type of replication you want to use. In this case we will be reading data incrementally using theupdated_at
column.
component:
read:
connection: mysql_db
mysql:
table:
name: aircraft
replication:
incremental:
column_name: updated_at
Your incremental column should be a timestamp or a datetime column that keeps track of when a record was last updated. Ascend will use this column to read only the data that was modified since the last refresh. The new data is then merged into the existing dataset based on the materialization mode set in the next step.
- Add the
materialization
key to specify the materialization strategy you want to use. We will usemerge
materialization here with the following options:
mode
: The mode in which you want to merge. There are two options:latest
: This will merge the data without retaining any history, so you will always see the latest state of your table. This is the mode we have chosen in our example.history
: This will merge in new and updated records while also retaining previous versions of your records. For reach version of a record, Ascend will store an_ASCEND_START_AT
and_ASCEND_END_AT
timestamp which tells you the time frame within which that version of the record was valid. This allows you to time travel with your data and view versions of the data set at certain points in time.
unique_key
: The column that uniquely identifies each record in the table. This is used to determine which records are new or updated. In our example, we are using theaircraft_id
to uniquely identify records.deletion_col
: Ascend will detect deletions from the source table via thedeletion_col
you specify. Thedeletion_col
should represent a soft-delete timestamp for when the record should be considered deleted.
component:
read:
connection: mysql_db
mysql:
table:
name: aircraft
replication:
incremental:
column_name: updated_at
materialization:
merge:
mode: latest
unique_key: aircraft_id
deletion_col: deleted_at
-
Once you have entered your connection details, click "Save" to save the file.
-
In the Build Panel, click on Current Build. In the Summary tab, you will see the message "File changes have been made since the last build." Click Build Project. After the build is complete, you should see Build successful in the bottom left status bar.
You have now created an incremental MySQL Read Component! Learn how you can run your flow to view the incremental data in action.