Test YAML
In this guide, you'll learn how to add data quality and validation tests to your YAML Components to ensure data integrity in your pipelines.
Prerequisites
- Ascend Flow
For a comprehensive overview of test types and when to use them, see our Tests concept guide.
Test behavior
Tests accept a severity parameter, which can be set to error or warn.
The default is error: a failed test causes the entire Component to fail. To log a warning instead, set severity to warn:
columns:
  id:
    - not_null:
        severity: warn
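The same parameter can presumably be attached to tests that take their own arguments. The sketch below assumes that severity sits alongside the test's other parameters, a placement this guide shows only for not_null; the age column and its bounds are illustrative.

```yaml
tests:
  columns:
    age:
      - in_range:
          min: 0
          max: 120
          severity: warn   # assumption: severity is a sibling of min/max
```

If this placement is not accepted in your environment, check the Tests concept guide for the supported form.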
Test format
YAML Components support column-level, Component-level, and schema tests through a dedicated tests section.
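As a rough orientation, the skeleton below places the three test groups described in the following sections under one tests key. It is a sketch assembled from the examples in this guide, not a complete Component definition: the user_id column is illustrative, and where the tests key sits within the rest of the Component file is not shown here.

```yaml
tests:
  columns:              # column-level tests, keyed by column name
    user_id:
      - not_null
  component:            # tests on the Component's entire output
    - count_greater_than:
        count: 0
  schema:               # structure and data type checks
    match: ignore_missing
    columns:
      user_id: int
```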
Column-level tests
Column-level tests validate individual columns in your dataset.
Basic validation
Check for null values, empty strings, and uniqueness:
tests:
  columns:
    user_id:
      - not_null
      - not_empty
      - unique
Numeric range tests
Validate that numeric values fall within expected ranges:
tests:
  columns:
    age:
      - in_range:
          min: 0
          max: 120
    price:
      - greater_than:
          value: 0
      - less_than_or_equal:
          value: 1000000
Available comparison tests:
- greater_than: Values strictly greater than threshold
- less_than: Values strictly less than threshold
- greater_than_or_equal: Values greater than or equal to threshold
- less_than_or_equal: Values less than or equal to threshold
- in_range: Values within min/max range (inclusive)
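The two comparison tests not used in the example above, less_than and greater_than_or_equal, should follow the same shape, since the test reference lists a single value parameter for each. A minimal sketch, with a hypothetical discount_pct column:

```yaml
tests:
  columns:
    discount_pct:                  # hypothetical column name
      - greater_than_or_equal:
          value: 0                 # a value of exactly 0 passes
      - less_than:
          value: 100               # a value of exactly 100 fails (strict comparison)
```

Pairing an inclusive test with a strict one like this expresses a half-open range, which in_range cannot, because its bounds are inclusive on both ends.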
Date range tests
Validate that date columns fall within expected ranges:
tests:
  columns:
    created_at:
      - date_in_range:
          min: "2023-01-01"
          max: "2024-12-31"
Set membership tests
Validate that values belong to an allowed set:
tests:
  columns:
    status:
      - in_set:
          values:
            - pending
            - approved
            - rejected
    country_code:
      - in_set:
          values: [US, CA, MX, UK]
String pattern tests
Validate string content:
tests:
  columns:
    email:
      - substring_match:
          substring: "@"
Statistical tests
Validate statistical properties of numeric columns:
tests:
  columns:
    temperature:
      - mean_in_range:
          min: 60
          max: 80
      - stddev_in_range:
          min: 0
          max: 15
Distinct count tests
Validate the number of distinct values:
tests:
  columns:
    category:
      - count_distinct_equal:
          count: 5
    region:
      - count_distinct_equal:
          count: 4
          group_by_columns:
            - country
Component-level tests
Component-level tests validate the entire output of your Component.
Row count tests
Verify exact or bounded row counts:
tests:
  component:
    - count_equal:
        count: 1000
    - count_greater_than:
        count: 0
    - count_less_than:
        count: 1000000
Available count tests:
- count_equal: Exactly N rows
- count_greater_than: More than N rows
- count_less_than: Fewer than N rows
- count_greater_than_or_equal: At least N rows
- count_less_than_or_equal: At most N rows
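The inclusive variants are not used in the example above. Assuming they take the same count parameter listed in the test reference, an inclusive bound on the row count might look like the following sketch (the thresholds are illustrative):

```yaml
tests:
  component:
    - count_greater_than_or_equal:
        count: 1            # passes with at least 1 row
    - count_less_than_or_equal:
        count: 1000000      # passes with at most 1,000,000 rows
```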
Grouped count tests
Validate row counts within groups:
tests:
  component:
    - count_greater_than:
        count: 10
        group_by_columns:
          - region
          - product_category
Combination uniqueness tests
Validate that combinations of columns are unique:
tests:
  component:
    - combination_unique:
        columns:
          - order_id
          - line_item_id
Schema tests
Schema tests validate the structure and data types of your Component output:
tests:
  schema:
    match: exact
    columns:
      id: int
      name: string
      price: double
      created_at: timestamp
The match parameter controls validation behavior:
- exact: All columns must match exactly (no extra columns allowed)
- ignore_missing: Only validates listed columns; extra columns are allowed
tests:
  schema:
    match: ignore_missing
    columns:
      id: int
      name: string
Custom SQL tests
Create reusable custom tests by defining SQL test macros. A custom test returns the rows that fail validation, so the test passes when its query returns no rows.
Define a custom test
Create a SQL file with the test definition:
{% test valid_email(component, column) %}
SELECT *
FROM {{ component }}
WHERE {{ column }} NOT LIKE '%@%.%'
{% endtest %}
Use custom tests
Reference custom tests in your Component:
tests:
  columns:
    email:
      - valid_email
Parameterized custom tests
Add parameters to your custom tests:
{% test value_in_list(component, column, allowed_values) %}
SELECT *
FROM {{ component }}
WHERE {{ column }} NOT IN ({{ allowed_values | join(', ') }})
{% endtest %}
Pass the parameter values when you reference the test. Because the values are joined directly into the SQL, string values need their own single quotes:
tests:
  columns:
    status:
      - value_in_list:
          allowed_values:
            - "'active'"
            - "'inactive'"
Complete example
Here's a comprehensive example demonstrating multiple test types:
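The sketch below combines column-level, Component-level, and schema tests in one tests section. It is assembled only from tests documented in this guide; the orders-style column names are hypothetical, and the rest of the Component definition is omitted.

```yaml
tests:
  columns:
    order_id:
      - not_null
      - unique
    status:
      - in_set:
          values: [pending, approved, rejected]
    price:
      - greater_than:
          value: 0
      - less_than_or_equal:
          value: 1000000
    created_at:
      - date_in_range:
          min: "2023-01-01"
          max: "2024-12-31"
    email:
      - not_null:
          severity: warn      # log a warning instead of failing the Component
      - substring_match:
          substring: "@"
  component:
    - count_greater_than:
        count: 0              # output must not be empty
    - combination_unique:
        columns:
          - order_id
          - line_item_id
  schema:
    match: ignore_missing     # extra columns are allowed
    columns:
      order_id: int
      price: double
      created_at: timestamp
```

Keeping the email null check at warn severity while the other tests stay at the default error is one way to separate advisory checks from checks that should stop the pipeline.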
Test reference
Column tests
| Test | Description | Parameters |
|---|---|---|
| not_null | No NULL values | None |
| not_empty | No empty strings | None |
| unique | All values unique | None |
| in_range | Numeric values within range | min, max |
| date_in_range | Dates within range | min, max |
| in_set | Values in allowed set | values |
| greater_than | Values greater than threshold | value |
| less_than | Values less than threshold | value |
| greater_than_or_equal | Values greater than or equal to threshold | value |
| less_than_or_equal | Values less than or equal to threshold | value |
| substring_match | Contains substring | substring |
| mean_in_range | Mean within range | min, max |
| stddev_in_range | Standard deviation within range | min, max |
| count_distinct_equal | Distinct count equals | count, group_by_columns (optional) |
Component tests
| Test | Description | Parameters |
|---|---|---|
| count_equal | Exact row count | count |
| count_greater_than | Row count greater than threshold | count, group_by_columns (optional) |
| count_less_than | Row count less than threshold | count, group_by_columns (optional) |
| count_greater_than_or_equal | Row count greater than or equal to threshold | count, group_by_columns (optional) |
| count_less_than_or_equal | Row count less than or equal to threshold | count, group_by_columns (optional) |
| combination_unique | Column combination unique | columns |
Next steps
- Learn about SQL tests
- Explore Python tests
- Review the Tests concept guide