Version: 3.0.0

External Table

A component that constructs and updates an External Table. Currently supported for Snowflake only.

Examples

component:
  external_table:
    location: '@my_namespace.my_ext_stage/path'
    file_format: 'my_file_format'
    pattern: '.*[.]csv'
    auto_refresh: true
    partitions:
      - name: 'partition_column'
        data_type: 'STRING'
        expression: 'EXTRACT(YEAR FROM my_date_column)'
    aws_sns_topic: 'arn:aws:sns:us-west-2:123456789012:my_sns_topic'
    integration: 'my_integration'

ExternalTableComponent

ExternalTableComponent is defined beneath the following ancestor nodes in the YAML structure:

Below are the properties for the ExternalTableComponent. Each property links to the specific details section further down in this page.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| name | | string | No | The name of the model. |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| flow_name | | string | No | The name of the flow that the component belongs to. |
| skip | | boolean | No | A boolean flag indicating whether to skip processing for the component or not. |
| data_maintenance | | DataMaintenance | No | The data maintenance configuration options for the component. |
| tests | | ComponentTestColumn | No | Defines tests to run on the data of this component. |
| external_table | | Any of: SnowflakeExternalTableOptions, BigQueryExternalTableOptions | Yes | Configuration options for the External Table component. |
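
As a rough sketch of how the top-level properties sit next to external_table, the fragment below sets a name, a description, and skip alongside the Snowflake options. The component name, description text, and stage path are placeholders, and placing these keys as siblings of external_table under component is an assumption read off the property table above.

```yaml
component:
  # Hypothetical name and description; substitute your own.
  name: 'orders_external'
  description: 'External table over CSV order exports.'
  skip: false
  external_table:
    # Only the required external_table block is filled in here.
    location: '@my_namespace.my_ext_stage/orders'
    file_format: 'my_file_format'
```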

Property Details

Component

A component is a fundamental building block of a data flow. Types of components that are supported include: read, transform, task, test, and more.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| component | | One of: ReadComponent, TransformComponent, TaskComponent, SingularTestComponent, CustomPythonReadComponent, WriteComponent, CompoundComponent, AliasedTableComponent, ExternalTableComponent | Yes | Configuration options for the component. |

BigQueryExternalTableOptions

Configuration options for an External Table component in BigQuery. Currently not implemented; stubbed for future reference.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| location | | string | No | |

SnowflakeExternalTableOptions

Configuration options for an External Table component in Snowflake.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| location | | string | No | Snowflake Stage containing the data files, in the format @[namespace.]ext_stage_name[/path]. |
| file_format | | string | No | The file format configuration. See the Snowflake documentation at https://docs.snowflake.com/en/sql-reference/sql/create-external-table#required-parameters for more information. |
| pattern | | string | No | A regex pattern to match files in the stage. |
| auto_refresh | | boolean | No | Determines whether Snowflake should auto-refresh the table. |
| partitions | | array[SnowflakeVirtualColumnSpec] | No | List of virtual columns to compute and partition the table by. |
| columns | | array[SnowflakeVirtualColumnSpec] | No | List of virtual columns to compute. |
| aws_sns_topic | | string | No | Specifies the Amazon Resource Name (ARN) for the SNS topic for your S3 bucket (required for auto-refreshing tables from S3 using SNS). |
| integration | | string | No | Specifies the name of the notification integration (required for auto-refreshing tables from GCS or Azure Blob Store). |
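
The descriptions above tie aws_sns_topic to S3 stages and integration to GCS or Azure Blob Store stages. A minimal sketch of the GCS/Azure variant might look like the following; the stage name, file pattern, and integration name are placeholders rather than values taken from a real deployment.

```yaml
component:
  external_table:
    location: '@my_namespace.my_gcs_stage/events'
    file_format: 'my_file_format'
    pattern: '.*[.]parquet'
    auto_refresh: true
    # For a GCS or Azure stage, name the notification integration.
    # For an S3 stage refreshed through SNS, set aws_sns_topic instead.
    integration: 'my_integration'
```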

SnowflakeVirtualColumnSpec

Configuration options for a virtual column of a Snowflake External Table.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| name | | string | Yes | Name of the virtual column. |
| data_type | | string | Yes | Data type of the virtual column. |
| description | | string | No | A description of the virtual column. |
| expression | | string | No | The SQL expression that computes the value for the virtual column. |
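
To illustrate how virtual column specs might be written, the sketch below defines one computed column and one partition column. It relies on two Snowflake external-table conventions: the VALUE variant column that holds each row, and the METADATA$FILENAME pseudocolumn that partition expressions are derived from. The column names, the path segment index, and the date format are assumptions about a hypothetical stage layout, and how the component passes each expression through to Snowflake is not specified here.

```yaml
component:
  external_table:
    location: '@my_namespace.my_ext_stage/path'
    file_format: 'my_file_format'
    columns:
      # Hypothetical column: first field of each CSV row, cast to text.
      - name: 'customer_id'
        data_type: 'VARCHAR'
        description: 'First field of each CSV row.'
        expression: 'value:c1::varchar'
    partitions:
      # Hypothetical layout: the second path segment is a date such as 2024-01-31.
      - name: 'load_date'
        data_type: 'DATE'
        expression: "to_date(split_part(metadata$filename, '/', 2), 'YYYY-MM-DD')"
```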

ComponentTestColumn

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| columns | | object | No | Tests to run on the columns of the data after processing, for validation purposes. Used in the context of a component. |
| component | | array[One of: (CombinationUniqueTest, InRangeTest, DateInRangeTest, InSetTest, SubstringMatchTest, CountDistinctEqualTest, CountGreaterThanOrEqualTest, CountGreaterThanTest, CountLessThanOrEqualTest, CountLessThanTest, CountEqualTest, GreaterThanTest, LessThanTest, GreaterThanOrEqualTest, LessThanOrEqualTest, MeanInRangeTest, StddevInRangeTest, ColumnTestSql, ColumnTestPython)] | No | List of component level tests. |
| schema | | ComponentSchemaTest | No | List of the component's schema level tests. |
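
A small sketch of a tests block with both column-level and component-level entries follows. The columns property is documented only as an object, so the shape used here (column name mapped to a list of tests) is an assumption, and the column name amount is hypothetical.

```yaml
component:
  external_table:
    location: '@my_namespace.my_ext_stage/path'
    file_format: 'my_file_format'
  tests:
    columns:
      # Assumed shape: column name -> list of tests for that column.
      amount:
        - in_range:
            min: 0
            max: 10000
    component:
      # Component-level test: expect at least one row.
      - count_greater_than:
          count: 0
```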

ColumnTestPython

Test to validate data using a Python function for a single column.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| name | | string | Yes | |
| python | | ColumnTestPythonOptions | Yes | Configuration options for the Python column test. |

ColumnTestPythonOptions

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| entrypoint | | string | Yes | The entrypoint for the Python test function. |
| source | | string | Yes | The source file for the Python test function. |
| params | | object | No | Parameters for the Python test function. |
| is_asset_test | | boolean | No | |
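
As an illustration of how these options nest inside a ColumnTestPython entry, the fragment below wires a named test to a Python function. The test name, function name, file path, and parameter are all hypothetical, the signature the function must have is not described on this page, and the column-to-tests mapping under columns is an assumption.

```yaml
# Fragment nested under component:
tests:
  columns:
    email:
      - name: 'email_is_wellformed'
        severity: warn
        python:
          # Hypothetical function and file; both are placeholders.
          entrypoint: 'check_email'
          source: 'tests/email_checks.py'
          params:
            allow_empty: false
```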

ColumnTestSql

Test to validate data using an SQL query for a single column.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| name | | string | Yes | |
| sql | | string | No | SQL query that tests data for conditions. |

CombinationUniqueTest

Test to check if a combination of columns is unique.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| combination_unique | | CombinationUniqueTestOptions | Yes | Configuration options for the combination_unique test. |

CombinationUniqueTestOptions

Configuration options for the combination_unique test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| columns | | array[string] | Yes | The combination of columns to check for uniqueness. |
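
For example, a component-level check that no two rows share the same pair of values could be sketched as below. The column names are hypothetical.

```yaml
# Fragment nested under component:
tests:
  component:
    - combination_unique:
        # Hypothetical columns that together should identify a row.
        columns: ['order_id', 'line_number']
      severity: error
```

Note that severity is a sibling of combination_unique in the same list item, matching the CombinationUniqueTest property table.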

ComponentSchemaTest

Test to validate that component columns match expected types.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| match | exact | string ("exact", "ignore_missing") | No | The type of schema matching to perform. 'exact' requires all columns to be present, 'ignore_missing' allows for missing columns. |
| columns | | object | No | A mapping of column names to their expected types. |
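
A schema test that tolerates missing columns might be sketched as follows. The column names are hypothetical, and the type names shown are an assumption, since this page does not list the accepted type vocabulary.

```yaml
# Fragment nested under component:
tests:
  schema:
    severity: warn
    # 'ignore_missing' allows listed columns to be absent; 'exact' does not.
    match: ignore_missing
    columns:
      order_id: 'VARCHAR'
      amount: 'NUMBER'
```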

CountDistinctEqualTest

Test to check if the number of distinct values is equal to a certain number.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| count_distinct_equal | | CountDistinctEqualTestOptions | Yes | Configuration options for the count_distinct_equal test. |

CountDistinctEqualTestOptions

Configuration options for the count_distinct_equal test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| count | | integer | Yes | The number of distinct values to expect. |
| group_by_columns | | array[string] | No | The columns to group by. |
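
As a sketch, the fragment below expects exactly three distinct values of a status column. The column names are hypothetical, the column-to-tests mapping under columns is an assumption, and reading group_by_columns as "apply the count within each group" is likewise an assumption rather than documented behavior.

```yaml
# Fragment nested under component:
tests:
  columns:
    status:
      - count_distinct_equal:
          count: 3
          # Assumed meaning: evaluate the distinct count per region.
          group_by_columns: ['region']
```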

CountEqualTest

Test to check if the number of rows is equal to a certain number.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| count_equal | | CountEqualTestOptions | Yes | Configuration options for the count_equal test. |

CountEqualTestOptions

Configuration options for the count_equal test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| count | | integer | Yes | The number of rows to expect. |

CountGreaterThanOrEqualTest

Test to check if the number of rows is greater than or equal to a certain number.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| count_greater_than_or_equal | | CountGreaterThanOrEqualTestOptions | Yes | Configuration options for the count_greater_than_or_equal test. |

CountGreaterThanOrEqualTestOptions

Configuration options for the count_greater_than_or_equal test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| count | | integer | Yes | The value to compare against. |
| group_by_columns | | array[string] | No | The columns to group by. |

CountGreaterThanTest

Test to check if the number of rows is greater than a certain number.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| count_greater_than | | CountGreaterThanTestOptions | Yes | Configuration options for the count_greater_than test. |

CountGreaterThanTestOptions

Configuration options for the count_greater_than test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| count | | integer | Yes | The value to compare against. |
| group_by_columns | | array[string] | No | The columns to group by. |

CountLessThanOrEqualTest

Test to check if the number of rows is less than or equal to a certain number.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| count_less_than_or_equal | | CountLessThanOrEqualTestOptions | Yes | Configuration options for the count_less_than_or_equal test. |

CountLessThanOrEqualTestOptions

Configuration options for the count_less_than_or_equal test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| count | | integer | Yes | The value to compare against. |
| group_by_columns | | array[string] | No | The columns to group by. |

CountLessThanTest

Test to check if the number of rows is less than a certain number.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| count_less_than | | CountLessThanTestOptions | Yes | Configuration options for the count_less_than test. |

CountLessThanTestOptions

Configuration options for the count_less_than test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| count | | integer | Yes | The value to compare against. |
| group_by_columns | | array[string] | No | The columns to group by. |

DataMaintenance

Data maintenance configuration options for the component.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| enabled | | boolean | No | A boolean flag indicating whether data maintenance is enabled for the component. |

DateInRangeTest

Test to check if a date is within a certain range.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| date_in_range | | DateInRangeTestOptions | Yes | Configuration options for the date_in_range test. |

DateInRangeTestOptions

Configuration options for the date_in_range test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| min | | string | Yes | The minimum value to expect. |
| max | | string | Yes | The maximum value to expect. |
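
Because min and max are strings here, a date range check might be written as in the sketch below. The ISO 8601 date format is an assumption, since the accepted date string format is not stated on this page, and the column name and column-to-tests mapping are hypothetical.

```yaml
# Fragment nested under component:
tests:
  columns:
    order_date:
      - date_in_range:
          # Assumed format: ISO 8601 dates passed as strings.
          min: '2020-01-01'
          max: '2030-12-31'
```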

GreaterThanOrEqualTest

Test to check if a value is greater than or equal to a certain number.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| greater_than_or_equal | | GreaterThanOrEqualTestOptions | Yes | Configuration options for the greater_than_or_equal test. |

GreaterThanOrEqualTestOptions

Configuration options for the greater_than_or_equal test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| value | | Any of: integer, number, string | Yes | The value to compare against. |

GreaterThanTest

Test to check if a value is greater than a certain number.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| greater_than | | GreaterThanTestOptions | Yes | Configuration options for the greater_than test. |

GreaterThanTestOptions

Configuration options for the greater_than test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| value | | Any of: integer, number, string | Yes | The value to compare against. |

InRangeTest

Test to check if a value is within a certain range.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| in_range | | InRangeTestOptions | Yes | Configuration options for the in_range test. |

InRangeTestOptions

Configuration options for the in_range test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| min | | Any of: integer, number, string | Yes | The minimum value to expect. |
| max | | Any of: integer, number, string | Yes | The maximum value to expect. |

InSetTest

Test to check if a value is in a set of values.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| in_set | | InSetTestOptions | Yes | Configuration options for the in_set test. |

InSetTestOptions

Configuration options for the in_set test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| values | | array[Any of: (integer, number, string)] | Yes | The set of values to expect. |
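
A sketch of an in_set test that only warns on unexpected values follows. The column name and the allowed values are hypothetical, and the column-to-tests mapping under columns is an assumption.

```yaml
# Fragment nested under component:
tests:
  columns:
    status:
      - in_set:
          values: ['open', 'closed', 'pending']
        # 'warn' reports the issue without interrupting flow processing.
        severity: warn
```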

LessThanOrEqualTest

Test to check if a value is less than or equal to a certain number.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| less_than_or_equal | | LessThanOrEqualTestOptions | Yes | Configuration options for the less_than_or_equal test. |

LessThanOrEqualTestOptions

Configuration options for the less_than_or_equal test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| value | | Any of: integer, number, string | Yes | The value to compare against. |

LessThanTest

Test to check if a value is less than a certain number.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| less_than | | LessThanTestOptions | Yes | Configuration options for the less_than test. |

LessThanTestOptions

Configuration options for the less_than test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| value | | Any of: integer, number, string | Yes | The value to compare against. |

MeanInRangeTest

Test to check if the mean of a column's values is within a certain range.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| mean_in_range | | MeanInRangeTestOptions | Yes | Configuration options for the mean_in_range test. |

MeanInRangeTestOptions

Configuration options for the mean_in_range test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| min | | Any of: integer, number, string | Yes | The minimum value to expect. |
| max | | Any of: integer, number, string | Yes | The maximum value to expect. |

NotEmptyTest

Test to check if a value is not empty.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| not_empty | | NoTestOptions | No | Test to check if a value is not empty. |

NotNullTest

Test to check if a value is not null.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| not_null | | NoTestOptions | No | Test to check if a value is not null. |

StddevInRangeTest

Test to check if the standard deviation of a column's values is within a certain range.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| stddev_in_range | | StddevInRangeTestOptions | Yes | Configuration options for the stddev_in_range test. |

StddevInRangeTestOptions

Configuration options for the stddev_in_range test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| min | | Any of: integer, number, string | Yes | The minimum value to expect. |
| max | | Any of: integer, number, string | Yes | The maximum value to expect. |

SubstringMatchTest

Test to check if a value contains a substring.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| substring_match | | SubstringMatchTestOptions | Yes | Configuration options for the substring_match test. |

SubstringMatchTestOptions

Configuration options for the substring_match test.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| substring | | string | Yes | The substring to search for. |

UniqueTest

Test to check if a value is unique.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| severity | error | string ("error", "warn") | No | The severity level for issues raised by the test. Default is 'error'. Use 'error' for critical issues that should interrupt flow processing. Use 'warn' for warnings/minor issues that should not interrupt flow processing. |
| unique | | NoTestOptions | No | Test to check if a value is unique. |

NoTestOptions

Configuration options for tests that have no test body definition (not_null, unique, etc.).

No properties defined.

ResourceMetadata

Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| source | | ResourceLocation | No | The origin or source information for the resource. |
| source_event_uuid | | string | No | UUID of the event that is associated with creation of this resource. |

ResourceLocation

The origin or source information for the resource.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| path | | string | Yes | Path within repository files where the resource is defined. |
| first_line_number | | integer | No | First line number within path file where the resource is defined. |