Version: 3.0.0

BigQuery Data Plane Configuration

BigQueryDataPlane

Below are the properties for the BigQueryDataPlane. Each property links to the specific details section further down in this page.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| bigquery | | BigQueryDataPlaneOptions | Yes | BigQuery configuration options. |

Property Details

BackfillRun

Defines the parameters for a backfill run.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| backfill_run | | BackfillRunOptions | Yes | Backfill run options. |

BackfillRunOptions

Options for a backfill run.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| name | | string | No | The name of the model. |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| flow_name | | string | Yes | The name of the flow that is to be backfilled. |
| start_time | | string | Yes | Start time of the time range to be backfilled. |
| end_time | | string | Yes | End time of the time range to be backfilled. |
| granularity | | string ("day", "week", "month") | Yes | The time granularity to use for backfill. Must be one of: 'day', 'week', 'month'. The backfill runner divides the date range into flow runs of this granularity and launches these flow runs. |
| max_concurrent_flow_runs | | integer | No | The maximum number of concurrent flow runs used for backfill. This limits the number of flow runners (and hence cluster resources) that are launched at once. |
| backfill_order | | string ("forward_chronological", "reverse_chronological") | No | The order to use for backfilling: either forward or reverse chronological order. |
| flow_run_options | | FlowRunBaseOptions | No | Additional options for each flow run launched during the backfill. |
| run_final_sync | | boolean | No | A boolean flag indicating whether to run a final sync after the concurrent backfill flow runs have finished. This final sync is a single flow run that is executed without any time parameters, and is meant to sync the data to the latest state and capture any missing time intervals. |
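The way a date range might be divided into flow runs of a given granularity can be sketched as follows. This is an illustration only, not the backfill runner's implementation: the half-open `[start, end)` windows, the use of plain dates, weeks counted from the start date, and months snapping to calendar boundaries are all assumptions, and `next_boundary` and `backfill_windows` are hypothetical helper names.

```python
from datetime import date, timedelta


def next_boundary(d: date, granularity: str) -> date:
    """Return the start of the window that follows the one beginning at d."""
    if granularity == "day":
        return d + timedelta(days=1)
    if granularity == "week":
        # Assumption: weeks are counted from the start date, not aligned to Mondays.
        return d + timedelta(weeks=1)
    if granularity == "month":
        # Assumption: month windows end on the first day of the following month.
        year, month = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
        return date(year, month, 1)
    raise ValueError(f"granularity must be 'day', 'week' or 'month', got {granularity!r}")


def backfill_windows(start: date, end: date, granularity: str,
                     order: str = "forward_chronological"):
    """Split [start, end) into consecutive windows of the given granularity."""
    windows = []
    cursor = start
    while cursor < end:
        # Clip the last window so it never runs past the requested end.
        upper = min(next_boundary(cursor, granularity), end)
        windows.append((cursor, upper))
        cursor = upper
    if order == "reverse_chronological":
        windows.reverse()
    return windows


for window in backfill_windows(date(2024, 1, 15), date(2024, 3, 10), "month"):
    print(window)
```

Each resulting window would correspond to one flow run; `max_concurrent_flow_runs` would then cap how many of those runs are in flight at once.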

Component

A component is a fundamental building block of a data flow. Types of components that are supported include: read, transform, task, test, and more.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| component | | One of: ReadComponent, TransformComponent, TaskComponent, SingularTestComponent, CustomPythonReadComponent, WriteComponent, CompoundComponent, AliasedTableComponent, ExternalTableComponent | Yes | Configuration options for the component. |

CustomPythonReadComponent

A component that reads data using user-defined, custom Python code.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| data_plane | | One of: SnowflakeDataPlane, BigQueryDataPlane, DuckdbDataPlane, SynapseDataPlane | No | Data Plane-specific configuration options for a component. |
| name | | string | No | The name of the model. |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| flow_name | | string | No | The name of the flow that the component belongs to. |
| skip | | boolean | No | A boolean flag indicating whether to skip processing for the component or not. |
| data_maintenance | | DataMaintenance | No | The data maintenance configuration options for the component. |
| skip_for_time_series_runs | | boolean | No | A boolean flag indicating whether to skip processing for this component in time-series runs. |
| tests | | ComponentTestColumn | No | Defines tests to run on the data of this component. |
| custom_python_read | | CustomPythonReadOptions | Yes | |

Flow

A flow is the primary unit of execution in Ascend and contains a collection of components assembled into a directed acyclic graph (DAG).

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| flow | | FlowOptions | Yes | |

FlowOptions

Defines the options for a Flow.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| name | | string | No | The name of the model. |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| parameters | | object | No | Dictionary of parameters to use for resource. |
| defaults | | array[ConfigFilter] | No | List of default configs with filters that can be applied to a resource config. |
| data_plane | | DataPlane | No | Data plane to use for the flow. |
| version | | string | No | The version of the flow. |
| bootstrap | | string | No | Bootstrap command to run within the Docker container. |
| runner | ascend | string | No | Runner id to use for running the flow. Defaults to 'ascend'. |

FlowRun

Defines the run-specific parameters for a Flow. One flow can have multiple flow runs.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| flow_run | | FlowRunOptions | Yes | |

FlowRunBaseOptions

Base options for a Flow Run.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| name | | string | No | The name of the model. |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| parameters | | object | No | Dictionary of parameters to use for resource. |
| defaults | | array[ConfigFilter] | No | List of default configs with filters that can be applied to a resource config. |
| run_tests | True | boolean | No | A boolean flag indicating whether to run tests after processing the data. |
| store_test_results | | boolean | No | A boolean flag indicating whether to store the test results. |
| components | | array[string] | No | List of component names to run. |
| component_categories | | array[string] | No | List of component categories to run. |
| halt_flow_on_error | | boolean | No | A boolean flag indicating whether to halt the flow on error. |
| disable_optimizers | | boolean | No | A boolean flag indicating whether to disable optimizers. |
| disable_incremental_metadata_collection | | boolean | No | A boolean flag indicating whether to disable collection of incremental RC/Transform metadata. |

FlowRunOptions

Options for a Flow Run.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| name | | string | No | The name of the FlowRun. |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| parameters | | object | No | Dictionary of parameters to use for resource. |
| defaults | | array[ConfigFilter] | No | List of default configs with filters that can be applied to a resource config. |
| run_tests | True | boolean | No | A boolean flag indicating whether to run tests after processing the data. |
| store_test_results | | boolean | No | A boolean flag indicating whether to store the test results. |
| components | | array[string] | No | List of component names to run. |
| component_categories | | array[string] | No | List of component categories to run. |
| halt_flow_on_error | | boolean | No | A boolean flag indicating whether to halt the flow on error. |
| disable_optimizers | | boolean | No | A boolean flag indicating whether to disable optimizers. |
| disable_incremental_metadata_collection | | boolean | No | A boolean flag indicating whether to disable collection of incremental RC/Transform metadata. |
| flow_name | | string | Yes | The name of the flow that is to be run. |
| event_start_time | | string | No | Event start time to be used for time-series processing. |
| event_end_time | | string | No | Event end time to be used for time-series processing. |
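To make the shape of these options concrete, the sketch below builds a flow run configuration as a plain Python dictionary and checks it for the one required field, `flow_name`. The flow and component names are hypothetical, the ISO 8601 timestamp format is an assumption (the reference only says the values are strings), and `missing_required` is an illustrative helper rather than part of any API.

```python
import json

# Per the table above, flow_name is the only required FlowRunOptions property.
REQUIRED_FLOW_RUN_FIELDS = ("flow_name",)

flow_run_config = {
    "flow_run": {
        "flow_name": "orders_daily",                   # hypothetical flow name
        "event_start_time": "2024-01-01T00:00:00Z",    # assumed ISO 8601 format
        "event_end_time": "2024-01-02T00:00:00Z",      # assumed ISO 8601 format
        "run_tests": True,
        "halt_flow_on_error": True,
        "components": ["raw_orders", "clean_orders"],  # hypothetical component names
    }
}


def missing_required(config: dict) -> list:
    """Return the required flow_run properties that are absent from config."""
    options = config.get("flow_run", {})
    return [field for field in REQUIRED_FLOW_RUN_FIELDS if field not in options]


print(json.dumps(flow_run_config, indent=2))
print("missing:", missing_required(flow_run_config))
```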

Profile

A profile is a set of configuration options (and parameters) that define the target where customer code is compiled/run.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| profile | | ProfileOptions | Yes | Options (and parameters) for the profile. |

ProfileOptions

Configuration options (and parameters) for a profile.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| pip_packages | | array[string] | No | Python PIP packages to install. |
| name | | string | No | The name of the model. |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| parameters | | object | No | Dictionary of parameters to use for resource. |
| defaults | | array[ConfigFilter] | No | List of default configs with filters that can be applied to a resource config. |

Project

A project is a group of related connections, flows/components, profiles, vaults, automations and other code/configuration artifacts. Project files define the mapping of filesystem paths to different kinds of artifacts that the platform can access when running flows for the project.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| project | | ProjectOptions | Yes | |

ProjectOptions

Options that can be specified for a project.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| pip_packages | | array[string] | No | Python PIP packages to install. |
| name | | string | No | The name of the model. |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| parameters | | object | No | Dictionary of parameters to use for resource. |
| defaults | | array[ConfigFilter] | No | List of default configs with filters that can be applied to a resource config. |
| version | | string | No | The version of the project. |
| connections | ['connections/'] | array[string] | No | List of connection definition folders used in the project. |
| flows | ['flows/'] | array[string] | No | List of flow definition folders used in the project. |
| profiles | ['profiles/'] | array[string] | No | List of profile definition folders used in the project. |
| sources | ['src/'] | array[string] | No | List of source definition folders used in the project. |
| tests | ['tests/'] | array[string] | No | List of test definition folders used in the project. |
| vaults | ['vaults/'] | array[string] | No | List of vault definition folders used in the project. |
| actions | ['actions/'] | array[string] | No | List of action definition folders used in the project. |
| automations | ['automations/'] | array[string] | No | List of automation definition folders used in the project. |
| sensors | ['sensors/'] | array[string] | No | List of sensor definition folders used in the project. |
| ssh_tunnels | ['ssh_tunnels/'] | array[string] | No | List of SSH tunnel definition folders used in the project. |
| applications | ['applications/'] | array[string] | No | List of Application definition folders used in the project. |
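The folder defaults listed above can be read as a fallback for any folder property a project leaves out. The sketch below illustrates that reading; the default values are copied from the table, while `resolve_project_folders` is a hypothetical helper and the rule that a user-supplied list replaces the default (rather than extending it) is an assumption.

```python
# Default folder lists, copied from the ProjectOptions table.
DEFAULT_PROJECT_FOLDERS = {
    "connections": ["connections/"],
    "flows": ["flows/"],
    "profiles": ["profiles/"],
    "sources": ["src/"],
    "tests": ["tests/"],
    "vaults": ["vaults/"],
    "actions": ["actions/"],
    "automations": ["automations/"],
    "sensors": ["sensors/"],
    "ssh_tunnels": ["ssh_tunnels/"],
    "applications": ["applications/"],
}


def resolve_project_folders(project_options: dict) -> dict:
    """Fill in the default folder list for every folder property not set explicitly."""
    resolved = {}
    for key, default in DEFAULT_PROJECT_FOLDERS.items():
        # Assumption: an explicit value replaces the default instead of extending it.
        resolved[key] = list(project_options.get(key, default))
    return resolved


# Hypothetical project that only overrides where flows live.
folders = resolve_project_folders({"flows": ["pipelines/", "flows/"]})
print(folders["flows"])
print(folders["sources"])
```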

ConfigFilter

A filter used to target configuration settings to a specific flow and/or component.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| kind | | string ("Flow", "Component") | Yes | The kind of the resource to apply the config to. |
| name | | Any of: string, array[string], RegexFilter, array[RegexFilter] | Yes | Name of the resource to apply the config to. |
| flow_name | | string | No | Name of the flow to apply the config to. |
| spec | | Any of: FlowSpec, ComponentSpec | No | Dictionary of parameters to use for the resource. |
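One way to picture how a filter could select its targets is sketched below. The matching rules are assumptions, not documented behavior: strings are compared for exact equality, lists match if any entry matches, and a RegexFilter is represented by a hypothetical `{"regex": "<pattern>"}` dictionary because its schema is not given on this page. `name_matches` and `filter_applies` are illustrative helper names.

```python
import re


def name_matches(selector, resource_name: str) -> bool:
    """Check a resource name against a string, a regex filter, or a list of either."""
    if isinstance(selector, str):
        return selector == resource_name
    if isinstance(selector, dict):
        # Hypothetical RegexFilter shape: {"regex": "<pattern>"}; must match the whole name.
        return re.fullmatch(selector["regex"], resource_name) is not None
    # Otherwise treat the selector as a list and accept any matching entry.
    return any(name_matches(item, resource_name) for item in selector)


def filter_applies(config_filter: dict, kind: str, name: str, flow_name=None) -> bool:
    """Decide whether a ConfigFilter-like dict targets the given resource."""
    if config_filter["kind"] != kind:
        return False
    wanted_flow = config_filter.get("flow_name")
    if wanted_flow is not None and wanted_flow != flow_name:
        return False
    return name_matches(config_filter["name"], name)


# Hypothetical filter: skip every staging component in the "sales" flow.
staging_filter = {
    "kind": "Component",
    "name": {"regex": "stg_.*"},
    "flow_name": "sales",
    "spec": {"skip": True},
}

print(filter_applies(staging_filter, "Component", "stg_orders", flow_name="sales"))
print(filter_applies(staging_filter, "Component", "orders", flow_name="sales"))
```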

ComponentSpec

Specification for configuration applied to a component at runtime based on the config filter.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| data_plane | | One of: SnowflakeDataPlane, BigQueryDataPlane, DuckdbDataPlane, SynapseDataPlane | No | Data Plane-specific configuration options for a component. |
| skip | False | boolean | No | |

ReadComponent

A component that reads data from a data system.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| data_plane | | One of: SnowflakeDataPlane, BigQueryDataPlane, DuckdbDataPlane, SynapseDataPlane | No | Data Plane-specific configuration options for a component. |
| name | | string | No | The name of the model. |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| flow_name | | string | No | The name of the flow that the component belongs to. |
| skip | | boolean | No | A boolean flag indicating whether to skip processing for the component or not. |
| data_maintenance | | DataMaintenance | No | The data maintenance configuration options for the component. |
| skip_for_time_series_runs | | boolean | No | A boolean flag indicating whether to skip processing for this component in time-series runs. |
| tests | | ComponentTestColumn | No | Defines tests to run on the data of this component. |
| read | | One of: GenericFileReadComponent, LocalFileReadComponent, S3ReadComponent, GcsReadComponent, AbfsReadComponent, HttpReadComponent, MSSQLReadComponent, MySQLReadComponent, OracleReadComponent, PostgresReadComponent, SnowflakeReadComponent, BigQueryReadComponent | Yes | The read component that reads data from a data system. |

TransformComponent

A component that executes SQL or Python code to transform data.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| data_plane | | One of: SnowflakeDataPlane, BigQueryDataPlane, DuckdbDataPlane, SynapseDataPlane | No | Data Plane-specific configuration options for a component. |
| name | | string | No | The name of the model. |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| flow_name | | string | No | The name of the flow that the component belongs to. |
| skip | | boolean | No | A boolean flag indicating whether to skip processing for the component or not. |
| data_maintenance | | DataMaintenance | No | The data maintenance configuration options for the component. |
| skip_for_time_series_runs | | boolean | No | A boolean flag indicating whether to skip processing for this component in time-series runs. |
| tests | | ComponentTestColumn | No | Defines tests to run on the data of this component. |
| transform | | One of: SqlTransform, PythonTransform, SnowparkTransform, PySparkTransform | Yes | The transform component that executes SQL or Python code to transform data. |

BigQueryDataPlaneOptions

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| partition_by | | Any of: BigQueryRangePartitioning, BigQueryTimePartitioning | No | Partition By clause for the table. |
| cluster_by | | array[string] | No | Clustering keys to be added to the table. |

BigQueryRangePartitioning

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| field | | string | Yes | Field to partition by. |
| range | | RangeOptions | Yes | Range partitioning options. |

BigQueryTimePartitioning

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| field | | string | Yes | Field to partition by. |
| granularity | | string ("DAY", "HOUR", "MONTH", "YEAR") | Yes | Granularity of the time partitioning. |
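Putting the pieces together, a component's `data_plane` entry with time partitioning nests as `bigquery` (BigQueryDataPlaneOptions), then `partition_by` (BigQueryTimePartitioning). The sketch below shows that nesting as a Python dictionary with a small sanity check. The column names are hypothetical, and `check_time_partitioning` is an illustrative helper, not a validator shipped with the product.

```python
import json

# Allowed values, copied from the BigQueryTimePartitioning table.
VALID_GRANULARITIES = {"DAY", "HOUR", "MONTH", "YEAR"}

data_plane = {
    "bigquery": {
        "partition_by": {
            "field": "event_ts",      # hypothetical timestamp column
            "granularity": "DAY",
        },
        "cluster_by": ["customer_id", "region"],  # hypothetical clustering keys
    }
}


def check_time_partitioning(partition_by: dict) -> None:
    """Raise ValueError if a required property is missing or the granularity is invalid."""
    missing = [key for key in ("field", "granularity") if key not in partition_by]
    if missing:
        raise ValueError(f"missing required properties: {missing}")
    if partition_by["granularity"] not in VALID_GRANULARITIES:
        raise ValueError(f"unsupported granularity: {partition_by['granularity']!r}")


check_time_partitioning(data_plane["bigquery"]["partition_by"])
print(json.dumps(data_plane, indent=2))
```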

RangeOptions

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| start | | integer | Yes | Start of the range partitioning. |
| end | | integer | Yes | End of the range partitioning. |
| interval | | integer | Yes | Interval of the range partitioning. |
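As a worked illustration of how `start`, `end`, and `interval` carve up an integer column, the sketch below computes the lower bound of the partition a value would land in. It follows BigQuery's integer-range semantics, where `start` is inclusive and `end` is exclusive; values outside that range belong to no range partition, which the function signals with `None`. `range_partition_start` is a hypothetical helper name.

```python
def range_partition_start(value: int, start: int, end: int, interval: int):
    """Return the lower bound of the partition holding value, or None if out of range."""
    if interval <= 0:
        raise ValueError("interval must be a positive integer")
    # start is inclusive, end is exclusive.
    if value < start or value >= end:
        return None
    # Partitions are [start, start + interval), [start + interval, start + 2 * interval), ...
    return start + ((value - start) // interval) * interval


# With start=0, end=100, interval=10 there are ten partitions: [0, 10), [10, 20), ..., [90, 100).
print(range_partition_start(25, start=0, end=100, interval=10))
print(range_partition_start(100, start=0, end=100, interval=10))
```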