Version: 3.0.0

BigQuery Write Component

A component that writes data to a BigQuery table.

Examples

component:
  write:
    connection: my_connection
    input:
      name: data_input
      flow: my_flow
    bigquery:
      table:
        name: my_table
        dataset: my_dataset

BigQueryWriteComponent

BigQueryWriteComponent is defined beneath the following ancestor nodes in the YAML structure: Component, WriteComponent.

Below are the properties for the BigQueryWriteComponent. Each property links to the specific details section further down in this page.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| connection | | string | Yes | The name of the connection to use for writing data. |
| input | | InputComponent | Yes | Input component name. |
| normalize | | boolean | No | A boolean flag indicating if the output column names should be normalized to a standard naming convention when writing. |
| preserve_case | | boolean | No | A boolean flag indicating if the case of the column names should be preserved when writing. |
| uppercase | | boolean | No | A boolean flag indicating if the column names should be transformed to uppercase when writing. |
| strategy | full: { mode: drop_and_recreate } | Any of: string ("snapshot"), FullWriteStrategy, IncrementalWriteStrategyWithSchemaChange, PartitionedWriteStrategyWithSchemaChange | No | Resource for write strategy. |
| bigquery | | SingleTableWithDataset | Yes | |
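
As a rough sketch of how an optional property from the table above sits alongside the required ones, the fragment below extends the example at the top of this page with `normalize`. The connection, flow, table, and dataset names are placeholders. This page does not say how `normalize`, `preserve_case`, and `uppercase` interact, so only one of them is set here.

```yaml
component:
  write:
    connection: my_connection      # placeholder connection name
    input:
      name: data_input
      flow: my_flow
    # Optional: normalize output column names when writing.
    normalize: true
    bigquery:
      table:
        name: my_table
        dataset: my_dataset
```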

Property Details

Component

A component is a fundamental building block of a data flow. Supported component types include read, transform, task, test, and more.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| component | | One of: ReadComponent, TransformComponent, TaskComponent, SingularTestComponent, CustomPythonReadComponent, WriteComponent, CompoundComponent, AliasedTableComponent, ExternalTableComponent | Yes | Configuration options for the component. |

WriteComponent

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| name | | string | No | The name of the model. |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
| flow_name | | string | No | The name of the flow that the component belongs to. |
| skip | | boolean | No | A boolean flag indicating whether to skip processing for the component or not. |
| skip_for_time_series_runs | | boolean | No | A boolean flag indicating whether to skip processing for this component in time-series runs. |
| write | | One of: BigQueryWriteComponent, SnowflakeWriteComponent, S3WriteComponent, MySQLWriteComponent, OracleWriteComponent | Yes | |
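
The optional WriteComponent properties sit next to `write` rather than inside it, at least as the property tables on this page describe the nesting. A minimal sketch, in which the component name and description are hypothetical values chosen for illustration:

```yaml
component:
  name: orders_to_bigquery                     # hypothetical component name
  description: Writes order data to BigQuery.  # hypothetical description
  skip: false                                  # set to true to skip processing
  write:
    connection: my_connection
    input:
      name: data_input
      flow: my_flow
    bigquery:
      table:
        name: my_table
        dataset: my_dataset
```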

SingleTableWithDataset

Options for reading from a single table in a specific dataset. Useful for platforms like BigQuery.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| table | | TableWithDatasetOptions | Yes | Table (in specified dataset) to read data from. |

FullWriteStrategy

Container for specifying the full write strategy.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| full | | FullWriteStrategyOptions | Yes | Options to use when fully writing data to a Write component. |

FullWriteStrategyOptions

Resource options for full writes, including mode selection.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| mode | | FullWriteModeEnum | Yes | Specifies the mode to use when fully writing data: 'drop_and_recreate' to drop the output table and recreate it. |
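
Written out explicitly, a full write strategy nests `mode` under `full`, which in turn sits under the component's `strategy` property. The sketch below spells out what this page lists as the default; the required `connection`, `input`, and `bigquery` keys are left out for brevity.

```yaml
component:
  write:
    # connection, input, and bigquery omitted for brevity
    strategy:
      full:
        mode: drop_and_recreate   # drop the output table and recreate it
```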

FullWriteModeEnum

No properties defined.

IncrementalWriteStrategyWithSchemaChange

Container for specifying the incremental write strategy that supports different behaviors when schema changes.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| incremental | | IncrementalWriteStrategyOptions | Yes | Options to use when incrementally writing data to a Write component. |
| on_schema_change | | string ("ignore", "fail", "drop_and_recreate", "append_new_columns", "sync_all_columns") | No | Policy to apply when schema changes are detected. |

IncrementalWriteStrategyOptions

Resource options for incremental writes, including mode selection and criteria for detecting deletions and unique records.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| mode | | IncrementalWriteModeEnum | Yes | Specifies the mode to use when incrementally writing data: 'append' to append new or modified records, 'upsert' to insert new records and update modified records, and 'sync' to perform an upsert and also delete records that were deleted at the source. |
| column | | string | Yes | Name of the column to use for tracking incremental updates to the data. |
| change_detection | | UniqueKey | No | Options for detecting record changes when comparing source and destination data. |
| deletion_col | | string | No | Column name to use for identifying deleted records when 'soft deletion' is used at the source. |
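
To illustrate how the incremental options nest, here is a sketch of an upsert strategy built only from the properties listed on this page. The column names `updated_at` and `order_id` are hypothetical, and the required `connection`, `input`, and `bigquery` keys are omitted for brevity. Note that `on_schema_change` is a sibling of `incremental`, not a child of it.

```yaml
component:
  write:
    # connection, input, and bigquery omitted for brevity
    strategy:
      incremental:
        mode: upsert                # insert new records, update modified ones
        column: updated_at          # hypothetical column tracking incremental updates
        change_detection:
          unique_key: order_id      # hypothetical unique identifier column
      on_schema_change: append_new_columns
```

For sources that use soft deletion, `deletion_col` would be added next to `mode` and `column`.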

IncrementalWriteModeEnum

No properties defined.

InputComponent

Specification for input components, including how partitioning behaviors should be handled. This additional metadata is required when a component is used as an input to other components in a flow.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| flow | | string | Yes | Name of the parent flow that the input component belongs to. |
| name | | string | Yes | The input component name. |
| alias | | string | No | The alias to use for the input component. |
| partition_spec | | Any of: string ("full_reduction", "map"), RepartitionSpec | No | The type of partitioning to apply to the component's input data before processing the component's logic. Input partitioning is applied before the component's logic is executed. |
| where | | string | No | An optional filter condition to apply to the input component's data. |
| partition_binding | | Any of: string, PartitionBinding | No | An optional partition binding specification to apply to the component on a per-output-partition basis against other inputs' partitions. |
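
A sketch of an `input` block that uses some of the optional properties above. The alias is hypothetical, and the `where` value assumes the filter is written as a SQL-style predicate, which this page does not confirm.

```yaml
input:
  flow: my_flow
  name: data_input
  alias: orders                     # hypothetical alias
  partition_spec: full_reduction    # one of the two documented string values
  where: "status = 'complete'"      # assumed SQL-style filter; column is hypothetical
```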

PartitionBinding

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| logical_operator | | string ("AND", "OR") | No | The logical operator used to combine the provided partition binding predicates. |
| predicates | | array[string] | No | The list of partition binding predicates to apply to the input component's data. |

PartitionedWriteStrategyWithSchemaChange

Container for specifying the partitioned write strategy that supports different behaviors when schema changes.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| partitioned | | PartitionedWriteStrategyOptions | Yes | Options to use when writing data in partitions to a Write component. |
| on_schema_change | | string ("ignore", "fail", "drop_and_recreate", "append_new_columns", "sync_all_columns") | No | Policy to apply when schema changes are detected. |

PartitionedWriteStrategyOptions

Resource options for partitioned writes, including mode selection and the column used for partitioning.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| mode | | PartitionedWriteModeEnum | Yes | Specifies the mode to use when writing data in partitions: 'append' to append new or modified partitions, 'insert_overwrite' to insert new partitions and overwrite modified partitions, and 'sync' to perform an insert_overwrite and also delete partitions that were deleted at the source. |
| partition_col | | string | No | Column name used for partitioning. Defaults to the internal Ascend partition identifier. |
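
The partitioned strategy follows the same shape as the other strategies on this page. In the sketch below, `event_date` is a hypothetical column name; leaving `partition_col` out would fall back to the internal Ascend partition identifier. The required `connection`, `input`, and `bigquery` keys are omitted for brevity.

```yaml
component:
  write:
    # connection, input, and bigquery omitted for brevity
    strategy:
      partitioned:
        mode: insert_overwrite      # insert new partitions, overwrite modified ones
        partition_col: event_date   # hypothetical partition column
      on_schema_change: fail
```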

PartitionedWriteModeEnum

No properties defined.

RepartitionSpec

Specification for repartitioning operations on input component's data

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| repartition | | RepartitionOptions | No | Options for repartitioning the input component's data. |

RepartitionOptions

Options for repartitioning the input component's data.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| partition_by | | string | Yes | The column to partition by. |
| granularity | | string | Yes | The granularity to use for the partitioning. |
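
When `partition_spec` is given as a RepartitionSpec instead of a string, the options nest under `repartition`. This is only a sketch: `event_ts` is a hypothetical column, and `day` is an assumed granularity value, since this page does not list the accepted values.

```yaml
input:
  flow: my_flow
  name: data_input
  partition_spec:
    repartition:
      partition_by: event_ts   # hypothetical column to partition by
      granularity: day         # assumed value; accepted values are not listed here
```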

TableWithDatasetOptions

Options for reading from a specific table in a dataset.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| name | | string | Yes | Name of the table to be read. |
| dataset | | string | No | Dataset of the table, specific to platforms like BigQuery. |

UniqueKey

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| unique_key | | string | Yes | Column or set of columns used as a unique identifier for records. |