Version: 3.0.0

Python Task Component

TaskPythonComponent

TaskPythonComponent is defined beneath the following ancestor nodes in the YAML structure:

Below are the properties for the TaskPythonComponent. Each property links to the specific details section further down in this page.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| event_time | | string | No | Timestamp column in the component output used to represent event time. |
| dependencies | | array[InputComponent] | No | List of dependencies for the generic component. |
| python | | PythonTransformComponent | No | |
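
As a rough illustration of how these three properties fit together, the fragment below sketches a TaskPythonComponent mapping. The column name, the flow and component names, the file path, and the function name are all hypothetical placeholders. The fragment is shown without the ancestor nodes that enclose it, and it assumes that `entrypoint` names the function to call inside the `source` file.

```yaml
# Sketch of a TaskPythonComponent mapping (all values are hypothetical).
event_time: updated_at            # timestamp column in the component output
dependencies:                     # array[InputComponent]
  - flow: example_flow            # required: parent flow of the input
    name: example_input           # required: input component name
python:                           # PythonTransformComponent
  entrypoint: run_task            # required: assumed to be a function name
  source: tasks/example_task.py   # required: assumed path to the Python file
```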

Property Details

Component

A component is a fundamental building block of a data flow. Types of components that are supported include: read, transform, task, test, and more.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| component | | One of: ReadComponent, TransformComponent, TaskComponent, SingularTestComponent, CustomPythonReadComponent, WriteComponent, CompoundComponent, AliasedTableComponent, ExternalTableComponent | Yes | Configuration options for the component. |

TaskComponent

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| name | | string | No | The name of the model. |
| description | | string | No | A brief description of what the model does. |
| metadata | | ResourceMetadata | No | Metadata for the resource. In most cases it does not affect system behavior, but it may be helpful when analyzing project resources. |
| flow_name | | string | No | The name of the flow that the component belongs to. |
| skip | | boolean | No | Whether to skip processing for the component. |
| data_maintenance | | DataMaintenance | No | The data maintenance configuration options for the component. |
| skip_for_time_series_runs | | boolean | No | Whether to skip processing for this component in time-series runs. |
| tests | | ComponentTestColumn | No | Defines tests to run on the data of this component. |
| task | | One of: TaskSqlComponent, TaskPythonComponent | Yes | |
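
Reading the Component and TaskComponent tables together suggests the nesting sketched below: `component` holds the TaskComponent fields, and its required `task` key holds either a TaskSqlComponent or a TaskPythonComponent. This is only a sketch that assumes the keys nest exactly as the tables list them. Every name and path is a hypothetical placeholder.

```yaml
# Assumed nesting: component -> task -> python (values are placeholders).
component:
  name: example_task
  description: Example task that runs a Python function.
  skip: false
  skip_for_time_series_runs: false
  task:                               # required: TaskSqlComponent or TaskPythonComponent
    python:
      entrypoint: run_task
      source: tasks/example_task.py
```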

PythonTransformComponent

Python transform function to execute for transforming the data.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| entrypoint | | string | Yes | The entrypoint for the Python transform function. |
| source | | string | Yes | The source file for the Python transform function. |

InputComponent

Specification for input components, including how partitioning behaviors should be handled. This additional metadata is required when a component is used as an input to other components in a flow.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| flow | | string | Yes | Name of the parent flow that the input component belongs to. |
| name | | string | Yes | The input component name. |
| alias | | string | No | The alias to use for the input component. |
| partition_spec | | Any of: string ("full_reduction", "map"), RepartitionSpec | No | The type of partitioning to apply to the component's input data. Input partitioning is applied before the component's logic is executed. |
| where | | string | No | An optional filter condition to apply to the input component's data. |
| partition_binding | | Any of: string, PartitionBinding | No | An optional partition binding specification to apply to the component on a per-output-partition basis against other inputs' partitions. |
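
To make the InputComponent fields concrete, the fragment below sketches one entry of a `dependencies` list. The flow, component, and alias names are hypothetical, and the `where` string assumes a SQL-style condition, which this page does not specify.

```yaml
# One hypothetical entry of a `dependencies` list (InputComponent).
dependencies:
  - flow: example_flow               # required
    name: orders                     # required
    alias: o                         # optional alias for the input
    partition_spec: full_reduction   # or "map", or a RepartitionSpec mapping
    where: "status = 'complete'"     # assumed SQL-style filter; syntax not given on this page
```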

PartitionBinding

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| logical_operator | | string ("AND", "OR") | No | The logical operator used to combine the provided partition binding predicates. |
| predicates | | array[string] | No | The list of partition binding predicates to apply to the input component's data. |

RepartitionSpec

Specification for repartitioning operations on the input component's data.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| repartition | | RepartitionOptions | No | Options for repartitioning the input component's data. |

RepartitionOptions

Options for repartitioning the input component's data.

| Property | Default | Type | Required | Description |
| --- | --- | --- | --- | --- |
| partition_by | | string | Yes | The column to partition by. |
| granularity | | string | Yes | The granularity to use for the partitioning. |
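
When `partition_spec` is given as a RepartitionSpec rather than a string, the tables above imply a shape like the following sketch. The column name is hypothetical, and `day` is an assumed granularity value, since this page does not list the accepted values.

```yaml
# partition_spec expressed as a RepartitionSpec mapping instead of a string.
partition_spec:
  repartition:                 # RepartitionOptions
    partition_by: event_date   # required: column to partition by (hypothetical name)
    granularity: day           # required: assumed value; accepted values are not listed here
```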