HTTP Read
The HTTP Read component ingests data from an HTTP endpoint.
Examples
- http_read_csv_uppercase.yaml
- http_read_component_retry.yaml
- http_read_component_custom_columns.yaml
component:
  read:
    http:
      url: "http://example.com/data.csv"
      parser: "csv"
    uppercase: true
component:
  read:
    http:
      url: "http://api.example.com/data"
      parser: "json"
  retry_strategy:
    stop_after_attempt: 5
    stop_after_delay: 600
component:
  read:
    http:
      url: "http://example.com/data"
      parser: "json"
    columns:
      - id
      - name
      - email
HttpReadComponent
HttpReadComponent is defined beneath the following ancestor nodes in the YAML structure:
- Component
- ReadComponent
Below are the properties for the HttpReadComponent. Each property links to the specific details section further down in this page.
Property | Default | Type | Required | Description |
---|---|---|---|---|
dependencies | | array[None] | No | List of dependencies that must complete before this Component runs. |
event_time | | string | No | Timestamp column in the component output used to represent event time. |
connection | | string | No | Name of the Connection to use for reading data. |
columns | | array[None] | No | List specifying the columns to read from the source and transformations to make during read. |
normalize | | boolean | No | Boolean flag indicating whether the output column names should be normalized to a standard naming convention after reading. |
preserve_case | | boolean | No | Boolean flag indicating whether the case of the column names should be preserved after reading. |
uppercase | | boolean | No | Boolean flag indicating whether the column names should be transformed to uppercase after reading. |
http | | HttpReadComponentOptions | Yes | Options for reading data from an HTTP endpoint. |
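As a minimal sketch of how the properties above might combine, the following assumes that `normalize` and `event_time` sit alongside `http` under `read`, as the table suggests. The URL and the `created_at` column name are placeholders, not values from a real project.

```yaml
component:
  read:
    http:
      url: "http://example.com/events.csv"   # placeholder URL
      parser: "csv"
    normalize: true          # normalize output column names after reading
    event_time: created_at   # hypothetical timestamp column in the output
```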
Property Details
Component
A Component is a fundamental building block of a data Flow. Supported Component types include: Read, Transform, Task, Test, and more.
Property | Default | Type | Required | Description |
---|---|---|---|---|
component | | One of: CustomPythonReadComponent, ApplicationComponent, AliasedTableComponent, ExternalTableComponent | Yes | Component configuration options. |
ReadComponent
Component that reads data from a system.
Property | Default | Type | Required | Description |
---|---|---|---|---|
data_plane | | One of: SnowflakeDataPlane, BigQueryDataPlane, DatabricksDataPlane | No | Data Plane-specific configuration options for a component. |
skip | | boolean | No | Boolean flag indicating whether to skip processing for the Component or not. |
retry_strategy | | | No | Retry strategy configuration options for the Component if any exceptions are encountered. |
description | | string | No | A brief description of what the model does. |
metadata | | | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
name | | string | Yes | The name of the model. |
flow_name | | string | No | Name of the Flow that the Component belongs to. |
data_maintenance | | | No | The data maintenance configuration options for the Component. |
tests | | | No | Defines tests to run on this Component's data. |
read | | One of: GenericFileReadComponent, LocalFileReadComponent, SFTPReadComponent, S3ReadComponent, GcsReadComponent, AbfsReadComponent, HttpReadComponent, MSSQLReadComponent, MySQLReadComponent, OracleReadComponent, PostgresReadComponent, SnowflakeReadComponent, BigQueryReadComponent, DatabricksReadComponent | Yes | Read component that reads data from a system. |
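The sketch below illustrates how the ReadComponent-level properties might wrap an HTTP read, assuming they are placed directly under `component`, next to `read`. The name, description, and URL are hypothetical; `stop_after_attempt` is taken from the retry example at the top of this page.

```yaml
component:
  name: customer_feed            # hypothetical Component name
  description: "Customer export fetched over HTTP."
  skip: false                    # set to true to skip processing for this Component
  retry_strategy:
    stop_after_attempt: 3        # give up after three attempts
  read:
    http:
      url: "http://example.com/customers.json"   # placeholder URL
      parser: "json"
```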
HttpReadComponentOptions
Options for reading data from an HTTP endpoint.
Property | Default | Type | Required | Description |
---|---|---|---|---|
parser | | string ("json", "csv") | Yes | Parser to use for HTTP-based data. Can be one of 'json' or 'csv'. |
url | | string | Yes | URL to retrieve the data from. |
ComponentColumn
Component column expression definition.
No properties defined.
InputComponent
Specification for input Components defining how partitioning behaviors should be handled. This metadata is required when a Component serves as an input to other Components within a Flow.
Property | Default | Type | Required | Description |
---|---|---|---|---|
flow | | string | Yes | Name of the parent Flow that the input Component belongs to. |
name | | string | Yes | Name of the input Component. |
alias | | string | No | Alias to use for the input Component. |
partition_spec | | Any of: string ("full_reduction", "map") | No | The type of partitioning to apply to the Component's input data before the Component's logic is executed. |
where | | string | No | Optional filter condition to apply to the input Component's data. |
partition_binding | | Any of: string | No | Optional partition binding specification to apply to the Component on a per-output-partition basis against other inputs' partitions. |
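A minimal sketch of a single InputComponent entry, using only properties from the table above. The Flow and Component names, the alias, and the filter expression are hypothetical, and this page does not say which parent key the entry belongs under, so it is shown as a standalone mapping.

```yaml
# One InputComponent entry (parent key omitted; it is not specified on this page)
flow: sales_flow                 # hypothetical parent Flow name
name: raw_orders                 # hypothetical input Component name
alias: orders                    # optional alias for the input
partition_spec: full_reduction   # one of the documented string values: "full_reduction", "map"
where: "order_total > 0"         # assumed SQL-style filter; the exact syntax is not documented here
```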
PartitionBinding
Property | Default | Type | Required | Description |
---|---|---|---|---|
logical_operator | | string ("AND", "OR") | No | The logical operator to use to combine the partition binding predicates provided. |
predicates | | array[string] | No | List of partition binding predicates to apply to the input Component's data. |
RepartitionSpec
Specification for repartitioning operations on an input Component's data.
Property | Default | Type | Required | Description |
---|---|---|---|---|
repartition | | RepartitionOptions | No | Options for repartitioning the input Component's data. |
RepartitionOptions
Options for repartitioning the input Component's data.
Property | Default | Type | Required | Description |
---|---|---|---|---|
partition_by | | string | Yes | Column to partition by. |
granularity | | string | Yes | Granularity to use for the partitioning. |
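For illustration, a `repartition` block with both required options might look like the sketch below. The column name is hypothetical, and `day` is only an assumed granularity value, since this page does not list the accepted values.

```yaml
repartition:
  partition_by: event_date   # hypothetical column to partition by
  granularity: day           # assumed value; accepted granularities are not listed on this page
```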