HTTP Read
Ingests data from an HTTP endpoint and parses the response as CSV or JSON.
Examples
- http_read_csv_uppercase.yaml
- http_read_component_custom_columns.yaml
- http_read_component_retry.yaml
# http_read_csv_uppercase.yaml
component:
  read:
    http:
      url: "http://example.com/data.csv"
      parser: "csv"
    uppercase: true

# http_read_component_custom_columns.yaml
component:
  read:
    http:
      url: "http://example.com/data"
      parser: "json"
    columns:
      - id
      - name
      - email

# http_read_component_retry.yaml
component:
  read:
    http:
      url: "http://api.example.com/data"
      parser: "json"
      retry:
        max_tries: 5
        jitter: "random_jitter"
HttpReadComponent
HttpReadComponent is defined beneath the following ancestor nodes in the YAML structure:
- Component
- ReadComponent
Below are the properties for the HttpReadComponent. Each property links to the specific details section further down on this page.
Property | Default | Type | Required | Description |
---|---|---|---|---|
connection | | string | No | The name of the connection to use for reading data. |
columns | | array[ComponentColumn] | No | A list specifying the columns to read from the source and transformations to make during read. |
normalize | | boolean | No | A boolean flag indicating if the output column names should be normalized to a standard naming convention after reading. |
preserve_case | | boolean | No | A boolean flag indicating if the case of the column names should be preserved after reading. |
uppercase | | boolean | No | A boolean flag indicating if the column names should be transformed to uppercase after reading. |
http | | HttpReadComponentOptions | Yes | |
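As a minimal sketch of how these properties might sit together, the fragment below places `normalize` next to the required `http` block. The nesting is inferred from the property tables on this page (column-name flags belong to HttpReadComponent, while `url` and `parser` belong to HttpReadComponentOptions), so treat the layout as an assumption rather than a verified configuration.

```yaml
component:
  read:
    # Column-name handling is set on the read component itself (assumed placement).
    normalize: true
    # Required: options for the HTTP request and parser.
    http:
      url: "http://example.com/data.csv"
      parser: "csv"
```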
Property Details
Component
A component is a fundamental building block of a data flow. Types of components that are supported include: read, transform, task, test, and more.
Property | Default | Type | Required | Description |
---|---|---|---|---|
component | | One of: ReadComponent, TransformComponent, TaskComponent, SingularTestComponent, CustomPythonReadComponent, WriteComponent, CompoundComponent, AliasedTableComponent, ExternalTableComponent | Yes | Configuration options for the component. |
ReadComponent
A component that reads data from a data system.
Property | Default | Type | Required | Description |
---|---|---|---|---|
data_plane | | One of: SnowflakeDataPlane, BigQueryDataPlane, DuckdbDataPlane, SynapseDataPlane | No | Data Plane-specific configuration options for a component. |
name | | string | No | The name of the model. |
description | | string | No | A brief description of what the model does. |
metadata | | ResourceMetadata | No | Meta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources. |
flow_name | | string | No | The name of the flow that the component belongs to. |
skip | | boolean | No | A boolean flag indicating whether to skip processing for the component or not. |
data_maintenance | | DataMaintenance | No | The data maintenance configuration options for the component. |
tests | | ComponentTestColumn | No | Defines tests to run on the data of this component. |
read | | One of: GenericFileReadComponent, LocalFileReadComponent, S3ReadComponent, GcsReadComponent, AbfsReadComponent, HttpReadComponent, MSSQLReadComponent, MySQLReadComponent, OracleReadComponent, PostgresReadComponent, SnowflakeReadComponent, BigQueryReadComponent | Yes | The read component that reads data from a data system. |
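The optional ReadComponent properties could be combined with an HTTP read roughly as follows. This is a sketch only: the `name` and `description` values are hypothetical, and placing them directly under `component` is an assumption based on the table above.

```yaml
component:
  # Hypothetical identifiers, used for illustration only.
  name: "customers_http"
  description: "Reads customer records from an HTTP endpoint."
  # Set to true to skip processing for this component.
  skip: false
  read:
    http:
      url: "http://example.com/data"
      parser: "json"
```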
HttpReadComponentOptions
Options for reading data from an HTTP endpoint.
Property | Default | Type | Required | Description |
---|---|---|---|---|
parser | | string ("json", "csv") | Yes | Parser to use for HTTP-based data. Can be one of 'json' or 'csv'. |
url | | string | Yes | URL to retrieve the data from. |
retry | | Retry | No | Retry Resource for HTTP requests. |
Retry
Options for specifying retry behavior.
Property | Default | Type | Required | Description |
---|---|---|---|---|
max_tries | | integer | No | Maximum number of retries to attempt. |
max_time | | integer | No | Maximum time to retry in seconds. |
jitter | | string ("random_jitter", "full_jitter") | No | Type of jitter to apply to retry intervals. |
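A retry block that sets all three options might look like the sketch below. The values are illustrative, and this page does not state how `max_tries` and `max_time` interact when both are set, so no particular precedence should be assumed.

```yaml
component:
  read:
    http:
      url: "http://api.example.com/data"
      parser: "json"
      retry:
        # Illustrative limits: up to 3 retries, within 60 seconds of retrying.
        max_tries: 3
        max_time: 60
        # One of "random_jitter" or "full_jitter".
        jitter: "full_jitter"
```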
ComponentColumn
Component column expression definition.
No properties defined.