Skip to main content
Version: 3.0.0

MSSQL Read Component

A component that reads data from a MSSQL Server database, options include ingesting a single table / query, or multiple tables / queries.

Examples

component:
read:
connection: my-mssql-connection
mssql:
tables:
- name: table1
schema: schema1
- name: table2
schema: schema2
- name: table3
schema: schema3

MSSQLReadComponent

info

MSSQLReadComponent is defined beneath the following ancestor nodes in the YAML structure:

Below are the properties for the MSSQLReadComponent. Each property links to the specific details section further down in this page.

PropertyDefaultTypeRequiredDescription
event_timestring
NoTimestamp column in the component output used to represent event time.
connectionstring
NoThe name of the connection to use for reading data.
columnsarray[ComponentColumn]
NoA list specifying the columns to read from the source and transformations to make during read.
normalizeboolean
NoA boolean flag indicating if the output column names should be normalized to a standard naming convention after reading.
preserve_caseboolean
NoA boolean flag indicating if the case of the column names should be preserved after reading.
uppercaseboolean
NoA boolean flag indicating if the column names should be transformed to uppercase after reading.
strategyAny of:
  string
  IncrementalReadStrategy
  PartitionedStrategy
NoIngest strategy options.
read_optionsDatabaseReadOptions
NoOptions for reading from the database or warehouse.
mssqlAny of:
  SingleTableWithSchema
  MultipleTablesWithSchema
  SingleQuery
  MultipleQueries
Yes

Property Details

Component

A component is a fundamental building block of a data flow. Types of components that are supported include: read, transform, task, test, and more.

PropertyDefaultTypeRequiredDescription
componentOne of:
  ReadComponent
  TransformComponent
  TaskComponent
  SingularTestComponent
  CustomPythonReadComponent
  WriteComponent
  CompoundComponent
  AliasedTableComponent
  ExternalTableComponent
YesConfiguration options for the component.

ReadComponent

A component that reads data from a data system.

PropertyDefaultTypeRequiredDescription
data_plane  One of:
    SnowflakeDataPlane
    BigQueryDataPlane
    DuckdbDataPlane
    SynapseDataPlane
    FabricDataPlane
    DatabricksDataPlane
NoData Plane-specific configuration options for a component.
descriptionstring
NoA brief description of what the model does.
metadataResourceMetadata
NoMeta information of a resource. In most cases it doesn't affect the system behavior but may be helpful to analyze project resources.
namestringYesThe name of the model
flow_namestring
NoThe name of the flow that the component belongs to.
skipboolean
NoA boolean flag indicating whether to skip processing for the component or not.
data_maintenanceDataMaintenance
NoThe data maintenance configuration options for the component.
testsComponentTestOptions
NoDefines tests to run on the data of this component.
readOne of:
  GenericFileReadComponent
  LocalFileReadComponent
  SFTPReadComponent
  S3ReadComponent
  GcsReadComponent
  AbfsReadComponent
  HttpReadComponent
  MSSQLReadComponent
  MySQLReadComponent
  OracleReadComponent
  PostgresReadComponent
  SnowflakeReadComponent
  BigQueryReadComponent
  DatabricksReadComponent
YesThe read component that reads data from a data system.

ComponentColumn

Component column expression definition.

No properties defined.

DatabaseReadOptions

Options for reading from a database or warehouse.

PropertyDefaultTypeRequiredDescription
chunk_size100000integerNoNumber of rows to read from the table at a time.
parallel_readParallelReadOptions
NoOptions for reading from the source in parallel.

IncrementalReadStrategy

Incremental Read Strategy for database read components - this is a combination of the replication strategy that defines how new data is read from the source, and the incremental strategy that defines how this new data is materialized in the output.

PropertyDefaultTypeRequiredDescription
replication  One of:
    Any of:
      string
      CdcReplication
    Any of:
      string
      IncrementalReplication
NoReplication strategy to use for data synchronization.
incrementalAny of:
  string
  MergeStrategy
  SCDType2Strategy
YesIncremental processing strategy.
on_schema_changestring ("ignore", "fail", "append_new_columns", "sync_all_columns")
NoPolicy to apply when schema changes are detected. Defaults to 'fail' if not provided.

CdcReplication

Specifies if Change Data Capture (CDC) is the replication strategy.

PropertyDefaultTypeRequiredDescription
cdcCdcOptions
NoResource for Change Data Capture (CDC), enabling incremental data capture based on changes.

CdcOptions

No properties defined.

IncrementalReplication

Specifies if incremental data reading is the replication strategy.

PropertyDefaultTypeRequiredDescription
incrementalIncrementalColumn
NoResource for incremental data reading based on a specific column.

IncrementalColumn

Specifies the column to be used for incremental reading.

PropertyDefaultTypeRequiredDescription
column_namestringYesName of the column to use for tracking incremental updates to the data.
start_valueAny of:
  string
  integer
  number
  string
NoInitial value to start reading data from the specified column.

MultipleQueries

Options to define one or more arbitrary select statements. The output of the queries will be unioned together, and must return the same database schema.

PropertyDefaultTypeRequiredDescription
queriesarray[string]
NoList of SQL queries to execute for reading data.

MultipleTablesWithSchema

Options for reading from multiple tables in a specific schema.

PropertyDefaultTypeRequiredDescription
tablesarray[TableWithSchemaOptions]YesList of tables (in specified schemas) to read data from.

ParallelReadOptions

Options for reading from a source in parallel. Ascend will logically partition the source data based on the partition column and max partitions, and then read each partition in parallel. If lower and upper bounds for the partition column are provided, they are used as hints to guide partitioning by dividing the range into max_partitions roughly equal partitions.

PropertyDefaultTypeRequiredDescription
partition_columnstringYesThe name of the column to partition the data by. Select a column that could be used to partition the data into smaller chunks and that results in the lowest skew between partitions in terms of record count. This column must either be an integer or timestamp column.
max_partitions-1integerNoThe maximum number of partitions to read concurrently from the source. Since this translates directly to the number of concurrent connections to the source, care should be taken to select a value that does not exceed the source system's connection or other resource limits. A value of -1 means that the value is chosen automatically based on the source.
partition_lower_boundAny of:
  integer
  string
  string
NoThe lower bound of the partition column. If not provided, the minimum value of the partition column will be used.
partition_upper_boundAny of:
  integer
  string
  string
NoThe upper bound of the partition column. If not provided, the maximum value of the partition column will be used.

SingleQuery

Options to define an arbitrary select statement.

PropertyDefaultTypeRequiredDescription
querystringNoSQL query to execute for reading data.

PartitionedStrategy

Partitioned Ingest Strategy. The user is expected to provide 2 functions, a list function that lists partitions in the source, and a read function that reads a partition from the source.

PropertyDefaultTypeRequiredDescription
partitionedPartitionedOptions
NoOptions for partitioning data.
on_schema_changestring ("ignore", "fail", "append_new_columns", "sync_all_columns")
NoPolicy to apply when schema changes are detected. Defaults to 'fail' if not provided.

PartitionedOptions

Options related to partition optimization - in particular, the policy that determines which partitions to ingest.

PropertyDefaultTypeRequiredDescription
enable_substitution_by_partition_namebooleanYesEnable substitution by partition name.
output_typetablestring ("table", "view")NoOutput type for partitioned data. Must be either 'table' or 'view'. This strategy applies only to Transforms.

SCDType2Strategy

The SCD Type 2 strategy allows users to track changes to records over time, by tracking the start and end times for each version of a record. A brief overview of the strategy can be found at https://en.wikipedia.org/wiki/Slowly_changing_dimension#Type_2:_add_new_row.

PropertyDefaultTypeRequiredDescription
scd_type_2KeyOptions
NoOptions for SCD Type 2 strategy.

MergeStrategy

A strategy that involves merging new data with existing data by updating existing records that match the unique key.

PropertyDefaultTypeRequiredDescription
mergeKeyOptions
NoOptions for merge strategy.

KeyOptions

Column options needed for merge and SCD Type 2 strategies, such as unique key and deletion column name.

PropertyDefaultTypeRequiredDescription
unique_keystringYesColumn or comma-separated set of columns used as a unique identifier for records, aiding in the merge process.
deletion_columnstring
NoColumn name used in the upstream source for soft-deleting records. Used when replicating data from a source that supports soft-deletion. If provided, the merge strategy will be able to detect deletions and mark them as deleted in the destination. If not provided, the merge strategy will not be able to detect deletions.
merge_update_columnsAny of:
  string
  array[string]
NoList of columns to include when updating values in merge. These columns are mutually exclusive with respect to the columns in merge_exclude_columns.
merge_exclude_columnsAny of:
  string
  array[string]
NoList of columns to exclude when updating values in merge. These columns are mutually exclusive with respect to the columns in merge_update_columns.
incremental_predicatesAny of:
  string
  array[string]
NoList of conditions to filter incremental data.

SingleTableWithSchema

Options for reading from a single table in a specific schema.

PropertyDefaultTypeRequiredDescription
tableTableWithSchemaOptionsYesTable (in specified schema) to read data from.

TableWithSchemaOptions

Options for reading from a specific table in a schema.

PropertyDefaultTypeRequiredDescription
namestringYesName of the table to be read.
schemastring
NoSchema of the table, if applicable.