Version: 3.0.0

Create a Google Cloud Storage Read Component

This guide walks you through creating a Read Component that ingests data from Google Cloud Storage (GCS).

Prerequisites

Create a new Component

Begin from your workspace Super Graph view. Follow these steps to create your component:

  1. Double-click the Flow where you want to create your component
  2. Right-click anywhere in the Flow Graph
  3. Hover over Create Component, then over Read in the expanded menu, and click From Scratch
  4. Complete the form with these details:
    • Select your Flow
    • Enter a descriptive Component Name like read_sales
    • Select YAML as your file type

Create your Google Cloud Storage Read Component

Structure your Google Cloud Storage Read Component following this pattern:

  1. Reference your GCS connection: Specify which GCS connection to read from
  2. Add the gcs key: Configure the specific GCS settings
    • path: Specify the path within the bucket to read from
    • include: Define file patterns to include specific files
  3. Add the parser configuration: Specify file format and parsing options
  4. Additional configuration options: Include any GCS-specific settings

Example

read_gcs.yaml
component:
  read:
    connection: read_gcs_lake
    gcs:
      path: listing/partitioned/year=20
      include:
        - glob: "**/year=2024/month=0*/**/*.csv"
      parser:
        csv:
          sep: "|"
          has_header: true
          date_format: "%Y/%m/%d"
          timestamp_format: '%Y-%m-%d %H:%M:%S'

This example shows how to:

  • Read from a partitioned directory structure in your GCS bucket
  • Filter files using glob patterns to target specific partitions (2024 data from months whose partition value starts with 0, such as 01 through 09)
  • Configure a custom CSV parser with pipe delimiter
  • Specify date and timestamp formats for proper parsing
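
To get a feel for which files the include pattern selects, the sketch below translates the glob from the example into a regular expression and tests it against a few object keys. It rests on assumptions: it uses the common convention that `**/` matches zero or more directories and `*` matches within a single path segment, which may differ from how Ascend evaluates globs. The helper names (`glob_to_regex`, `matches`) and the sample keys are hypothetical.

```python
import re

# Glob taken from the example component above.
PATTERN = "**/year=2024/month=0*/**/*.csv"


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob to a regex under assumed, commonly used semantics:
    '**/' matches zero or more directories, '*' stays within one segment.
    Other glob features (?, [...], {a,b}) are not handled in this sketch."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:[^/]+/)*")  # zero or more "dir/" segments
            i += 3
        elif pattern[i] == "*":
            parts.append(r"[^/]*")  # any characters except the separator
            i += 1
        else:
            parts.append(re.escape(pattern[i]))  # literal character
            i += 1
    return re.compile("".join(parts))


_REGEX = glob_to_regex(PATTERN)


def matches(key: str) -> bool:
    """Return True if the whole object key matches the example glob."""
    return _REGEX.fullmatch(key) is not None


if __name__ == "__main__":
    # Hypothetical object keys, for illustration only.
    sample_keys = [
        "listing/partitioned/year=2024/month=03/day=01/part-0.csv",
        "year=2024/month=09/data.csv",
        "year=2024/month=11/data.csv",
        "year=2023/month=03/data.csv",
        "year=2024/month=03/data.parquet",
    ]
    for key in sample_keys:
        print(f"{key}: {matches(key)}")
```

Under these assumptions, only the first two keys match: the third is in month 11, the fourth is in 2023, and the fifth is not a CSV file.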

For a complete list of configuration options and advanced settings, see this reference guide.
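
The date_format and timestamp_format values in the example look like strftime-style directives. As a quick check of what values they accept, the sketch below parses sample strings with Python's `datetime.strptime`. This assumes the parser interprets the directives the way Python does, which this guide does not state. The helper names and sample values are hypothetical.

```python
from datetime import date, datetime

# Format strings copied from the example component.
DATE_FORMAT = "%Y/%m/%d"
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_date(text: str) -> date:
    """Parse a date such as '2024/03/15'; raises ValueError on a mismatch."""
    return datetime.strptime(text, DATE_FORMAT).date()


def parse_timestamp(text: str) -> datetime:
    """Parse a timestamp such as '2024-03-15 08:30:00'."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)


if __name__ == "__main__":
    print(parse_date("2024/03/15"))
    print(parse_timestamp("2024-03-15 08:30:00"))
    try:
        # Dashes instead of slashes do not fit DATE_FORMAT.
        parse_date("2024-03-15")
    except ValueError as err:
        print(f"rejected: {err}")
```

A value that does not fit the configured format, such as a date written with dashes when the format expects slashes, fails to parse in this sketch, so it is worth confirming the formats against a sample of the actual files.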

🎉 Congratulations! You successfully created a Google Cloud Storage Read Component in Ascend.