Build a Custom Module
Modules are powerful, reusable groups of derived Components bundled into a single visual unit within your Flow Graph. Custom Modules give you full programmatic control over Component generation, allowing you to implement complex logic that goes beyond what template-based Modules can provide.
When to use Custom Modules
Choose Custom Modules when you need:
- Complex conditional logic for Component generation
- Dynamic number of Components based on configuration
- Custom validation or processing of configuration
- Integration with external systems during build time
- Full control over Component structure and relationships
For simpler use cases with template-based generation, see Blueprint Modules.
Create a Custom Module
1. Define the configuration model
Use Pydantic to define a typed configuration schema:
from pydantic import BaseModel, Field
from typing import Optional

class QualityConfig(BaseModel):
    """Configuration for the data quality Module."""
    input_table: str = Field(..., description="Source table to validate")
    threshold: int = Field(default=50, description="Quality score threshold")
    enable_detailed_checks: bool = Field(default=False, description="Run additional checks")
    output_format: Optional[str] = Field(default="parquet", description="Output format")
2. Implement the Module class
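As a rough sketch of the generation logic such a class might contain, the example below builds one or two sub-component definitions depending on the configuration. It is not the framework's actual API: `StubContext` is a hypothetical stand-in for `ModuleBuildContext`, a plain dataclass stands in for the Pydantic `QualityConfig`, the `quality_score` column is invented for illustration, and returning a list of YAML strings is an assumption about what `components()` produces. The `@module` decorator and any base class are left out because their import paths are not shown here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QualityConfig:
    """Stand-in for the Pydantic QualityConfig defined in step 1."""
    input_table: str
    threshold: int = 50
    enable_detailed_checks: bool = False
    output_format: str = "parquet"


@dataclass
class StubContext:
    """Hypothetical stand-in for ModuleBuildContext (not the real class)."""
    module_component_name: str

    def fully_qualified_component_name(self, name: str) -> str:
        # Mirrors the documented parent__child naming pattern.
        return f"{self.module_component_name}__{name}"


def quality_components(config: QualityConfig, context: StubContext) -> List[str]:
    """Return one YAML definition per generated sub-component (assumed shape)."""
    score_name = context.fully_qualified_component_name("score")
    # quality_score is a hypothetical column used only for illustration.
    definitions = [f"""
component:
  name: {score_name}
  transform:
    sql: |
      SELECT * FROM {{{{ ref('{config.input_table}') }}}}
      WHERE quality_score >= {config.threshold}
"""]
    # The number of generated components depends on the configuration.
    if config.enable_detailed_checks:
        detail_name = context.fully_qualified_component_name("detailed_checks")
        definitions.append(f"""
component:
  name: {detail_name}
  transform:
    sql: |
      SELECT * FROM {{{{ ref('{score_name}') }}}}
""")
    return definitions
```

Keeping the generation logic in a plain function like this makes it easy to exercise locally before wiring it into a real Module class.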
3. Configure the Module
Create a YAML file that references your Custom Module:
component:
  module:
    module_id: data_quality  # Matches @module(name="...")
    config:
      input_table: raw_events
      threshold: 75
      enable_detailed_checks: true
      output_format: json
ModuleBuildContext reference
The context parameter provides access to build-time information:
| Property | Description |
|---|---|
| module_component_name | Name of the parent Module Component |
| flow_name | Name of the containing Flow |
| flow_build_context | Full FlowBuildContext instance |
| flow_options | FlowOptions with parameters and configuration |
| fully_qualified_component_name(name) | Creates fully qualified sub-component name |
Access Flow parameters
Use context.flow_options.parameters to access Flow and Profile parameters at build time:
def components(self, config: MyConfig, context: ModuleBuildContext):
    # Access flow parameters
    flow_params = context.flow_options.parameters
    # Conditionally generate components based on parameters
    if flow_params.get("enable_advanced_features", False):
        # Generate additional components
        pass
Create fully qualified names
Sub-components use the parent__child naming pattern. Use fully_qualified_component_name() to generate correct names:
def components(self, config: MyConfig, context: ModuleBuildContext):
    # If Module is named "my_module", this creates "my_module__transform"
    transform_name = context.fully_qualified_component_name("transform")
    # Use in component definition
    yaml_def = f"""
    component:
      name: {transform_name}
      transform:
        sql: SELECT * FROM ...
    """
Reference Components
Reference sibling sub-components
Within a Module, reference other sub-components using fully qualified names:
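A minimal sketch of the pattern, assuming the `parent__child` naming described earlier. Here `qualified` is a hypothetical helper standing in for `context.fully_qualified_component_name()`, and the sub-component names `clean` and `aggregate` and the source `raw_events` are illustrative only.

```python
from typing import Tuple


def qualified(module_name: str, child: str) -> str:
    """Hypothetical stand-in for context.fully_qualified_component_name()."""
    return f"{module_name}__{child}"


def sibling_definitions(module_name: str) -> Tuple[str, str]:
    """Build two sub-component definitions where the second reads from the first."""
    clean_name = qualified(module_name, "clean")
    aggregate_name = qualified(module_name, "aggregate")

    clean_yaml = f"""
component:
  name: {clean_name}
  transform:
    sql: SELECT * FROM {{{{ ref('raw_events') }}}}
"""
    # The sibling is referenced by its fully qualified name, not by "clean".
    aggregate_yaml = f"""
component:
  name: {aggregate_name}
  transform:
    sql: SELECT COUNT(*) AS row_count FROM {{{{ ref('{clean_name}') }}}}
"""
    return clean_yaml, aggregate_yaml
```

Computing each name once and reusing the variable in both the `name:` field and the `ref()` call keeps the two from drifting apart.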
Reference external Components
Reference Components outside the Module using standard ref():
transform_yaml = f"""
component:
  name: {transform_name}
  transform:
    sql: |
      SELECT * FROM {{{{ ref('external_component') }}}}
      -- Or from another Flow
      UNION ALL
      SELECT * FROM {{{{ ref('other_component', flow='other_flow') }}}}
"""
Reference Module sub-components from outside
From other Components in your Flow, reference Module sub-components using fully qualified names:
-- Reference a sub-component of my_module Module
SELECT * FROM {{ ref('my_module__transform') }}
Complete example
Here's a full Custom Module that creates a parameterized ETL pipeline:
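The sketch below illustrates only the generation logic for such a pipeline, under stated assumptions: the configuration fields are taken from the usage YAML, a dataclass stands in for the Pydantic model, the `>` comparison for the filter is a guess, the stage names `filtered` and `aggregated` are hypothetical, and the real Module class, decorator, and return type are not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EtlConfig:
    """Stand-in config; field names are taken from the usage YAML."""
    source_table: str
    destination_schema: str
    filter_column: str
    filter_value: int
    include_aggregations: bool = False


def etl_definitions(config: EtlConfig, module_name: str) -> List[str]:
    """Build a YAML definition for each pipeline stage (assumed output shape)."""
    # destination_schema is carried in the config but unused in this sketch,
    # because how a destination is declared is not shown here.
    filtered = f"{module_name}__filtered"  # parent__child naming pattern
    stages = [f"""
component:
  name: {filtered}
  transform:
    sql: |
      SELECT * FROM {{{{ ref('{config.source_table}') }}}}
      WHERE {config.filter_column} > {config.filter_value}
"""]
    # The aggregation stage is generated only when requested.
    if config.include_aggregations:
        stages.append(f"""
component:
  name: {module_name}__aggregated
  transform:
    sql: |
      SELECT COUNT(*) AS row_count FROM {{{{ ref('{filtered}') }}}}
""")
    return stages
```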
Usage:
component:
  module:
    module_id: etl_pipeline
    config:
      source_table: raw_sales
      destination_schema: analytics
      filter_column: amount
      filter_value: 0
      include_aggregations: true
Best practices
- Use Pydantic Field descriptions: Document parameters with Field(description="...") for better discoverability
- Validate configuration: Add Pydantic validators for complex validation logic
- Single purpose: Each Module should have one focused goal
- Type hints: Use proper type hints for all configuration fields
- Escape Jinja: Double the braces ({{{{ }}}}) in f-strings so that Jinja receives {{ }}
- Test locally: Validate your Module logic before deploying
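The brace-escaping rule can be checked with plain Python, since it is standard f-string behavior rather than anything framework-specific. The component name `raw_events` below is illustrative.

```python
component = "raw_events"

# Quadruple braces survive f-string formatting as Jinja's double braces.
escaped = f"SELECT * FROM {{{{ ref('{component}') }}}}"

# Double braces collapse to single braces, which Jinja does not treat
# as an expression, so the ref() call would never be rendered.
collapsed = f"SELECT * FROM {{ ref('{component}') }}"

print(escaped)    # SELECT * FROM {{ ref('raw_events') }}
print(collapsed)  # SELECT * FROM { ref('raw_events') }
```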
Next steps
- Learn about Blueprint Modules for template-based generation
- Review the Modules concept guide