Version: 3.0.0

Debug Python in Ascend

Keep your Ascend pipelines running smoothly with these essential Python debugging techniques.

Explain errors with Otto

Stuck on an error that's blocking your progress? Otto can help.

When you encounter an issue:

  1. Click the error message in the build info panel
  2. View the error details in their dedicated tab
  3. Click the sparkle icon in the top right to start an Otto chat in the sidebar
  4. Get Otto's assistance to resolve the issue

Logging

Monitor Python Component execution with Ascend's built-in logging capabilities to trace code execution and identify issues.

To implement logging, import the log function from the Ascend package:

from ascend.common.events import log

For practical examples, see Otto's Expeditions, our sample project.

The following example adds logs to an Incremental Read Component to confirm that only rows newer than the latest stored timestamp are read:

log.py
import polars as pl
import pyarrow as pa
from ascend.application.context import ComponentExecutionContext
from ascend.common.events import log
from ascend.resources import read


@read(
    strategy="incremental",
    incremental_strategy="merge",
    unique_key="id",
    on_schema_change="sync_all_columns",
)
def read_metabook(context: ComponentExecutionContext) -> pa.Table:
    df = pl.read_parquet("gs://ascend-io-gcs-public/ottos-expeditions/lakev0/generated/events/metabook.parquet/year=*/month=*/day=*/*.parquet")
    current_data = context.current_data()
    if current_data is not None:
        # Only keep rows newer than the latest timestamp already loaded
        current_data = current_data.to_polars()
        max_ts = current_data["timestamp"].max()
        log(f"Reading data after {max_ts}")
        df = df.filter(df["timestamp"] > max_ts)
    else:
        log("No current data found, reading all data")

    log(f"Returning {df.height} rows")
    return df.to_arrow()

Testing

Implement tests to catch issues before they impact your production pipelines. Well-structured tests help identify problems early and ensure Component reliability.
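
As a minimal sketch of what such a test might look like, the example below pulls the incremental filter from the logging example into a plain function and checks it with assertions. The helper name `rows_after` and the sample records are hypothetical and not part of Ascend's API; the same idea should carry over to a Polars DataFrame.

```python
from datetime import datetime


def rows_after(rows, max_ts):
    """Return only rows whose timestamp is strictly newer than max_ts.

    Hypothetical helper for illustration. When max_ts is None (no data
    loaded yet), every row is kept.
    """
    if max_ts is None:
        return list(rows)
    return [row for row in rows if row["timestamp"] > max_ts]


def test_rows_after_keeps_only_newer_rows():
    rows = [
        {"id": 1, "timestamp": datetime(2024, 1, 1)},
        {"id": 2, "timestamp": datetime(2024, 1, 2)},
        {"id": 3, "timestamp": datetime(2024, 1, 3)},
    ]
    result = rows_after(rows, datetime(2024, 1, 2))
    assert [row["id"] for row in result] == [3]


def test_rows_after_returns_everything_on_first_run():
    rows = [{"id": 1, "timestamp": datetime(2024, 1, 1)}]
    assert rows_after(rows, None) == rows
```

Tests written this way can be collected by a runner such as pytest, or called directly as ordinary functions, without needing a running pipeline.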

Common Debugging Strategies

Incremental Development

Break down complex Components into smaller functions, testing each one before combining them. This approach makes issues easier to isolate and fix.
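
One way to picture this, as a rough sketch: the functions below split a small job into a parse step, a validation step, and a step that combines them, with each piece checked on its own first. All names and the `id,amount` input format are hypothetical and chosen only for illustration.

```python
def parse_record(raw):
    """Split an 'id,amount' string into a dict with typed fields."""
    record_id, amount = raw.split(",")
    return {"id": int(record_id), "amount": float(amount)}


def is_valid(record):
    """Keep only records with a non-negative amount."""
    return record["amount"] >= 0


def build_output(raw_lines):
    """Combine the small steps once each one has been checked."""
    parsed = [parse_record(line) for line in raw_lines]
    return [record for record in parsed if is_valid(record)]


# Check each piece on its own before running the combined function.
assert parse_record("7,12.5") == {"id": 7, "amount": 12.5}
assert is_valid({"id": 7, "amount": 12.5})
assert not is_valid({"id": 8, "amount": -1.0})
assert build_output(["7,12.5", "8,-1"]) == [{"id": 7, "amount": 12.5}]
```

If the combined function misbehaves, the per-step checks narrow the problem down to a single small function.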

Error Handling

Implement robust error handling to capture and report issues effectively:

import traceback

from ascend.common.events import log

try:
    result = process_data(input_data)
    log(f"Successfully processed {len(result)} records")
except Exception as e:
    log(f"Error processing data: {e}")
    # Log the full traceback so the failing line is visible in the logs
    log(traceback.format_exc())
    # Re-raise so the Component run is still marked as failed
    raise

Data Inspection

Log data samples at critical points to verify structure and content:

def transform(data):
    # Log the first two records before transforming
    log(f"Input data sample: {data[:2]}")

    # Your transformation logic
    result = [transform_record(record) for record in data]

    # Log the first two records after transforming
    log(f"Output data sample: {result[:2]}")
    return result

Parameter Validation

Verify Component parameters early to catch configuration issues:

def read(self, read_options):
    # Validate required parameters
    if not self.configuration.get('source_table'):
        log("Missing required parameter: source_table")
        raise ValueError("source_table must be specified")

    # Continue with read operation
    log(f"Reading from {self.configuration['source_table']}")

By combining these strategies with Ascend's logging capabilities, you can effectively identify and resolve Python issues in your data pipelines.