📢 Changelog

🗓️ 2025-11-03

🚀 Features

  • 🤖🐐 New Otto capabilities:
    • Otto agent configuration now supports agent overrides, custom instructions, and a configurable max_turns setting via otto.yaml, including commit message generation with context integration (see the sketch after this list).
  • Write Components now support full refresh operations for object storage (GCS, S3, ABFS), BigQuery, MySQL, Oracle, and PostgreSQL Connection types. Full refresh resets and rebuilds destination tables from scratch.
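
  For illustration, an otto.yaml override might look like the sketch below. The otto directory location follows the convention noted in the 2025-07-07 entry; the field names under agent are assumptions for illustration, not the documented schema:

    otto/otto.yaml
    otto:
      agent:
        # Assumed field names -- shown only to illustrate the shape of an override
        instructions: >-
          Prefer concise answers and always reference the files you changed.
        max_turns: 20              # assumed placement of the max_turns setting noted above
        commit_messages:
          enabled: true            # assumed toggle for commit message generation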

🌟 Improvements​

  • Snowflake authentication workflows updated to use key-pair authentication as password-based auth approaches deprecation.

🛠️ Bug fixes

  • Flow run errors now include the actual error message instead of generic timeout or segfault messages.
  • Fixed DuckDB Connections to PostgreSQL metadata failing with no password supplied errors due to race conditions. Atomic file replacement ensures complete password files are written before reading.
  • Fixed Flow Components in subdirectories not displaying run history due to path resolution issues.
  • Fixed Automation run error display formatting. Error messages now render with resize handles and copy buttons.
  • Fixed editor focus behavior when saving files or accepting AI changes.

🗓️ 2025-10-27

🚀 Features

  • 🤖🐐 New Otto capabilities:
    • Otto now generates commit messages using Conventional Commits with type prefixes, scopes, and breaking change indicators.
    • Otto agent system updated with improved context gathering that traces symbols and enhanced tool descriptions for error-prone operations like RegEx escaping.
    • Otto displays inline file diffs consistently across all AI model providers.
  • SQLFrame DataFrame support added across BigQuery, Snowflake, Databricks, and DuckDB Data Planes with Spark-like API and lazy evaluation.
  • Project defaults now support Flow runner size configurations using RegEx patterns to match Flow names.
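
  As a hedged sketch, a Project default targeting Flows by name might look roughly like this (every field name here is an illustrative assumption, not the documented schema):

    ascend_project.yaml
    project:
      defaults:
        - flow_name_regex: "nightly_.*"   # assumed: RegEx matched against Flow names
          runner:
            size: large                   # assumed: size preset applied to matching Flows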

🌟 Improvements​

  • Added syntax highlighting for .ascendignore and other ignore files.
  • Snowflake authentication updated to use key pair authentication instead of deprecated password method.
  • Increased MySQL connection pool size to handle higher concurrent loads, reducing timeout errors for Instance-to-cloud gRPC communications.
  • Workspace pane automatically closes docked views when no tabs are present.
  • Deployment names in merge menus now match labels used elsewhere in the UI.
  • Tab loading logic enhanced with AbortController support for better resource management and error handling.

🛠️ Bug fixes

  • Otto fixes:
    • Fixed patch parser to detect and correct malformed patch endings with extra + characters.
    • Fixed Otto Chat title generation failing silently.
    • Fixed commit message generation wrapping text in triple backticks or including commentary. Now outputs clean text without formatting artifacts.
  • Fixed empty data tables being dropped during maintenance, breaking downstream Components. Tables referenced by committed partitions are now protected from deletion.
  • Fixed table creation race conditions causing Clustering Key has changed errors. Unified SQL generation with DDL locking ensures consistent behavior across Data Planes.
  • Fixed Run Flow button text not updating when builds became out-of-date.
  • Fixed delete dialogs remaining open after successful deletions.
  • Improved local storage error handling for corrupted data, quota exceeded errors, and other storage failures.

🗓️ 2025-10-20

🚀 Features

  • 🤖🐐 New Otto capabilities:
    • Otto communication style updated to remove filler phrases like "Certainly!" and "Of course!", with improved context gathering and tool descriptions for error-prone operations.
  • Projects now support .ascendignore files to exclude parts of your Project during development.
    • Profiles also support an ignore field for additional ignore patterns.
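
  For example, a Profile-level ignore list might look like the sketch below. The ignore field name comes from the note above; the surrounding Profile structure is an assumption:

    profiles/dev.yaml
    profile:
      ignore:             # extends the patterns in .ascendignore
        - "scratch/**"
        - "*.ipynb"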

🌟 Improvements​

  • Smart Schema support expanded:
    • MySQL, Oracle, PostgreSQL, Microsoft SQL Server, BigQuery, Snowflake, and SFTP Read Components now support Smart Schema.
    • Smart SQL Transforms, Smart Python Transforms, and Custom Python Transforms now support Smart Schema.

🛠️ Bug fixes

  • Fixed Component tests with severity: warn causing hard failures. Test severity parsing now handles all configuration patterns correctly (see the sketch after this list).
  • Fixed authorization checks hitting rate limits during rapid UI interactions. Authorization checks are now cached.
  • Fixed polling being lost when switching browser tabs. Polling now pauses and resumes based on tab visibility.
  • Fixed Project page crashing with Cannot read properties of undefined errors when refreshing. Page now handles empty states and removed unnecessary polling.
  • Fixed Workspace tours breaking when encountering auto-snoozed Workspaces. Tours now detect Workspace state and prompt users to start paused Workspaces.
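
  For reference, a test using warn severity might be declared roughly as in this sketch (only severity: warn comes from the fix above; the shape of the tests block is an assumption):

    component:
      tests:
        - name: no_null_ids    # illustrative test name
          severity: warn       # logs the failure as a warning instead of failing the run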

🗓️ 2025-10-13

🚀 Features

🌟 Improvements

🛠️ Bug fixes

  • Fixed concurrent DuckDB operations causing segmentation faults due to missing thread locks. Added locking for DDL operations, query execution, and batch insertions.
  • Fixed Smart Schema components with NULL partitioning versions failing during concurrent backfills with Invalid partitioning version: None errors.
  • Fixed DuckLake Connections on S3 failing after 1 hour due to expired AWS credentials. DuckLake now automatically refreshes credentials every hour.
  • Fixed multiple UI memory leaks from Monaco editors, polling handlers, and window event listeners.
  • Fixed paused Workspaces timing out when attempting to load Git status, causing 503 errors. Paused Workspaces now display a message instead of attempting to load Git status.

🗓️ 2025-10-06

🚀 Features

🌟 Improvements​

  • DuckDB now defaults to max_combined_sql_statements=1 when using DuckLake, which improves performance and resource utilization by preventing memory issues and inefficient CPU usage.
  • Build performance improved by up to 67% for large projects through global Jinja2 template caching, file I/O optimization, and threading improvements.
  • Otto Bedrock integration now supports prompt caching for Anthropic models, reducing costs and latency.
  • Flow runner resource allocation now supports configurable size overrides for CPU, memory, and disk allocation per runner.
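
  A hedged sketch of such a size override (the field names and units are assumptions for illustration):

    flow_runner:
      size_override:
        cpu: "2"           # assumed: vCPU allocation
        memory: 8Gi        # assumed: memory allocation
        disk: 20Gi         # assumed: disk allocation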

🛠️ Bug fixes

  • Fixed MCP tool call responses failing due to serialization issues with complex data types. JSON serialization now handles all data types correctly.
  • Fixed segmentation fault in DuckLake by implementing safer check for partitioning version column existence.

🗓️ 2025-09-29

🚀 Features

  • 🤖🐐 New Otto capabilities:
    • Added a guided tour for first-time users.

🌟 Improvements​

  • File upload memory management improved with single-threaded parquet conversion to prevent OOM crashes.
  • DuckDB DDL operations now properly locked to prevent race conditions when multiple operations run with task-threads > 1.
  • Documentation site redesigned with updated visual styling.

🛠️ Bug fixes

  • Fixed BigQuery Read Components with column casting throwing syntax errors due to missing fully qualified table names.
  • Fixed BigQuery Flows with concurrent components writing to the same table failing with Transaction is aborted due to concurrent update errors. Added automatic retry logic with exponential backoff.
  • Fixed COALESCE expressions in partitioned Data Plane operations causing SQL type inference issues. All COALESCE expressions now use explicit CAST(NULL AS {type}) syntax.
  • Fixed adding columns to existing Databricks tables failing with [PARSE_SYNTAX_ERROR] errors. Column addition now works correctly across all Data Planes.

🗓️ 2025-09-22

🚀 Features

  • 🤖🐐 New Otto capabilities:
    • Otto now supports LLMs across providers with Instance-level model management settings. Configure under AI & Models.
    • Otto now works with Jinja macros, parameters, Automations, Tasks, and SSH Tunnels with updated rules and reduced token usage.
  • Smart schema evolution is now the default schema change strategy for Read Components with object storage and Local File Connections on all Data Planes. Data partitions store their own schemas and reconcile type differences without copying data on schema changes.
    • Existing Components continue to use the full strategy; new Components default to the smart strategy (see the sketch after this list).
  • Workspace auto-snooze automatically pauses inactive Workspaces after a configurable timeout (5-120 minutes), with detection of ongoing Flow runs to avoid interruption.
    • Auto-snooze applies to all new Workspaces with a default timeout of 10 minutes.
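
  For illustration, opting a Component into one strategy or the other might look like the sketch below. The strategy names smart and full come from the notes above; the surrounding field structure is an assumption:

    component:
      read:
        schema_change:
          strategy: smart    # or full to keep the previous behavior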

🌟 Improvements​

  • Focus on a Component in the build panel to jump to its location in the Flow Graph, or on a Flow to jump to its location in the Super Graph.
  • Data Plane Connections now appear in the Flow Connections list. The build info panel displays the number of Connections used in a Flow.
  • Lists throughout the Ascend Instance now sort alphabetically, including Deployment, Project, Git branch, Profile, and Environment selectors.

🛠️ Bug fixes

  • PostgreSQL Read Components now preserve array and JSON column types instead of converting to VARCHAR. Added an arrays_as_json parameter for non-mappable array data (see the sketch after this list).
  • Fixed incremental merger resets failing with malformed SQL syntax errors. Qualified table names now generate proper DROP statements across Data Planes.
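
  A sketch of the new parameter in context (only arrays_as_json comes from the note above; the surrounding Read Component structure is illustrative):

    component:
      read:
        connection: my_postgres      # illustrative Connection name
        postgres:
          table: events
          arrays_as_json: true       # serialize non-mappable array columns as JSON strings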

🗓️ 2025-09-15

⚠️ Breaking changes​

  • GCS and ABFS Connections now use more stable metadata fields (md5Hash, crc32c for GCS; content_md5, etag for ABFS) for fingerprinting to prevent unnecessary re-ingests when storage tiers change. Requires re-ingesting existing data.

🌟 Improvements​

  • Deployment run pages now link directly to individual Flows instead of the Super Graph.
  • Flex Code Flow creation forms streamlined to include only essential fields (name, description, Data Plane Connection, parameters, and defaults).
  • BigQuery Data Plane replaced custom UDF logic with native IF functions for better performance.
  • DuckLake on S3 now uses httpfs as the default Connection method.
  • Added local file caching support for DuckLake with S3, GCS, and Azure Blob Storage.

🛠️ Bug fixes

  • Fixed DuckDB SAFE_STRING macro creation failing in DuckLake environments with expression depth limit errors. Added Python UDF fallback.
  • Fixed long-running DuckLake jobs losing connection to metadata database.
  • Fixed DuckDB-based ingestion not working with multiple PostgreSQL Components simultaneously.

🗓️ 2025-09-08

🚀 Features

  • Added drag-and-drop file upload from computer to file tree.

🌟 Improvements​

  • Moved Instance status indicator to the right side of the header, next to the user menu.

🛠️ Bug fixes

  • Fixed ascend view connection sample command failing with pyarrow.lib.ArrowInvalid error when PostgreSQL tables contain range type columns. Command now converts range types to string representations.
  • Fixed incremental reads not honoring start values during backfill phases.

🗓️ 2025-09-01

🚀 Features

  • MySQL and PostgreSQL databases hosted by AWS can now authenticate using AWS IAM roles.
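
  As a hedged sketch, IAM-based authentication on such a Connection might look roughly like this (the field names are assumptions; only the IAM-role capability itself comes from the note above):

    connection:
      postgres:
        host: mydb.cluster-example.us-east-1.rds.amazonaws.com   # illustrative host
        port: 5432
        auth:
          type: aws_iam    # assumed: authenticate with the attached IAM role instead of a password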

🛠️ Bug fixes

  • Fixed Databricks SQL connector compatibility issue due to renamed API. System now handles both API versions with fallback logic.
  • Fixed partition_template functionality in blob Write Components. Datetime templates are now properly parsed and interpolated.
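
  For context, a blob Write Component using a datetime partition template might look roughly like the sketch below (only partition_template comes from the fix above; the rest loosely mirrors the write example in the 2025-07-07 entry):

    component:
      write:
        connection: write_s3
        s3:
          path: /exports/
          partition_template: "dt=%Y-%m-%d/"   # datetime tokens now parse and interpolate correctly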

🗓️ 2025-08-25

🚀 Features

  • πŸ€–πŸ New Otto capabilities: Otto can now be used as an Action in Automations to respond to Flow events, analyze data patterns, and execute actions across multiple platforms:

    automations/otto-notifications.yaml
    automation:
      name: otto-notifications
      enabled: true
      triggers:
        events:
          - sql_filter: json_extract_string(event, '$.data.flow') = 'transform-demo'
            types:
              - FlowRunError
              - FlowRunSuccess
      actions:
        - type: run_otto
          name: run-otto
          config:
            prompt: >-
              # Instructions

              You are being invoked at the end of a flow run. Your job is to assess
              changes in flow run behavior and send a Slack message to keep other
              users updated. You cannot ask for help or clarification; you must do
              this end-to-end on your own, based on your own judgement. Your last
              action should always be to post a message to Slack.

              ## Flow run analysis rules

              Analyze flow success / failure by fetching the maximum number of flow runs.

              1) If the flow run was a success and the previous flow run was also a
              success, send a funny message congratulating the team on another smooth run.

              2) If the flow run was a success and the previous flow run was not, send
              an exciting and congratulatory message celebrating that the flow is now
              fixed, and congratulating whichever user did the fixing (check git history).

              3) If the flow run was a failure and the previous flow run was also a
              failure, send a sad message imploring somebody to fix it.

              4) If the flow run was a failure and the previous flow run was not, send
              a big alert message. You should try to identify who broke the flow, and
              even @ mention them if you can. You should also try to analyze the
              changes and suggest fixes if possible.

              Additionally, you should make heavy use of git history and file reading
              calls to help track down details that may be relevant.

              ## Slack message and notification

              At the end of your analysis, you must post a message to Slack in the
              appropriate channel.

              Your Slack post message should:
              - include a link to the flow run:
                https://[runtime.link_url]/flows/[flow_name]/[run_name]
              - be formatted for Slack rendering, which is different from markdown
              - when @ mentioning users, look up their profile so you can reference
                their actual id

🌟 Improvements​

  • Application Components can now access Flow parameters during build time, enabling conditional Component generation based on Profile and Flow parameter values.
  • Enhanced Automation configuration UI with improved layout on small screens and delete buttons on Sensor, Event, and Action cards.

🗓️ 2025-08-18

🚀 Features

⚠️ Breaking changes​

  • Fixed Oracle connection configuration validation to prevent runtime failures. Added model-level validation to ensure mutually exclusive Oracle Connection parameters (dsn, database, service_name) are handled correctly.
    • Users with existing Oracle connections using both database and service_name fields must choose one Connection method.

🛠️ Bug fixes

  • Fixed landing table duplicate data issue in incremental persist. Added de-duplication logic to prevent UPDATE/MERGE must match at most one source row for each target row errors.

🗓️ 2025-08-11

🚀 Features

  • Ascend now supports DuckDB via DuckLake as a Data Plane. Follow our guide to get started.

🌟 Improvements​

  • Database incremental reads now process up to 50% faster through concurrent download and upload operations using threading.

🛠️ Bug fixes

  • Fixed Failed to list resources errors when browsing Connection data containing variable placeholders in file paths.
  • Fixed incremental merge queries failing with syntax errors when column names contained special characters or reserved keywords. Column names are now properly quoted.
  • Fixed job failures and schema inconsistencies when running Snowflake DML operations concurrently or with metadata drift. Added protection from concurrency limits and error detection for schema inconsistencies.

🗓️ 2025-08-04

🌟 Improvements​

  • Settings interface updated with unified design and consistent card styling.

🛠️ Bug fixes

  • Fixed PostgreSQL Read Component incremental strategy memory issue where Components were loading entire tables instead of empty tables during initial runs. Predicates from min/max queries are now properly applied to data fetching queries.

🗓️ 2025-07-28

🚀 Features

🌟 Improvements​

  • Added light/dark mode toggle in General settings.

🛠️ Bug fixes

  • Fixed Incremental Read Components with full-refresh flag returning empty results. Full-refresh now properly resets both output and metadata tables.

🗓️ 2025-07-21

🚀 Features

  • 🤖🐐 New Otto capabilities:
    • Otto now provides code completion suggestions in the UI, powered by context from rules and files.
    • Otto can now be configured with Automations to send activity summary emails. See our how-to guide for instructions.
  • Files panel now supports drag-and-drop for existing files.
    • Hold option/alt key while dragging to create copies.
  • Files panel right-click menu now includes Duplicate option.

⚠️ Breaking changes​

  • Default Write Component strategy changed from Smart (partitioned) to Simple. Smart Components are more scalable but more complex for simple datasets.

🛠️ Bug fixes

  • Fixed SSH public key input box appearing for all secret fields instead of SSH-specific ones only.
  • Fixed date indicators in Flow runs timeline view being unstable due to CSS issues.
  • Fixed BigQuery Transform queries failing when referencing newly created columns in the same query. Column references are now automatically moved to a separate CTE.

🗓️ 2025-07-07

🚀 Features

  • 🤖🐐 New Otto capabilities:

    • Otto now supports custom agents for creating specialized behaviors tailored to your workflows. Define agents in markdown with custom traits and tool access.

      • Tool inclusion supports wildcards: [category].* for all tools of a specific type, or * for everything. Example:
      otto/agents/custom_agent.md
      ---
      otto:
        agent:
          name: Standup Reporter
          mcp_servers:
            - slack
          model: gpt-4o
          model_settings:
            temperature: 0.1
          tools:
            - "*"
      ---

      # Standup Reporter

      You are a professional activity reporter specialized in summarizing Ascend platform work.

      Your job is to create comprehensive weekly summaries of my Ascend activities and share them with my manager via Slack.
      ...
    • Otto introduces custom rules for adding special instructions for chat interactions.

    • Otto now connects to external MCP servers via configuration files.

    • Otto's configuration now uses an otto.yaml file in a dedicated otto directory for centralized agent and MCP server configuration.

🌟 Improvements​

  • Write Components for blob storage (ABFS, GCS, S3) now support configurable chunk size using the part_file_rows field to optimize performance and avoid OOM errors:
    write_custom_chunk.yaml
    component:
      write:
        connection: write_s3
        input:
          name: my_component
          flow: my_flow
        s3:
          path: /some_other_dir/
          formatter: json
          part_file_rows: 1000

🛠️ Bug fixes

  • Fixed database Read Components (PostgreSQL, MySQL, Microsoft SQL Server, Oracle) converting SQL null values to the literal string 'None' when loading query results into PyArrow arrays. Null values are now properly preserved.
    • This change affects all pipelines using database Read Components across all Data Planes.
    • If you previously added cleanup steps to handle 'None' string values, you can remove them.

🗓️ 2025-06-23

🚀 Features

  • 🤖🐐 New Otto capabilities: Otto now renders tool calls, showing which tools are called, their inputs, and outputs.

🌟 Improvements​

  • Write Components now support improved performance for large uploads through chunked blob storage:
    • Paths ending with a filename (e.g., .parquet) → single file output (unchanged)
    • Paths ending with a trailing slash (/) → chunked output with part_<chunkid> format
    • Partition writes now always produce chunked output files
  • Flows can now be run from Deployment history pages when on the latest Project build.

🛠️ Bug fixes

  • Fixed action failures in Automations interrupting the entire workflow. Individual action failures are now isolated.

🗓️ 2025-06-16

🚀 Features

  • Private Python package repositories now supported for AWS CodeArtifact and GCP Artifact Registry via pyproject.toml files. See how-to guide for setup instructions.

🌟 Improvements​

  • Instance header now displays an error when the Instance backend is unhealthy.
  • Improved error messages for OOM process terminations:

    Flow runner process for my_flow/fr-01977f3c-845b-72a1-b225-4327835f8434 exited with code -9.
    Exit code -9 usually means the kernel's Out-Of-Memory (OOM) killer terminated the process after it exceeded its memory limit.
    Consider increasing the runner size or optimizing the workflow's memory usage.

🛠️ Bug fixes

  • Fixed build history calendar days outside the current month not being selectable.

🗓️ 2025-06-09

🛠️ Bug fixes

  • Fixed Workspace settings not clearing old run history when switching Projects.
  • Fixed status bar on Explore tab showing outdated information. Now updates in real-time.
  • Fixed Workspace settings URLs not navigating to correct pages.
  • Fixed Workspace or Deployment size settings not being respected.
  • Improved "Flow runner not found" error messages to include detailed pod status information (OOM, timeout, unexpected states, API/connection errors).
  • Fixed Flow runners staying active when idle. Now automatically shut down when not in use.
  • Fixed Git operations hanging indefinitely when repositories became unresponsive. Now timeout automatically.
  • Fixed Application parameters not working with Incremental logic in SQL transforms.
  • Fixed partition filter and partition value analyzers interfering with each other during concurrent backfills. Now run independently.
  • Fixed text parser crashing when given string file paths in Arrow and pandas reads.

🗓️ 2025-06-02

🚀 Features

  • 🤖🐐 Otto can now provide SQL linting information via SQLFluff.
  • Command-K search now supports navigation to settings pages.

🌟 Improvements​

  • Improved UI rendering of errors and warnings.
  • Added Polars DataFrame support as input/output format in Python transforms.
  • Parquet processing now normalizes column names to avoid case conflicts.
  • Automation sensors now support timezone configuration (see the sketch after this list).
  • Automation failed status now shown when there is no associated Flow run.
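
  A sketch of a timezone-aware sensor (the automation/triggers shape loosely follows the Automation example in the 2025-08-25 entry; the sensor fields themselves are assumptions):

    automation:
      triggers:
        sensors:
          - type: schedule               # assumed sensor type
            cron: "0 6 * * *"
            timezone: America/New_York   # evaluate the schedule in this timezone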

🛠️ Bug fixes

  • Fixed column case sensitivity handling for Python Read Components on Snowflake.
  • Fixed Profile name, Project UUID, and path not being persisted into build info.
  • Fixed Transformations with upstream Incremental Read Components not resolving to correct table name.
  • Improved Unicode and emoji handling.
  • Fixed Component/Run state carrying over from current builds into historical ones.
  • Added timezone information to timestamps across the UI.
  • Fixed record columns not being resizable in Explore view.

🗓️ 2025-05-26

🚀 Features

  • Added syntax highlighting for INI, TOML, and SQLFluff files.
  • AWS Managed Vault now supported as Instance or Environment Vault.
  • πŸ€–πŸ New Otto capabilities:
    • Otto can now validate YAML files and test Connections.
    • Otto automatically configures tools based on environment context (Workspace vs. Deployment).

🌟 Improvements​

  • Streamlined Python interfaces in Simple Applications for easier migration from regular Flows.
  • Added copy button to Connection test errors.

🛠️ Bug fixes

  • Fixed multi-line comment parsing in SQL and SQL Jinja files.
  • Fixed compatibility issues between .yaml and .yml file extensions.
  • Optimized error stack traces to be more concise.
  • Fixed PostgreSQL Connections SSL verification and empty port configuration handling.
  • Fixed clickable error links in Component cards within Deployments.
  • Fixed arrow keys not working when renaming files.

🗓️ 2025-05-19

🚀 Features

  • 🤖🐐 New Otto capabilities:
    • Project file health checks including YAML linting and fixes
    • Connection testing and issue resolution
    • Connection listing and exploration
    • Flow and Component execution, monitoring, and troubleshooting
  • Components can now have non-data graph dependencies on any other Component (see the sketch after this list).
  • Added Git status indicators to file browser and tab bar.
  • Retry logic now configurable for all Component types.
  • Added pause-all functionality for Deployment Automations.
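
  For illustration, a non-data dependency might be declared roughly like this (the dependencies field name is an assumption; only the capability itself comes from the note above):

    component:
      dependencies:
        - other_component   # build-order dependency with no data flowing between the two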

🌟 Improvements​

  • Runs table column widths now remember user preferences.
  • Improved Automation form workflow.
  • Component build errors now provide more specific information about which Component failed.

🛠️ Bug fixes

  • Fixed Databricks Connections not reconnecting when encountering "Database Invalid session" errors.
  • Fixed file refresh not updating cached and open files.
  • Fixed repo save button state not reflecting current state correctly.
  • Improved branch listing operation reliability.
  • Fixed zooming on individual nodes in expanded Application Component graph tab.
  • Fixed Databricks Connections catalog reference handling.
  • Fixed UI state consistency issues during rapid save, build, and run operations.
  • Fixed MySQL Connections with SSL=True throwing exceptions.
  • Fixed table references not using full names in merge operations.
  • Improved detection and handling of build failures caused by OOM conditions.
  • Fixed empty files causing Project builds to fail.

Introducing Agentic Data Engineering: meet Otto!​

Ascend is the industry's first Agentic Data Engineering platform, empowering teams to build and manage data pipelines faster, more safely, and at scale. With Ascend's platform, engineers benefit from the assistance of context-aware AI agents that deeply understand their data pipelines.

Meet Otto, the intelligent data engineering agent designed to eliminate repetitive tasks, accelerate innovation, and enable faster development cycles. With Otto, you can:

  • Understand data lineage across your entire pipeline with column-level tracing
  • Convert Transforms between frameworks with automatic code migration
  • Implement robust data quality tests with intelligent recommendations

Discover these capabilities and many more!

➑️ Ready to agentify your data engineering experience? Schedule a demo to see Ascend in action.

Introducing Ascend's third generation platform (Gen3)​

☁️ Gen3 is a ground-up rebuild of the Ascend platform, designed to give you more control, greater scalability, and deeper visibility across your data workflows. It's everything you already love about Ascend – now faster, more flexible, and more extensible.

  • Ascend's new Intelligence Core combines metadata, automation, and AI in a layered architecture, empowering all teams to build pipelines faster and significantly compress processing times.

  • Git-native workflows bring version control, collaboration, and CI/CD alignment to all teams through our Flex Code architecture – empowering both low-code users and developers to contribute.

  • Observability features expose detailed pipeline metadata so teams have deeper visibility into their system to diagnose problems quickly, reduce manual investigation, and optimize system behavior.

  • Modular architecture empowers data and analytics teams to manage increasingly large and complex pipelines with improved performance and maintainability.

  • Standardized plugins and extension points enable data platform teams to customize and automate workflows more easily.

➑️ Ready to explore? Join the Gen3 public preview to get early access.

Ascend Gen3 Demo

Systems & architecture​

Explore the foundational improvements in our system's architecture, designed to enhance collaboration, resource management, and cloud efficiency.

  • Version control-first design – Collaborate and track changes with Git-native workflows
  • Project-based organization – Organize and manage resources with intuitive, Project-centric workflows
  • Optimized cloud footprint – Reduce infrastructure usage with centralized UI and a lightweight, scalable backend
  • Event-driven core – Trigger custom workflows using system-generated events
  • Native Git integration – Automate CI/CD pipelines with built-in support for your Git provider

Project & resource management​

Effortlessly create, share, configure, and deploy Projects with streamlined processes, allowing you to spend less time on administration and more on engineering innovation.

  • Project structure – Organize and manage your data Projects with improved structure and clarity
  • Environments – Configure and maintain development, staging, and production environments with software development best practices
  • Parameterized everything – Reuse and adapt pipelines with flexible, comprehensive parameterization
  • Deployments – Roll out pipelines consistently across environments with simplified deployment workflows

Industry-leading security​

Protect your data and resources with enterprise-grade security features, ensuring comprehensive access control and secrets management across your organization.

  • Enterprise-grade authentication – Secure Instances, Projects, and pipelines with OpenID Connect (OIDC)
  • Centralized vault system – Manage secrets, credentials, and sensitive configurations securely across your entire platform

Builder experience​

Discover how our builder experience enhancements simplify Component creation and improve user interaction with a modern interface.

  • Simplified Component spec - Write Components with less boilerplate and more intuitive syntax
  • Components:
    • Partition strategies - Flexible data partitioning for optimal performance
    • Data & job deduplication - Intelligent handling of duplicate data and operations
    • Incremental Components - Process only new or changed data efficiently
    • Views - Create and manage virtual tables efficiently
    • Generic tasks - Support for versatile task types, including SQL and Python scripts for complex operations
  • Data Applications - Build complex data transformations from simpler reusable building blocks and templates
  • Testing support - Test Components easily with built-in sample datasets
  • Modern interface - Navigate an intuitive UI designed for improved productivity
  • Dark mode - Switch between light and dark themes with enhanced visual comfort and accessibility
  • Navigation - Access Projects, Components, and resources through streamlined menus

Navigation and building experience

Data integration​

Our data integration improvements ensure seamless connectivity and performance across major platforms, enhancing your data processing capabilities.

Performance improvements across all Data Planes:

  • 70% reduction in infrastructure costs*
  • 4x faster ingestion speed*
  • 2x faster runtime execution*
  • 10x increase in concurrent processing*

Data Planes: Enhanced connectivity and performance across major platforms.​

Snowflake - Full platform integration including:

  • SQL support
  • Snowpark for advanced data processing
  • Complete access to Snowflake Cortex capabilities

BigQuery - Comprehensive SQL support including:

  • BigQuery SQL integration
  • Built-in support for BigQuery AI features

Databricks - Complete lakehouse integration featuring:

  • SQL and PySpark support
  • Full access to AI/ML models in data pipelines
  • Support for both SQL warehouses and clusters
  • Unified compute management

* Performance metrics based on comparative analysis between Gen2 and Gen3 platforms

Data quality​

Enhance your data quality management with automated checks and customizable validation rules, ensuring data integrity across your Projects.

  • Automated quality gates - Validate data within Components, including Read and Write Components
  • Reusable rule library - Create and share standardized data quality rules across your organization
  • Python-based validation - Write custom data quality checks using familiar Python syntax

Flow management​

Optimize your data Flows with advanced planning and execution capabilities, supporting high-frequency and concurrent processing.

  • Gen3 Flow planner & optimizer – Improve pipeline performance with intelligent planning and execution
  • Flow runs – Manage and monitor individual pipeline executions with enhanced controls
  • Concurrent & high-frequency Flow runs – Execute Flows in parallel and at higher frequencies
  • Semantic partition protection – Preserve computed results across code changes to avoid unnecessary reprocessing
  • Optional Smart backfills – Backfill data flexibly with advanced control over reprocessing.

Automation​

Leverage our Automation features to create dynamic workflows triggered by real-time events, enhancing operational efficiency.

  • Event-driven extensibility - Automate workflows dynamically based on real-time platform events and triggers
  • Customizable event triggers - Create custom Automation triggers including sensors and events

Observability​

Gain comprehensive insights into your data operations with real-time and historical observability, ensuring full transparency and control.

  • Unified metadata stream & repository - Centralize and track metadata across all pipelines
  • Real-time & historical monitoring - Access metadata on pipeline runs and performance history, including:
    • Live monitoring of active pipeline runs
    • Full execution history with smart commit summaries
    • Performance analytics and trend analysis
    • Complete troubleshooting visibility

AI-powered assistant (Otto 🐐)​

Experience the power of AI with Otto, our assistant that helps you create, optimize, and document your data pipelines effortlessly.

  • Component creation & editing - Generate new pipeline Components or modify existing ones with natural language
  • Smart updates & recommendations - Receive intelligent suggestions for pipeline optimization, performance improvements, and descriptive commit messages
  • Automated documentation - Automatically generate and maintain comprehensive documentation for pipelines and Components