Version: 3.0.0

Python Transform input formats

Data for Python Transform Components can be provided in any of the following formats:

  • Ibis table (default) - A portable Python dataframe library that provides a consistent API across different backends
  • pandas DataFrame (input_data_format="pandas") - The popular Python data analysis and manipulation library with DataFrame structures
  • Python Dictionary (input_data_format="dict") - Standard Python dictionary format for structured data
  • PyArrow (input_data_format="pyarrow") - Apache Arrow's Python library for columnar in-memory analytics
  • DuckDB PyRelation - DuckDB's native Python relation object for efficient in-memory analytical processing

To use a format other than the default (Ibis), pass the input_data_format parameter to the @transform decorator, for example input_data_format="pandas".

These formats provide flexibility in how you work with data in your Python Components, allowing you to choose the library that best fits your use case and performance requirements.
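
As a rough illustration of how the input_data_format parameter is passed, the following sketch requests the dictionary format. It makes two assumptions that this page does not confirm: the transform decorator defined here is a hypothetical stand-in written only so the example runs on its own (the platform's real @transform decorator likely takes further arguments, such as the inputs to read), and the dictionary is assumed to map column names to equal-length lists of values. The names add_total, quantity, unit_price, and total are also illustrative.

```python
from typing import Any, Callable, Dict, List


def transform(input_data_format: str = "ibis") -> Callable:
    """Hypothetical stand-in for the platform's @transform decorator.

    It only records the requested format on the function; the real
    decorator is provided by the platform and does much more.
    """
    def wrap(func: Callable) -> Callable:
        func.input_data_format = input_data_format
        return func
    return wrap


@transform(input_data_format="dict")
def add_total(orders: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
    # Assumption: the dict maps column names to equal-length lists of values.
    totals = [q * p for q, p in zip(orders["quantity"], orders["unit_price"])]
    # Return a new dict so the input is left unmodified.
    return {**orders, "total": totals}


sample = {"quantity": [2, 1, 5], "unit_price": [3.0, 10.0, 0.5]}
result = add_total(sample)
print(result["total"])  # [6.0, 10.0, 2.5]
```

The body of the function depends on the format requested: with input_data_format="pandas" the same logic would operate on DataFrame columns, and with the default it would use Ibis table expressions.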

Specialized Components
  • Snowpark components require a Snowflake Data Plane and use the snowflake.snowpark.DataFrame input type
  • PySpark components require a Databricks Data Plane and use the pyspark.sql.DataFrame input type

Next steps

Ready to build your Python Transform Components? Check out these how-to guides: