Data Source Types

Classification and format configuration for data sources in DataPancake.

Overview

DataPancake supports two primary data source types: Semi-Structured and Structured. The data source type determines what configuration options are available, how data is processed, and what materialization capabilities can be used. Semi-structured data sources require additional format and column type specifications, while structured data sources connect directly to relational database objects.


Semi-Structured Data Sources

Semi-structured data sources connect to Snowflake objects containing JSON, Avro, Parquet, ORC, or XML data stored in VARIANT or STRING columns. These data sources enable DataPancake's full schema discovery and materialization capabilities.

Supported Object Types

Semi-structured data sources can be created from:

  • Dynamic Tables - Snowflake dynamic tables. VARIANT columns only.

  • External Tables - External tables pointing to cloud storage (S3, Azure Blob, etc.). JSON format only.

  • Iceberg Tables - Iceberg tables with VARIANT or STRING columns. JSON format only (XML is not supported, including in STRING columns). The STRING column type is intended primarily for Iceberg use cases.

  • Materialized Views - Materialized views containing VARIANT columns

  • Tables - Standard Snowflake tables with VARIANT columns

  • Views - Views that expose VARIANT columns

Format Types

  • JSON

  • Avro

  • Parquet

  • ORC

  • XML

Column Data Types

  • VARIANT - Native Snowflake semi-structured data type

  • STRING - Text column containing JSON strings. For Iceberg tables, STRING columns support JSON only (not XML).
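
The object type, format, and column type restrictions listed above can be summarized as a small validation check. The following is a minimal sketch in Python, not DataPancake's implementation: the class, function, and constant names are hypothetical, and it encodes only the restrictions stated explicitly on this page (dynamic tables accept VARIANT columns only; external and Iceberg tables accept JSON only).

```python
from dataclasses import dataclass

# Values taken from the lists on this page; the identifiers are illustrative only.
OBJECT_TYPES = {
    "DYNAMIC_TABLE", "EXTERNAL_TABLE", "ICEBERG_TABLE",
    "MATERIALIZED_VIEW", "TABLE", "VIEW",
}
FORMAT_TYPES = {"JSON", "AVRO", "PARQUET", "ORC", "XML"}
COLUMN_TYPES = {"VARIANT", "STRING"}


@dataclass(frozen=True)
class SemiStructuredSource:
    """Hypothetical description of a semi-structured data source."""
    object_type: str
    format_type: str
    column_type: str


def validation_errors(src: SemiStructuredSource) -> list[str]:
    """Return human-readable problems; an empty list means no rule was violated."""
    errors = []
    if src.object_type not in OBJECT_TYPES:
        errors.append(f"unsupported object type: {src.object_type}")
    if src.format_type not in FORMAT_TYPES:
        errors.append(f"unsupported format type: {src.format_type}")
    if src.column_type not in COLUMN_TYPES:
        errors.append(f"unsupported column type: {src.column_type}")
    # Dynamic tables: VARIANT columns only.
    if src.object_type == "DYNAMIC_TABLE" and src.column_type != "VARIANT":
        errors.append("dynamic tables support VARIANT columns only")
    # External and Iceberg tables: JSON format only.
    if src.object_type in {"EXTERNAL_TABLE", "ICEBERG_TABLE"} and src.format_type != "JSON":
        errors.append("external and Iceberg tables support JSON format only")
    return errors
```

For example, `validation_errors(SemiStructuredSource("ICEBERG_TABLE", "XML", "STRING"))` returns a one-item list reporting the JSON-only restriction, while a standard table with XML in a VARIANT column returns an empty list.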


Structured Data Sources

Structured data sources connect to standard relational database objects (tables, views, materialized views) without requiring format or column type specifications. These data sources provide basic schema discovery capabilities.

Supported Object Types

  • Materialized Views - Materialized views

  • Tables - Standard Snowflake tables

  • Views - Standard views

Limitations

Structured data sources support only the following two features:

  • Data Dictionary Builder

  • Semantic Model Generator
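
As an illustration of this limitation, the sketch below gates a requested feature for a structured data source. It is a hedged example only: the function and constant names are hypothetical and do not correspond to any DataPancake API, and the feature names are copied from the list above.

```python
# Feature names as listed on this page; the identifiers are illustrative only.
STRUCTURED_FEATURES = frozenset({
    "Data Dictionary Builder",
    "Semantic Model Generator",
})


def structured_source_supports(feature: str) -> bool:
    """Return True if the named feature is listed for structured data sources."""
    return feature in STRUCTURED_FEATURES
```

A caller could use `structured_source_supports("Semantic Model Generator")` to decide whether to offer that feature for a structured source, and hide anything else, such as materialization.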


For information on adding data sources, see Adding Data Sources. For source object configuration, see Source Object Settings.
