Attribute Types

Overview of the three attribute types—Discovered (from scans), Schema (from samples), and Virtual (user-created)—and when to use each.

DataPancake categorizes attributes into three types based on their origin:


Discovered Attributes

Definition: Attributes automatically discovered during the scanning process.

Characteristics:

  • Created when ATTRIBUTE_CREATE_TYPE is set to 'Discover' in the scan configuration

  • Represent actual fields found in your source data

  • Cannot be deleted (only marked inactive)

  • Form the foundation of your schema discovery

  • All 7 polymorphic versions are proactively created when the attribute is first discovered

Use Cases:

  • Standard schema discovery workflows

  • When you want to discover all attributes from actual data

Creation Process:

1. Created during scan process

2. Attribute record created in datasource_attribute table

3. All 7 polymorphic versions proactively created in datasource_attribute_polymorphic_version table

4. Only the version(s) matching the discovered data type are activated (status = 'active')

5. Non-matching versions are created but remain inactive (status = 'inactive')

6. Initial status dates set for activated versions
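
The creation steps above can be pictured with a small in-memory model. This is only a sketch under stated assumptions, not DataPancake's implementation: the page says seven polymorphic versions exist but does not name them, so the type names type_1 to type_7, the class names, and the function create_discovered_attribute are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Set

# Hypothetical placeholders: the source says there are 7 polymorphic
# versions but does not list their names.
POLYMORPHIC_TYPES = tuple(f"type_{i}" for i in range(1, 8))


@dataclass
class PolymorphicVersion:
    data_type: str
    status: str = "inactive"               # non-matching versions stay inactive
    status_date: Optional[datetime] = None


@dataclass
class DiscoveredAttribute:
    name: str
    versions: List[PolymorphicVersion] = field(default_factory=list)


def create_discovered_attribute(name: str, discovered_types: Set[str]) -> DiscoveredAttribute:
    """Create all 7 versions and activate only those matching the scanned data."""
    unknown = set(discovered_types) - set(POLYMORPHIC_TYPES)
    if unknown:
        raise ValueError(f"Unknown data types: {sorted(unknown)}")

    now = datetime.now(timezone.utc)
    versions = []
    for data_type in POLYMORPHIC_TYPES:
        if data_type in discovered_types:
            # Matching version: activate it and set its initial status date.
            versions.append(PolymorphicVersion(data_type, "active", now))
        else:
            versions.append(PolymorphicVersion(data_type))
    return DiscoveredAttribute(name, versions)


attr = create_discovered_attribute("order_total", {"type_2"})
active_types = [v.data_type for v in attr.versions if v.status == "active"]
```

In this model every attribute always carries seven version records, so a later scan that finds a new data type for the same field only needs to flip an existing version to active.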


Schema Attributes

Definition: Attributes created from a schema sample rather than full data scanning.

Characteristics:

  • Created when ATTRIBUTE_CREATE_TYPE is set to 'Schema' in the scan configuration

  • Based on a schema sample provided during data source creation

  • Useful for rapid prototyping without scanning full datasets

  • Can be updated when full scans are performed

  • All 7 polymorphic versions are proactively created when the attribute is first created

Use Cases:

  • Quick onboarding with schema samples

  • Prototyping pipelines before full data is available

  • Testing configurations with sample schemas

Creation Process:

1. Follows the same process as Discovered attributes, but is based on a schema sample rather than a full data scan

2. All 7 polymorphic versions are created upfront

3. Versions matching the schema sample are activated
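
To show how a sample alone can determine which versions to activate, the sketch below reads a small sample document and reports the kind of value found under each top-level key. It assumes the schema sample is JSON, which this page does not state, and the kind labels are generic JSON categories rather than DataPancake's seven polymorphic types. The function names json_kind and kinds_from_sample are hypothetical.

```python
import json
from typing import Any, Dict


def json_kind(value: Any) -> str:
    """Return a generic JSON kind label for a parsed value (illustrative only)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"Unsupported value: {value!r}")


def kinds_from_sample(sample_text: str) -> Dict[str, str]:
    """Map each top-level key of an assumed JSON sample to the kind of its value."""
    sample = json.loads(sample_text)
    return {key: json_kind(value) for key, value in sample.items()}


kinds = kinds_from_sample('{"id": 1, "name": "a", "tags": [], "active": true}')
```

Because only the sample is inspected, types that appear in the full dataset but not in the sample would stay inactive until a full scan updates the attribute.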


Virtual Attributes

Definition: User-created custom attributes that don't exist in the source data.

Characteristics:

  • Created manually by users through the UI or API

  • Defined with transformation expressions

  • Used for derived fields, calculated metrics, and semantic model components

  • Can be used for filters, metrics, facts, and dimensions in Cortex Analyst semantic models

  • Only one polymorphic version created (no polymorphism for virtual attributes)

  • Automatically set to INCLUDE_IN_CODE_GEN = TRUE

Use Cases:

  • Derived fields - Calculate values from existing attributes (e.g., full_name = first_name || ' ' || last_name)

  • Semantic model metrics - Create aggregations for Cortex Analyst (e.g., total_revenue = SUM(order_total))

  • Semantic model filters - Define filterable dimensions

  • Business logic - Add computed columns not in source data
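
The two example expressions above can be tried against a throwaway table. The sketch below uses Python's built-in sqlite3 module as a stand-in for Snowflake, because both support the || concatenation operator and SUM. The orders table, its rows, and the function evaluate_examples are made up for illustration.

```python
import sqlite3


def evaluate_examples():
    """Evaluate the two example expressions on a small, made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (first_name TEXT, last_name TEXT, order_total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [("Ada", "Lovelace", 120.0), ("Alan", "Turing", 80.5)],
    )

    # Derived field: computed per row from existing attributes.
    full_names = [
        row[0]
        for row in conn.execute(
            "SELECT first_name || ' ' || last_name AS full_name "
            "FROM orders ORDER BY rowid"
        )
    ]

    # Metric: an aggregation over all rows.
    total_revenue = conn.execute(
        "SELECT SUM(order_total) AS total_revenue FROM orders"
    ).fetchone()[0]

    conn.close()
    return full_names, total_revenue


full_names, total_revenue = evaluate_examples()
```

The derived field yields one value per row, while the metric collapses all rows into one value. That difference is why the two are listed as separate use cases.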

Creating Virtual Attributes:

  • Must specify: attribute name, source data type, Snowflake data type

  • Can optionally specify: parent array, transformation expression, W_QUESTION_CATEGORY (for semantic models)

  • Created via UI or API (sp_upsert_virtual_datasource_attribute)

Creation Process:

1. Created via UI or API (sp_upsert_virtual_datasource_attribute)

2. User specifies name, types, and transformation expression

3. Single polymorphic version created (no polymorphism for virtual attributes)

4. Automatically set to INCLUDE_IN_CODE_GEN = TRUE
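
The required and optional inputs can be captured in a small validation sketch. It is a client-side model only and does not call sp_upsert_virtual_datasource_attribute, whose parameters are not documented on this page. The class VirtualAttributeSpec, its field names, and the example type values ("string", "VARCHAR") are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VirtualAttributeSpec:
    # Required inputs, per the list above.
    attribute_name: str
    source_data_type: str
    snowflake_data_type: str
    # Optional inputs.
    parent_array: Optional[str] = None
    transformation_expression: Optional[str] = None
    w_question_category: Optional[str] = None
    # Set automatically, so callers cannot pass it to the constructor.
    include_in_code_gen: bool = field(default=True, init=False)

    def __post_init__(self) -> None:
        for name in ("attribute_name", "source_data_type", "snowflake_data_type"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} is required")


spec = VirtualAttributeSpec(
    attribute_name="full_name",
    source_data_type="string",        # assumed value, for illustration
    snowflake_data_type="VARCHAR",
    transformation_expression="first_name || ' ' || last_name",
)
```

Validating the three required fields before calling the UI or API surfaces missing inputs early. Keeping include_in_code_gen out of the constructor mirrors the fact that it is set automatically.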


Comparison

| Feature | Discovered | Schema | Virtual |
| --- | --- | --- | --- |
| Source | Full data scan | Schema sample | User-created |
| Polymorphic Versions | All 7 created | All 7 created | Single version |
| Can Delete | No (mark inactive) | No (mark inactive) | Yes |
| Use Case | Production discovery | Rapid prototyping | Custom logic |
| Update Method | Re-scan | Re-scan or update | UI/API update |
