Attribute Types

This page describes the three attribute types, Discovered (from scans), Schema (from samples), and Virtual (user-created), and when to use each.

DataPancake categorizes attributes into three types based on their origin:

Discovered Attributes

Definition: Attributes automatically discovered during the scanning process.

Characteristics:

Created when ATTRIBUTE_CREATE_TYPE is set to 'Discover' in the scan configuration
Represent actual fields found in your source data
Cannot be deleted (only marked inactive)
Form the foundation of your schema discovery
All 7 polymorphic versions are proactively created when the attribute is first discovered

Use Cases:

Standard schema discovery workflows
When you want to discover all attributes from actual data

Creation Process:

Created during scan process

Attribute record created in datasource_attribute table

All 7 polymorphic versions proactively created in datasource_attribute_polymorphic_version table

Only the version(s) matching the discovered data type are activated (status = 'active')

Non-matching versions are created but remain inactive (status = 'inactive')

Initial status dates set for activated versions
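
The creation steps above can be sketched as a small in-memory model. This is only an illustration, not DataPancake's implementation: the `PolymorphicVersion` class, the `create_versions` function, and the `type_1` to `type_7` labels are hypothetical, since this page does not name the 7 version types.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Placeholder labels: the page states there are 7 polymorphic versions
# but does not name them, so these identifiers are hypothetical.
VERSION_TYPES = [f"type_{i}" for i in range(1, 8)]


@dataclass
class PolymorphicVersion:
    attribute_name: str
    version_type: str
    status: str                      # 'active' or 'inactive'
    status_date: Optional[datetime]  # set only for activated versions


def create_versions(attribute_name, discovered_types, now=None):
    """Create all 7 versions up front and activate only the matching ones."""
    now = now or datetime.now(timezone.utc)
    versions = []
    for version_type in VERSION_TYPES:
        is_match = version_type in discovered_types
        versions.append(
            PolymorphicVersion(
                attribute_name=attribute_name,
                version_type=version_type,
                status="active" if is_match else "inactive",
                status_date=now if is_match else None,
            )
        )
    return versions


# Hypothetical attribute whose scanned values matched one version type.
versions = create_versions("order_total", {"type_2"})
active = [v.version_type for v in versions if v.status == "active"]
```

In this model, every attribute always has 7 version records, and a later scan that finds a new data type only needs to flip an existing record to active rather than create one.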

Schema Attributes

Definition: Attributes created from a schema sample rather than full data scanning.

Characteristics:

Created when ATTRIBUTE_CREATE_TYPE is set to 'Schema' in the scan configuration
Based on a schema sample provided during data source creation
Useful for rapid prototyping without scanning full datasets
Can be updated when full scans are performed
All 7 polymorphic versions are proactively created when the attribute is first created

Use Cases:

Quick onboarding with schema samples
Prototyping pipelines before full data is available
Testing configurations with sample schemas

Creation Process:

Follows the same process as Discovered attributes, but is based on a schema sample rather than a full data scan

All 7 polymorphic versions are created upfront

Versions matching the schema sample are activated

Virtual Attributes

Definition: User-created custom attributes that don't exist in the source data.

Characteristics:

Created manually by users through the UI or API
Defined with transformation expressions
Used for derived fields, calculated metrics, and semantic model components
Can be used for filters, metrics, facts, and dimensions in Cortex Analyst semantic models
Only one polymorphic version created (no polymorphism for virtual attributes)
Automatically set to INCLUDE_IN_CODE_GEN = TRUE

Use Cases:

Derived fields - Calculate values from existing attributes (e.g., full_name = first_name || ' ' || last_name)
Semantic model metrics - Create aggregations for Cortex Analyst (e.g., total_revenue = SUM(order_total))
Semantic model filters - Define filterable dimensions
Business logic - Add computed columns not in source data
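
To make the two example expressions above concrete, the following sketch evaluates them against a few made-up rows. It uses Python's built-in SQLite as a stand-in for Snowflake, which is an assumption for illustration only; the `orders` table and its sample data are hypothetical and not part of DataPancake.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (first_name TEXT, last_name TEXT, order_total REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("Ada", "Lovelace", 120.0), ("Alan", "Turing", 80.5)],
)

# Derived field: full_name = first_name || ' ' || last_name
full_names = [
    row[0]
    for row in conn.execute(
        "SELECT first_name || ' ' || last_name AS full_name "
        "FROM orders ORDER BY rowid"
    )
]

# Semantic model metric: total_revenue = SUM(order_total)
total_revenue = conn.execute(
    "SELECT SUM(order_total) FROM orders"
).fetchone()[0]

print(full_names)     # ['Ada Lovelace', 'Alan Turing']
print(total_revenue)  # 200.5
```

The derived field produces one value per row, while the metric aggregates across rows. That is the practical difference between a computed column and a semantic model metric.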

Creating Virtual Attributes:

Must specify: attribute name, source data type, Snowflake data type
Can optionally specify: parent array, transformation expression, W_QUESTION_CATEGORY (for semantic models)
Created via UI or API (sp_upsert_virtual_datasource_attribute)

Creation Process:

Created via UI or API (sp_upsert_virtual_datasource_attribute)

User specifies name, types, and transformation expression

Single polymorphic version created (no polymorphism for virtual attributes)

Automatically set to INCLUDE_IN_CODE_GEN = TRUE
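
The required and optional inputs listed above can be modeled as a simple record with validation. This is a hedged sketch only: the `VirtualAttribute` class and its field names are hypothetical and do not reflect the actual signature of `sp_upsert_virtual_datasource_attribute`, which this page does not document. The `"string"` source data type value is also an assumed example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VirtualAttribute:
    # Required inputs, per the list above.
    name: str
    source_data_type: str
    snowflake_data_type: str
    # Optional inputs, per the list above.
    parent_array: Optional[str] = None
    transformation_expression: Optional[str] = None
    w_question_category: Optional[str] = None
    # Virtual attributes are automatically included in code generation.
    include_in_code_gen: bool = True
    # Virtual attributes get a single version (no polymorphism).
    polymorphic_version_count: int = 1

    def __post_init__(self):
        required = {
            "name": self.name,
            "source_data_type": self.source_data_type,
            "snowflake_data_type": self.snowflake_data_type,
        }
        missing = [key for key, value in required.items()
                   if not value or not value.strip()]
        if missing:
            raise ValueError(f"Missing required fields: {', '.join(missing)}")


# Hypothetical example based on the derived-field use case.
full_name = VirtualAttribute(
    name="full_name",
    source_data_type="string",
    snowflake_data_type="VARCHAR",
    transformation_expression="first_name || ' ' || last_name",
)
```

Constructing a `VirtualAttribute` with an empty name or data type raises `ValueError`, mirroring the "must specify" rule in the text.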

Comparison

| Feature | Discovered | Schema | Virtual |
| --- | --- | --- | --- |
| Source | Full data scan | Schema sample | User-created |
| Polymorphic Versions | All 7 created | All 7 created | Single version |
| Can Delete | No (mark inactive) | | Yes |
| Use Case | Production discovery | Rapid prototyping | Custom logic |
| Update Method | Re-scan | Re-scan or update | UI/API update |
