Attribute Metadata Types

Overview of the three attribute types and where they come from: Discovered (from scans), Schema (from samples), and Virtual (user-created).

DataPancake categorizes attributes into three types based on their origin:


Discovered Attributes

Definition: Attributes automatically discovered during the scanning process.

Characteristics:

  • Created when ATTRIBUTE_CREATE_TYPE = 'Discover' in scan configuration

  • Represent actual fields found in the source data

  • Cannot be deleted; they can only be deactivated by setting RECORD_STATUS = 'inactive'

  • All 7 polymorphic versions created proactively

Creation: Discovered attributes are created during the scan process via sp_upsert_attribute. All 7 polymorphic versions are created in core.datasource_attribute_polymorphic_version, but only the versions that match the scanned data are set to VERSION_STATUS = 'active'.
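The activation rule can be modeled with a short Python sketch. It is an illustration of the behavior described above, not the logic of sp_upsert_attribute. The seven type names, the 'inactive' status value, the ATTRIBUTE_NAME and VERSION_TYPE keys, and the function name build_polymorphic_versions are all assumptions; this page only states that 7 versions exist and that matching ones are set to 'active'.

```python
# Hypothetical names for the 7 polymorphic versions; the real names are not
# listed on this page.
ASSUMED_VERSION_TYPES = (
    "string", "integer", "float", "boolean", "object", "array", "null",
)


def build_polymorphic_versions(attribute_name, observed_types):
    """Return one version record per assumed type.

    Every version is created up front. Only the types observed in the
    scanned data are marked 'active'; the 'inactive' value used for the
    rest is an assumption.
    """
    observed = set(observed_types)
    unknown = observed - set(ASSUMED_VERSION_TYPES)
    if unknown:
        raise ValueError(f"unknown types: {sorted(unknown)}")
    return [
        {
            "ATTRIBUTE_NAME": attribute_name,  # hypothetical key
            "VERSION_TYPE": version_type,      # hypothetical key
            "VERSION_STATUS": "active" if version_type in observed else "inactive",
        }
        for version_type in ASSUMED_VERSION_TYPES
    ]


# A field that was seen as both an integer and a string in the source data.
versions = build_polymorphic_versions("price", ["integer", "string"])
active = [v["VERSION_TYPE"] for v in versions if v["VERSION_STATUS"] == "active"]
print(len(versions), active)
```

Creating every version up front means a later scan that encounters a new type for the same field only needs to flip a status rather than insert a new row.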


Schema Attributes

Definition: Attributes created from a schema sample rather than full data scanning.

Characteristics:

  • Created when ATTRIBUTE_CREATE_TYPE = 'Schema' in scan configuration

  • Based on the schema sample stored in the DATASOURCE_OBJECT_SCHEMA_SAMPLE field

  • All 7 polymorphic versions created proactively

  • Can be updated when full scans are performed

Use Cases: Rapid prototyping without full data scans; testing configurations with sample schemas.


Virtual Attributes

Definition: User-created custom attributes that don't exist in the source data.

Characteristics:

  • Created via UI or sp_upsert_virtual_datasource_attribute stored procedure

  • Single polymorphic version (no polymorphism)

  • Automatically set to INCLUDE_IN_CODE_GEN = TRUE

  • ATTRIBUTE_TYPE = 'Virtual'

Required fields:

  • Attribute name (no spaces)

  • Source data type

  • Snowflake data type

  • Transformation expression (SQL)

Optional fields:

  • Parent array (for array-level virtual attributes)

  • W_QUESTION_CATEGORY (for Cortex Analyst semantic models)
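As a rough illustration of the required and optional fields above, the following Python sketch checks a virtual attribute definition before it is submitted. The class name, field names, and validate function are hypothetical and do not reflect the parameters of sp_upsert_virtual_datasource_attribute; the 'string' source type in the usage lines is also just a placeholder value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VirtualAttributeRequest:
    # Required fields
    attribute_name: str
    source_data_type: str
    snowflake_data_type: str
    transformation_expression: str
    # Optional fields
    parent_array: Optional[str] = None
    w_question_category: Optional[str] = None


def validate(request: VirtualAttributeRequest) -> List[str]:
    """Return a list of problems; an empty list means the request looks complete."""
    problems = []
    required = {
        "attribute_name": request.attribute_name,
        "source_data_type": request.source_data_type,
        "snowflake_data_type": request.snowflake_data_type,
        "transformation_expression": request.transformation_expression,
    }
    for field_name, value in required.items():
        if not value or not value.strip():
            problems.append(f"{field_name} is required")
    # The attribute name must not contain spaces.
    if any(ch.isspace() for ch in (request.attribute_name or "")):
        problems.append("attribute_name must not contain spaces")
    return problems


ok = VirtualAttributeRequest(
    "full_name", "string", "VARCHAR", "first_name || ' ' || last_name"
)
bad = VirtualAttributeRequest("full name", "string", "VARCHAR", "")
print(validate(ok))
print(validate(bad))
```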

Use Cases:

  • Derived fields (e.g., full_name = first_name || ' ' || last_name)

  • Semantic model metrics/filters/dimensions

  • Business logic not in source data
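To see what the derived-field expression produces, the sketch below evaluates it with Python's built-in sqlite3 module. SQLite is only a local stand-in for Snowflake here (both support the || concatenation operator), and the people table and its rows are made up for the example.

```python
import sqlite3

# The transformation expression from the derived-field use case.
expression = "first_name || ' ' || last_name"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ada", "Lovelace"), ("Alan", "Turing")],
)

# Apply the expression the way a generated view or query might.
rows = conn.execute(
    f"SELECT {expression} AS full_name FROM people ORDER BY rowid"
).fetchall()
full_names = [row[0] for row in rows]
conn.close()

print(full_names)  # ['Ada Lovelace', 'Alan Turing']
```

If either column can be NULL, the || result is NULL as well, so a real expression may need COALESCE around each operand.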


Comparison

Feature                Discovered          Schema              Virtual
Source                 Full data scan      Schema sample       User-created
ATTRIBUTE_CREATE_TYPE  'Discover'          'Schema'            N/A
Polymorphic Versions   All 7 created       All 7 created       Single version
Can Delete             No (mark inactive)  No (mark inactive)  Yes
Update Method          Re-scan             Re-scan or update   UI/API update
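The "Can Delete" row of the comparison can be expressed as a small lookup, which may be handy in tooling that manages attributes. This is a sketch, not a DataPancake API: the function name removal_action is hypothetical, and it assumes Schema attributes are deactivated through the same RECORD_STATUS field that is documented for Discovered attributes.

```python
# Encodes the "Can Delete" rule from the comparison.
DELETABLE = {"Discovered": False, "Schema": False, "Virtual": True}


def removal_action(attribute_type):
    """Return how an attribute of the given type can be taken out of use."""
    if attribute_type not in DELETABLE:
        raise ValueError(f"unknown attribute type: {attribute_type!r}")
    if DELETABLE[attribute_type]:
        return "delete"
    # Assumption: Schema attributes use RECORD_STATUS like Discovered ones.
    return "set RECORD_STATUS = 'inactive'"


for attribute_type in ("Discovered", "Schema", "Virtual"):
    print(attribute_type, "->", removal_action(attribute_type))
```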
