Source Schema
Read-only metadata discovered during scanning, including attribute paths, nesting levels, data types, structure information, and sample values representing what DataPancake found in your source data.
Overview
Source Schema metadata represents what DataPancake discovered during scanning. Every field is read-only except RECORD_STATUS. Together the fields describe the structure, data types, and content of each attribute.
Source Schema Fields
Path Information
Attribute Path (ATTRIBUTE_PATH)
The full path to the attribute in the source data
Example: customer.contact.email
Used to reference the attribute in source queries
Read-only - Set during discovery
Attribute Path Embedded (ATTRIBUTE_PATH_EMBEDDED)
Path for attributes found within embedded/stringified JSON
Tracks nested JSON structures within string fields
Example: For JSON stored as a string, this tracks the path within that JSON
Read-only - Set during discovery
Attribute Name (ATTRIBUTE_NAME)
The leaf name of the attribute
Example: For path customer.contact.email, the name is email
Read-only - Extracted from attribute path
Structure Information
Attribute Level (ATTRIBUTE_LEVEL)
The nesting depth of the attribute
Root level attributes have level 0
Each nested level increments the count
Example: customer = level 0, customer.contact = level 1, customer.contact.email = level 2
Read-only - Calculated during discovery
Attribute Order (ATTRIBUTE_ORDER)
Ordering information for attributes at the same level
Used for consistent presentation in the UI
Read-only - Set during discovery
Parent Object (PARENT_OBJECT)
The parent object path containing this attribute
Example: For customer.contact.email, parent object is customer.contact
Empty for root-level attributes
Read-only - Set during discovery
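The relationship between Attribute Name, Attribute Level, and Parent Object can be sketched in a few lines. This is only an illustration, not DataPancake's implementation: the helper name describe_path is hypothetical, and it assumes a plain dot-delimited path with no array indices.

```python
def describe_path(attribute_path: str) -> dict:
    """Derive name, level, and parent object from a dot-delimited path.

    Assumption: segments are separated by '.' and contain no array indices.
    """
    parts = attribute_path.split(".")
    return {
        "ATTRIBUTE_NAME": parts[-1],            # leaf segment of the path
        "ATTRIBUTE_LEVEL": len(parts) - 1,      # root-level attributes are level 0
        "PARENT_OBJECT": ".".join(parts[:-1]),  # empty string for root-level attributes
    }


for path in ("customer", "customer.contact", "customer.contact.email"):
    print(path, describe_path(path))
```

For customer.contact.email this yields the name email, level 2, and parent object customer.contact, matching the examples above.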
Parent Array (PARENT_ARRAY)
The parent array path if this attribute is within an array
Example: For orders[0].order_id, parent array is orders
Used for foreign key relationship configuration
Empty for non-array attributes
Read-only - Set during discovery
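A rough sketch of how a parent array could be read off an indexed path such as orders[0].order_id. The function parent_array is hypothetical. The handling of arrays nested inside arrays (nearest enclosing array, indices stripped) is an assumption, since the text above only covers the single-array case.

```python
import re

# Matches a numeric array index such as [0] or [12].
_INDEX = re.compile(r"\[\d+\]")


def parent_array(attribute_path: str) -> str:
    """Return the path of the nearest enclosing array, or '' if there is none."""
    last = None
    for match in _INDEX.finditer(attribute_path):
        last = match
    if last is None:
        return ""  # non-array attribute
    # Keep everything before the last index, then drop any earlier indices.
    return _INDEX.sub("", attribute_path[: last.start()])


print(parent_array("orders[0].order_id"))
print(parent_array("customer.contact.email"))
```

The first call returns orders and the second returns an empty string, in line with the "Empty for non-array attributes" rule.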
Parent Array Embedded (PARENT_ARRAY_EMBEDDED)
Parent array path for attributes within embedded JSON arrays
Tracks arrays within stringified JSON
Read-only - Set during discovery
Type Information
Source Data Type (SOURCE_DATA_TYPE)
Inferred data type from source data
Values: 'str', 'int', 'float', 'bool', 'object', 'array', 'null'
Read-only - Inferred during scanning
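To make the seven type labels concrete, here is a minimal sketch that maps parsed JSON values onto them. The function infer_source_type is hypothetical, and DataPancake's actual inference rules may differ. Note that bool is checked before int because Python treats bool as a subclass of int.

```python
import json


def infer_source_type(value) -> str:
    """Map a parsed JSON value to one of the documented type labels."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # must come before the int check
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "str"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    raise TypeError(f"unsupported value: {value!r}")


record = json.loads('{"email": "a@b.c", "age": 31, "score": 4.5, "vip": true, "tags": [], "note": null}')
for key, value in record.items():
    print(key, infer_source_type(value))
```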
Polymorphic Attribute Name (POLYMORPHIC_ATTRIBUTE_NAME)
Unique name for this polymorphic version
Format: {attribute_name}_{type} or {attribute_name}_array_{array_type}
Examples: email_str, price_float, orders_array_object
Read-only - Generated based on source data type
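The naming format can be expressed as a small helper. This is a sketch of the documented pattern only. The function polymorphic_name is hypothetical, and how the mixed array type 'primitive,object' is rendered in a name is not stated here, so the example avoids it.

```python
from typing import Optional


def polymorphic_name(attribute_name: str, source_data_type: str,
                     array_type: Optional[str] = None) -> str:
    """Build a name following {attribute_name}_{type} or
    {attribute_name}_array_{array_type}."""
    if source_data_type == "array":
        if array_type is None:
            raise ValueError("array_type is required for array attributes")
        return f"{attribute_name}_array_{array_type}"
    return f"{attribute_name}_{source_data_type}"


print(polymorphic_name("email", "str"))
print(polymorphic_name("price", "float"))
print(polymorphic_name("orders", "array", "object"))
```

These three calls reproduce the examples email_str, price_float, and orders_array_object.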
Array Type (ARRAY_TYPE)
For array attributes, the type of array elements
Values: 'object', 'primitive', 'primitive,object'
Only applicable when SOURCE_DATA_TYPE = 'array'
Read-only - Determined during discovery
Array Primitive Type (ARRAY_PRIMITIVE_TYPE)
For primitive arrays, the data type of array elements
Values: 'str', 'int', 'float', 'bool'
Only applicable when ARRAY_TYPE = 'primitive' or ARRAY_TYPE = 'primitive,object'
Read-only - Determined during discovery
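One way to picture how Array Type and Array Primitive Type relate is to classify the elements of a sample array. The sketch below is illustrative only. The function classify_array is hypothetical, and its treatment of empty arrays, null elements, nested arrays, and arrays with several primitive types (all reported as None) is an assumption rather than documented behavior.

```python
# Order matters: bool is a subclass of int in Python, so check it first.
_PRIMITIVES = ((bool, "bool"), (int, "int"), (float, "float"), (str, "str"))


def _primitive_name(value):
    """Return the primitive type label for value, or None if it is not primitive."""
    for py_type, name in _PRIMITIVES:
        if isinstance(value, py_type):
            return name
    return None


def classify_array(items: list):
    """Return (array_type, array_primitive_type) for a list of parsed JSON values."""
    has_object = any(isinstance(item, dict) for item in items)
    names = {_primitive_name(item) for item in items} - {None}
    if has_object and names:
        array_type = "primitive,object"
    elif has_object:
        array_type = "object"
    elif names:
        array_type = "primitive"
    else:
        array_type = None  # assumption: nothing to classify
    # Assumption: report a primitive type only when it is unambiguous.
    primitive_type = next(iter(names)) if len(names) == 1 else None
    return array_type, primitive_type


print(classify_array([{"id": 1}, {"id": 2}]))
print(classify_array(["red", "green"]))
print(classify_array([1, {"id": 2}]))
```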
Content Information
Sample Value (SAMPLE_VALUE)
A sample value from the source data
For string types, defaults to "string value" for privacy/security
Helps users understand the data content
Read-only - Captured during scanning
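The string-masking default can be pictured as follows. This is a sketch, not the product's logic: the function sample_value is hypothetical, and keeping non-string values unchanged is an assumption.

```python
def sample_value(value):
    """Return a display-safe sample: strings are replaced by a fixed placeholder."""
    if isinstance(value, str):
        return "string value"  # documented default for string types
    return value  # assumption: other types are kept as captured


print(sample_value("jane@example.com"))
print(sample_value(42))
```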
Has Embedded Content (HAS_EMBEDDED_CONTENT)
Boolean indicating if this attribute contains embedded/stringified JSON
Triggers recursive parsing of the embedded content
When TRUE, DataPancake recursively parses the JSON string to discover nested attributes
Read-only - Detected during scanning
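The idea of detecting stringified JSON and then walking it recursively can be sketched as below. Both helpers, parse_embedded and embedded_paths, are hypothetical. The rule that only strings decoding to an object or array count as embedded content is an assumption, and DataPancake's own detection may be stricter or looser.

```python
import json


def parse_embedded(value):
    """Return the parsed document if value is a string holding a JSON
    object or array, otherwise None."""
    if not isinstance(value, str):
        return None
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, (dict, list)) else None


def embedded_paths(value, prefix=""):
    """Recursively collect dotted attribute paths, descending into any
    string that itself contains embedded JSON."""
    paths = []
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.append(path)
            nested = parse_embedded(child)
            paths.extend(embedded_paths(child if nested is None else nested, path))
    elif isinstance(value, list):
        for child in value:
            for path in embedded_paths(child, prefix):
                if path not in paths:  # array elements often repeat the same keys
                    paths.append(path)
    return paths


payload = '{"device": {"os": "ios"}, "tags": ["a", "b"]}'
document = parse_embedded(payload)
print(document is not None)      # stands in for HAS_EMBEDDED_CONTENT
print(embedded_paths(document))  # stands in for ATTRIBUTE_PATH_EMBEDDED values
```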
Attribute Classification
Attribute Type (ATTRIBUTE_TYPE)
The origin of the attribute
Values:
'Discovered' - Found during scanning
'Schema' - Created from schema sample
'Virtual' - User-created custom attribute
Read-only - Set based on how attribute was created
Attribute Schema Component Type (ATTRIBUTE_SCHEMA_COMPONENT_TYPE)
Classifies the schema component type
Used for internal organization and categorization
Read-only - Set during discovery
Version Status Date (VERSION_STATUS_DATE)
Timestamp when polymorphic version was created or last activated
Used for tracking schema evolution
Read-only - Set when version is created/activated
Status Control
Attribute Record Status (RECORD_STATUS)
Controls whether attribute is active or inactive
Values: 'active', 'inactive'
Editable - Only editable field in Source Schema tab
Active attributes included in code generation (if INCLUDE_IN_CODE_GEN = TRUE)
Inactive attributes excluded from code generation
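The interaction between RECORD_STATUS and INCLUDE_IN_CODE_GEN amounts to a two-condition filter. The sketch below uses made-up rows and a hypothetical helper, included_in_code_gen, purely to illustrate the rule stated above.

```python
# Hypothetical metadata rows, for illustration only.
attributes = [
    {"ATTRIBUTE_PATH": "customer.id", "RECORD_STATUS": "active", "INCLUDE_IN_CODE_GEN": True},
    {"ATTRIBUTE_PATH": "customer.notes", "RECORD_STATUS": "inactive", "INCLUDE_IN_CODE_GEN": True},
    {"ATTRIBUTE_PATH": "customer.debug", "RECORD_STATUS": "active", "INCLUDE_IN_CODE_GEN": False},
]


def included_in_code_gen(rows):
    """Keep rows that are active and flagged for code generation."""
    return [
        row for row in rows
        if row["RECORD_STATUS"] == "active" and row["INCLUDE_IN_CODE_GEN"]
    ]


print([row["ATTRIBUTE_PATH"] for row in included_in_code_gen(attributes)])
```

Only customer.id passes both conditions. The inactive row is excluded regardless of its INCLUDE_IN_CODE_GEN flag.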
Virtual Attributes
User-created custom attributes (created via UI or sp_upsert_virtual_datasource_attribute).
Required fields:
Attribute Name (no spaces)
Source Data Type
Snowflake Data Type
Transformation Expression (SQL)
Optional fields:
Parent Array (for array-level virtual attributes)
W_QUESTION_CATEGORY (for Cortex Analyst semantic models)
Description
Characteristics:
Single polymorphic version (no polymorphism)
Automatically set to INCLUDE_IN_CODE_GEN = TRUE
ATTRIBUTE_TYPE = 'Virtual'
Can reference other attributes using {attribute_name} placeholder in expressions
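As a rough picture of what a {attribute_name} placeholder means in a transformation expression, the sketch below substitutes each placeholder with a column reference. How DataPancake actually resolves placeholders is not described here. The function expand_expression, the column mapping, and the sample expression are all hypothetical.

```python
import re

# Matches placeholders of the form {attribute_name}.
_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def expand_expression(expression: str, columns: dict) -> str:
    """Replace each {attribute_name} placeholder with its mapped column reference."""
    def replace(match):
        name = match.group(1)
        if name not in columns:
            raise KeyError(f"unknown attribute: {name}")
        return columns[name]

    return _PLACEHOLDER.sub(replace, expression)


# Hypothetical mapping from attribute names to generated column names.
columns = {"first_name": "FIRST_NAME", "last_name": "LAST_NAME"}
print(expand_expression("{first_name} || ' ' || {last_name}", columns))
```

Raising on an unknown name is a design choice for the sketch: a misspelled placeholder fails loudly instead of passing through into generated SQL.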
Common Scenarios
Understanding nested structure:
Use ATTRIBUTE_LEVEL (0 = root, increments per level) and PARENT_OBJECT to understand nesting
Identifying embedded JSON:
Check HAS_EMBEDDED_CONTENT = TRUE for stringified JSON
ATTRIBUTE_PATH_EMBEDDED shows nested structure within string fields
Working with arrays:
ARRAY_TYPE indicates 'object', 'primitive', or 'primitive,object'
ARRAY_PRIMITIVE_TYPE (if applicable) shows element type
PARENT_ARRAY shows containing array for nested attributes
Deactivating attributes:
Set RECORD_STATUS = 'inactive' to exclude from code generation without deleting
Useful for temporarily excluding, preserving history, or testing