Attribute Metadata Details
Complete reference for attribute metadata, covering discovered source schema fields and configurable extended metadata that controls SQL transformation, security policies, and code generation.
Overview
Attributes contain extensive metadata that controls how they're transformed into SQL. This metadata is stored in two main tables:
core.datasource_attribute- Core attribute informationcore.datasource_attribute_polymorphic_version- Version-specific metadata
DataPancake discovers the source schema attributes and generates metadata based on those attributes, then creates additional extended metadata to allow you to provide additional configuration information for flattening and normalization, security policy integration, schema transformation, and data enrichment.
Source Schema vs. Extended Metadata
Source Schema Metadata:
Automatically discovered during scanning
Represents what DataPancake found in your data
Includes paths, data types, sample values, and structure information
Generally read-only (except for record status)
Extended Metadata:
User-configurable fields for transformation
Controls how attributes become SQL columns
Includes data type mappings, transformations, security policies, and semantic layer configurations
Fully editable to customize pipeline behavior
Metadata Organization
The Pipeline Designer organizes attribute metadata into logical sections:
Source Schema Metadata - Discovered attributes, paths, types, and structure information
Destination Schema Metadata - Snowflake data type configuration and key identification
Transformation & Security Metadata - Column naming, transformations, semantic layer, and security policies
Arrays & Nested Structures Metadata - Array-specific configuration and relationships
Foreign Key Metadata - Foreign key relationships for array tables
Core Attribute Fields
These fields are stored in core.datasource_attribute and apply to all polymorphic versions of an attribute:
Attribute Path (ATTRIBUTE_PATH)
The full path to the attribute in the source data
Example:
customer.contact.emailUsed to reference the attribute in source queries
Attribute Path Embedded (ATTRIBUTE_PATH_EMBEDDED)
Path for attributes found within embedded/stringified JSON
Tracks nested JSON structures within string fields
Attribute Name (ATTRIBUTE_NAME)
The leaf name of the attribute
Example: For path
customer.contact.email, the name isemail
Attribute Level (ATTRIBUTE_LEVEL)
The nesting depth of the attribute
Root level attributes have level 0
Each nested level increments the count
Attribute Order (ATTRIBUTE_ORDER)
Ordering information for attributes at the same level
Used for consistent presentation
Parent Object (PARENT_OBJECT)
The parent object path containing this attribute
Example: For
customer.contact.email, parent object iscustomer.contact
Parent Array (PARENT_ARRAY)
The parent array path if this attribute is within an array
Example: For
orders[0].order_id, parent array isordersUsed for foreign key relationship configuration
Parent Array Embedded (PARENT_ARRAY_EMBEDDED)
Parent array path for attributes within embedded JSON arrays
Record Status (RECORD_STATUS)
'active'- Attribute is active and can be used'inactive'- Attribute is deactivated
Attribute Type (ATTRIBUTE_TYPE)
'Discovered'- Found during scanning'Schema'- Created from schema sample'Virtual'- User-created custom attribute
Attribute Schema Component Type (ATTRIBUTE_SCHEMA_COMPONENT_TYPE)
Classifies the schema component type
Used for internal organization
Is Primary Key (IS_PRIMARY_KEY)
Boolean indicating if this attribute is part of the primary key
Used for metadata table generation and relationship tracking
Date Active (VERSION_STATUS_DATE)
Timestamp when the attribute was first discovered or created
Used for tracking attribute lifecycle
Polymorphic Version Fields
These fields are stored in core.datasource_attribute_polymorphic_version and are specific to each polymorphic version:
Type Information:
SOURCE_DATA_TYPE- The inferred data type from source dataPOLYMORPHIC_ATTRIBUTE_NAME- Unique name for this versionVERSION_STATUS- Active or inactive statusIS_ARRAY- Boolean indicating if this version is an array typeARRAY_TYPE- Object, primitive, or polymorphic array typeARRAY_PRIMITIVE_TYPE- Data type of primitive array elementsHAS_EMBEDDED_CONTENT- Boolean for embedded/stringified JSONSAMPLE_VALUE- Sample value from source data
Snowflake Data Type Configuration:
DATA_PLATFORM_DATA_TYPE- Snowflake data typeDATA_PLATFORM_DATA_TYPE_PRECISION- Precision for numeric typesDATA_PLATFORM_DATA_TYPE_SCALE- Scale for numeric typesDATA_PLATFORM_DATA_TYPE_DATETIMEFORMAT- DateTime format stringsUSE_DATETIME_FORMAT- Boolean for using datetime formats
Transformation & Naming:
ALIAS_NAME- Custom column name aliasCODE_GENERATED_COLUMN_NAME- Final column name in generated SQLNULL_VALUE_EXPRESSION- SQL expression for null handlingTRANSFORMATION_TYPE- No Transformation or SQL ExpressionTRANSFORMATION_EXPRESSION- Custom SQL transformationTRANSFORMATION_EXPRESSION_COMMENT- Documentation for transformation
Semantic Layer:
INCLUDE_IN_SEMANTIC_LAYER- Boolean for including in viewsSEMANTIC_LAYER_EXPRESSION- Additional transformation for viewsSEMANTIC_LAYER_EXPRESSION_COMMENT- Documentation for view transformationSEMANTIC_LAYER_ALIAS_NAME- Custom alias for views
Security:
INCLUDE_IN_SECURITY_ROW_LEVEL_POLICY- Boolean for row-level securityMASKING_POLICY_NAME- Name of masking policyMASKING_POLICY_PARAMETERS- Parameters for masking policy
Schema Consolidation:
SCHEMA_INSERT_REGULAR_EXPRESSION_SEARCH- Regex pattern for consolidationSCHEMA_INSERT_SQL_EXPRESSION- SQL expression for consolidation
Code Generation:
INCLUDE_IN_CODE_GEN- Boolean for including in code generationPARENT_INCLUDE_IN_CODE_GEN- Boolean for parent inclusion
Additional Metadata:
IS_ENUM- Boolean for enum valuesHAS_ALL_UNIQUE_VALUES- Boolean for unique valuesHAS_ALL_NULL_VALUES- Boolean for null valuesADD_TO_CLUSTER_BY- Boolean for CLUSTER BY clause
Metadata Workflow
Discovery - Source schema metadata is automatically discovered during scanning
Configuration - Extended metadata is configured in the Pipeline Designer
Code Generation - Metadata drives SQL code generation for Dynamic Tables and Views
Materialization - Generated SQL creates materialized tables with configured transformations
Semantic Layer - Additional transformations applied in views for analytics
Quick Reference
Editable in Source Schema Tab:
Attribute Record Status
Editable in Destination Schema Tab:
Snowflake Data Type
Precision and Scale
DateTime Format
Use DateTime Format
Primary Key
Contains Enum Values
Unique Values
Editable in Transformation/Security Tab:
Materialized Alias
Null Value Expression
Transformation Type and Expression
Schema Consolidation fields
Include in View
View Transformation Expression
View Alias
Row Access Policy
Masking Policy
Editable in Arrays Section:
Include in Code Gen
Materialized Table Alias Name
View Alias Name
Relationship Name, Description, Type, Join Type
View Row Access Policy
Editable in Foreign Keys Section:
Parent Attribute Name
Foreign Key Column Alias
Add to Row Access Policy
Last updated
Was this helpful?