Attribute Metadata Details

Complete reference for attribute metadata, covering discovered source schema fields and configurable extended metadata that controls SQL transformation, security policies, and code generation.

Overview

Attribute metadata is stored in:

  • core.datasource_attribute - Core attribute information

  • core.datasource_attribute_polymorphic_version - Version-specific metadata

Source Schema Metadata: Automatically discovered during scanning (read-only except RECORD_STATUS).

Extended Metadata: User-configurable fields for transformation, security, and code generation.

Metadata Organization

Pipeline Designer organizes metadata by UI tabs:


Core Attribute Fields

Stored in core.datasource_attribute (applies to all polymorphic versions):

  • ATTRIBUTE_PATH - Full path in source data (e.g., customer.contact.email)

  • ATTRIBUTE_PATH_EMBEDDED - Path for attributes within embedded/stringified JSON

  • ATTRIBUTE_NAME - Leaf name (e.g., email from customer.contact.email)

  • ATTRIBUTE_LEVEL - Nesting depth (0 = root, increments per level)

  • ATTRIBUTE_ORDER - Ordering for attributes at same level

  • PARENT_OBJECT - Parent object path (e.g., customer.contact for customer.contact.email)

  • PARENT_ARRAY - Parent array path if within array (e.g., orders for orders[0].order_id)

  • PARENT_ARRAY_EMBEDDED - Parent array path for embedded JSON arrays

  • RECORD_STATUS - 'active' or 'inactive' (editable)

  • ATTRIBUTE_TYPE - 'Discovered', 'Schema', or 'Virtual'

  • ATTRIBUTE_SCHEMA_COMPONENT_TYPE - Internal classification

  • IS_PRIMARY_KEY - Boolean; true when the attribute is identified as a primary key
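
The relationship between ATTRIBUTE_PATH and the fields derived from it can be illustrated with a short sketch. This is not the scanner's actual logic: the function name split_attribute_path, the PathParts container, the name[index] notation for array elements, and the rule that indexes are stripped from parent paths are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: array elements appear in paths as name[index], e.g. orders[0].order_id.
_INDEX = re.compile(r"\[\d+\]")


@dataclass
class PathParts:
    attribute_name: str
    attribute_level: int
    parent_object: Optional[str]
    parent_array: Optional[str]


def split_attribute_path(attribute_path: str) -> PathParts:
    """Derive the path-related core fields from ATTRIBUTE_PATH (illustrative only)."""
    segments = attribute_path.split(".")
    # Leaf name, with any array index removed.
    attribute_name = _INDEX.sub("", segments[-1])
    # Assumption: the level equals the number of ancestors, so a root attribute is 0.
    attribute_level = len(segments) - 1
    # Assumption: array indexes are stripped from the parent path.
    parent_object = _INDEX.sub("", ".".join(segments[:-1])) or None
    # Nearest enclosing array: the closest ancestor segment that carries an index.
    parent_array = None
    for i in range(len(segments) - 2, -1, -1):
        if _INDEX.search(segments[i]):
            parent_array = _INDEX.sub("", ".".join(segments[: i + 1]))
            break
    return PathParts(attribute_name, attribute_level, parent_object, parent_array)
```

Under these assumptions, customer.contact.email yields the name email with parent object customer.contact, and orders[0].order_id yields the parent array orders, matching the examples in the list above.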


Polymorphic Version Fields

Stored in core.datasource_attribute_polymorphic_version (version-specific):

Type Information:

  • SOURCE_DATA_TYPE - Inferred type ('str', 'int', 'float', 'bool', 'object', 'array', 'null')

  • POLYMORPHIC_ATTRIBUTE_NAME - Unique version name (e.g., email_str, orders_array_object)

  • VERSION_STATUS - 'active' or 'inactive'

  • IS_ARRAY - Boolean; true when the attribute is an array

  • ARRAY_TYPE - 'object', 'primitive', or 'primitive,object'

  • ARRAY_PRIMITIVE_TYPE - Element type for primitive arrays ('str', 'int', 'float', 'bool')

  • HAS_EMBEDDED_CONTENT - Boolean for embedded/stringified JSON

  • SAMPLE_VALUE - Sample value (strings default to "string value" for privacy)

  • VERSION_STATUS_DATE - Timestamp when version was created/last activated
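
As a rough illustration of how the SOURCE_DATA_TYPE labels relate to parsed JSON values, and how a name such as email_str or orders_array_object could be formed, consider the sketch below. Both function names are hypothetical, and the naming convention is inferred only from the two examples above; the product may handle other cases (for instance the 'primitive,object' array type) differently.

```python
from typing import Any, Optional


def infer_source_data_type(value: Any) -> str:
    """Map a parsed JSON value to one of the SOURCE_DATA_TYPE labels."""
    if value is None:
        return "null"
    # bool must be tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "str"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported JSON value: {type(value).__name__}")


def polymorphic_attribute_name(
    attribute_name: str, source_data_type: str, array_type: Optional[str] = None
) -> str:
    """Assumed convention: <name>_<type>, with the array type appended for arrays."""
    if source_data_type == "array" and array_type:
        return f"{attribute_name}_array_{array_type}"
    return f"{attribute_name}_{source_data_type}"
```

Because one attribute can carry values of different types across records, each distinct inferred type would get its own polymorphic version under this scheme.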

Snowflake Data Type:

  • DATA_PLATFORM_DATA_TYPE - Snowflake data type (editable)

  • DATA_PLATFORM_DATA_TYPE_PRECISION - Precision for numeric types (editable)

  • DATA_PLATFORM_DATA_TYPE_SCALE - Scale for numeric types (editable)

  • DATA_PLATFORM_DATA_TYPE_DATETIMEFORMAT - DateTime format strings (editable)

  • USE_DATETIME_FORMAT - Boolean; whether the DateTime format strings are applied (editable)
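
A minimal sketch of how these five fields might combine into a Snowflake type conversion follows. The helper names render_data_type and render_conversion are hypothetical, and the generated SQL is only one plausible shape: the sketch assumes a single format string and uses Snowflake's TO_DATE, TO_TIME, and TO_TIMESTAMP functions, which accept a string expression and a format.

```python
from typing import Optional


def render_data_type(
    data_type: str, precision: Optional[int] = None, scale: Optional[int] = None
) -> str:
    """Render e.g. NUMBER(10,2) from DATA_PLATFORM_DATA_TYPE, precision, and scale."""
    if precision is None:
        return data_type
    if scale is None:
        return f"{data_type}({precision})"
    return f"{data_type}({precision},{scale})"


def render_conversion(
    column_expr: str,
    data_type: str,
    precision: Optional[int] = None,
    scale: Optional[int] = None,
    datetime_format: Optional[str] = None,
    use_datetime_format: bool = False,
) -> str:
    """Build a conversion expression for one column (illustrative only)."""
    if use_datetime_format and datetime_format:
        # Assumption: DATE and TIME use their own functions; other types use TO_TIMESTAMP.
        func = {"DATE": "TO_DATE", "TIME": "TO_TIME"}.get(data_type.upper(), "TO_TIMESTAMP")
        escaped = datetime_format.replace("'", "''")  # escape quotes in the SQL literal
        return f"{func}({column_expr}, '{escaped}')"
    return f"CAST({column_expr} AS {render_data_type(data_type, precision, scale)})"
```

For example, a NUMBER with precision 10 and scale 2 renders as CAST(... AS NUMBER(10,2)), while a DATE with USE_DATETIME_FORMAT enabled renders as TO_DATE(..., 'YYYY-MM-DD').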

Transformation & Naming:

  • ALIAS_NAME - Custom column alias (editable)

  • CODE_GENERATED_COLUMN_NAME - Final column name (read-only, auto-generated)

  • NULL_VALUE_EXPRESSION - SQL for null handling (editable, only when TRANSFORMATION_TYPE = 'No Transformation')

  • TRANSFORMATION_TYPE - 'No Transformation' or 'SQL Expression' (editable)

  • TRANSFORMATION_EXPRESSION - Custom SQL transformation (editable)

  • TRANSFORMATION_EXPRESSION_COMMENT - Comment documenting the transformation expression (editable)
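
The interplay between TRANSFORMATION_TYPE, TRANSFORMATION_EXPRESSION, and NULL_VALUE_EXPRESSION can be sketched as follows. The function render_column is hypothetical, and wrapping the source expression in COALESCE is an assumption about how the null-handling SQL is applied; the generated code may use a different construct.

```python
def render_column(meta: dict, source_expr: str) -> str:
    """Build one SELECT-list entry from a version's metadata (illustrative only)."""
    if meta.get("TRANSFORMATION_TYPE") == "SQL Expression":
        # The custom SQL replaces the source expression entirely.
        expr = meta["TRANSFORMATION_EXPRESSION"]
    else:
        expr = source_expr
        null_expr = meta.get("NULL_VALUE_EXPRESSION")
        if null_expr:
            # Assumption: the null-handling SQL acts as a fallback value.
            expr = f"COALESCE({expr}, {null_expr})"
    # CODE_GENERATED_COLUMN_NAME is taken as given; it is auto-generated and read-only.
    return f"{expr} AS {meta['CODE_GENERATED_COLUMN_NAME']}"
```

Note that the null-handling branch is only reached when no SQL expression is configured, which mirrors the restriction stated for NULL_VALUE_EXPRESSION.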

Semantic Layer:

  • INCLUDE_IN_SEMANTIC_LAYER - Boolean; whether the attribute is included in semantic layer views (editable)

  • SEMANTIC_LAYER_EXPRESSION - View-level transformation (editable)

  • SEMANTIC_LAYER_EXPRESSION_COMMENT - Comment documenting the view-level transformation (editable)

  • SEMANTIC_LAYER_ALIAS_NAME - View column alias (editable)
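
To make the semantic layer fields concrete, the sketch below assembles a view definition from a list of attribute versions. The function semantic_view_sql and the view and table names are hypothetical, and the fallback to CODE_GENERATED_COLUMN_NAME when no expression or alias is set is an assumption rather than documented behavior.

```python
def semantic_view_sql(view_name: str, table_name: str, attributes: list) -> str:
    """Assemble a CREATE VIEW statement from semantic layer metadata (illustrative only)."""
    columns = []
    for attr in attributes:
        if not attr.get("INCLUDE_IN_SEMANTIC_LAYER"):
            continue  # excluded attributes do not appear in the view
        base = attr["CODE_GENERATED_COLUMN_NAME"]
        # Assumption: fall back to the materialized column when nothing is configured.
        expr = attr.get("SEMANTIC_LAYER_EXPRESSION") or base
        alias = attr.get("SEMANTIC_LAYER_ALIAS_NAME") or base
        columns.append(f"    {expr} AS {alias}")
    if not columns:
        raise ValueError("no attributes are flagged for the semantic layer")
    select_list = ",\n".join(columns)
    return f"CREATE OR REPLACE VIEW {view_name} AS\nSELECT\n{select_list}\nFROM {table_name};"
```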

Security:

  • INCLUDE_IN_SECURITY_ROW_LEVEL_POLICY - Boolean; whether the attribute is included in the row-level security policy (editable)

  • MASKING_POLICY_NAME - Masking policy name (editable)

  • MASKING_POLICY_PARAMETERS - Policy parameters (editable)
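
As an illustration of how the masking fields could translate into Snowflake DDL, the sketch below builds an ALTER TABLE ... SET MASKING POLICY statement. The function masking_policy_statement is hypothetical, and treating MASKING_POLICY_PARAMETERS as a comma-separated list of additional column names is an assumption; the stored format is not specified here.

```python
from typing import Optional


def masking_policy_statement(
    table: str, column: str, policy_name: str, policy_parameters: Optional[str] = None
) -> str:
    """Build a statement attaching a masking policy to a column (illustrative only)."""
    stmt = f"ALTER TABLE {table} MODIFY COLUMN {column} SET MASKING POLICY {policy_name}"
    # Assumption: parameters are extra column names separated by commas.
    extra = [p.strip() for p in (policy_parameters or "").split(",") if p.strip()]
    if extra:
        # USING lists the masked column first, followed by the conditional columns.
        stmt += f" USING ({', '.join([column] + extra)})"
    return stmt + ";"
```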

Schema Consolidation:

  • SCHEMA_INSERT_REGULAR_EXPRESSION_SEARCH - Regex pattern (editable)

  • SCHEMA_INSERT_SQL_EXPRESSION - SQL expression (editable)

Code Generation:

  • INCLUDE_IN_CODE_GEN - Boolean; whether the attribute is included in generated code (editable)

  • PARENT_INCLUDE_IN_CODE_GEN - Boolean; whether the parent is included in code generation (read-only)
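
One way to read these two flags together is as a filter over attribute versions. The sketch below assumes that a version is generated only when it is active, flagged for code generation, and its parent is also included; this combination rule and the function name selected_for_code_gen are assumptions for illustration.

```python
def selected_for_code_gen(versions: list) -> list:
    """Return the versions that would take part in code generation (illustrative only)."""
    return [
        v
        for v in versions
        if v.get("VERSION_STATUS") == "active"
        and v.get("INCLUDE_IN_CODE_GEN")
        # Assumption: a missing parent flag is treated as included (e.g. root attributes).
        and v.get("PARENT_INCLUDE_IN_CODE_GEN", True)
    ]
```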

Additional:

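  • IS_ENUM - Boolean; marks the attribute as holding enumerated values (editable)

  • HAS_ALL_UNIQUE_VALUES - Boolean; marks all values of the attribute as unique (editable)

  • HAS_ALL_NULL_VALUES - Boolean; true when all values of the attribute are null (read-only)

  • ADD_TO_CLUSTER_BY - Boolean; whether the column is added to the CLUSTER BY clause (editable)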
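
The effect of ADD_TO_CLUSTER_BY can be pictured with a small helper that collects the flagged columns into a clause. The function cluster_by_clause is hypothetical, and the column order (the order in which the versions are supplied) is an assumption.

```python
def cluster_by_clause(versions: list) -> str:
    """Build a CLUSTER BY clause from flagged versions, or an empty string if none."""
    columns = [
        v["CODE_GENERATED_COLUMN_NAME"] for v in versions if v.get("ADD_TO_CLUSTER_BY")
    ]
    return f"CLUSTER BY ({', '.join(columns)})" if columns else ""
```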


Metadata Workflow

  1. Discovery - Source schema metadata discovered during scanning

  2. Configuration - Extended metadata configured in Pipeline Designer

  3. Code Generation - Metadata drives SQL generation for Dynamic Tables and Views

  4. Materialization - Generated SQL creates materialized tables with transformations

  5. Semantic Layer - Additional transformations applied in views

Quick Reference

See individual pages for complete field details:
