DVRF™: Moving Beyond the Medallion Architecture in the Age of GenAI -

Best Practices

Configuration recommendations for attributes, including data type selection, transformation expressions, and performance optimization.

Attribute Configuration

After scanning:

Review discovered attributes and verify inferred DATA_PLATFORM_DATA_TYPE
Check for unexpected polymorphic version activations
Review HAS_EMBEDDED_CONTENT = TRUE attributes (stringified JSON)

Data type configuration:

Adjust DATA_PLATFORM_DATA_TYPE_PRECISION and DATA_PLATFORM_DATA_TYPE_SCALE for numeric types
Set DATA_PLATFORM_DATA_TYPE_DATETIMEFORMAT when USE_DATETIME_FORMAT = TRUE for accurate parsing
Use ALIAS_NAME for readable column names (overrides POLYMORPHIC_ATTRIBUTE_NAME)

Null handling:

Set NULL_VALUE_EXPRESSION (only applies when TRANSFORMATION_TYPE = 'No Transformation')
Use {attribute_name} placeholder in expressions

Transformations:

Use TRANSFORMATION_EXPRESSION when TRANSFORMATION_TYPE = 'SQL Expression'
Document in TRANSFORMATION_EXPRESSION_COMMENT

Virtual Attributes

Naming: No spaces allowed in attribute names. Names appear in generated SQL as CODE_GENERATED_COLUMN_NAME.

Transformation expressions:

Required field; must be valid SQL
Can reference other attributes using {attribute_name} placeholder
Automatically set to INCLUDE_IN_CODE_GEN = TRUE

Semantic model integration:

Set W_QUESTION_CATEGORY for Cortex Analyst (values: 'dimensions', 'facts', 'filters', 'metrics', 'time_dimensions')
Link to parent arrays when creating array-level virtual attributes

Polymorphic Versions

Review versions:

Check all 7 versions for each attribute
Monitor VERSION_STATUS_DATE to track schema evolution
Set INCLUDE_IN_CODE_GEN = FALSE for versions you don't need

Handling polymorphism:

Use transformation expressions to handle type variations
Consider data quality implications when multiple versions are active
Monitor newly activated versions after scans

Version management:

Inactive versions are ready for activation (don't delete)
Only active versions with INCLUDE_IN_CODE_GEN = TRUE generate columns

Performance

Code generation:

Set INCLUDE_IN_CODE_GEN = FALSE for unused attributes
Reduces Dynamic Table size and refresh time
Child attributes only generated if PARENT_INCLUDE_IN_CODE_GEN = TRUE

Clustering:

Set ADD_TO_CLUSTER_BY = TRUE for frequently queried attributes
Improves query performance on large Dynamic Tables
Balance with refresh performance impact

Data type sizing:

Set appropriate DATA_PLATFORM_DATA_TYPE_PRECISION and DATA_PLATFORM_DATA_TYPE_SCALE
Avoid over-sizing numeric types (wastes storage)

PreviousAttributes in Code Generation NextIntegration & API

Last updated 2 months ago

Was this helpful?