Polymorphic Versions

How DataPancake proactively creates all 7 polymorphic versions for every attribute upfront, then activates only the versions that match discovered data types.

What Are Polymorphic Versions?

Polymorphic versions represent the different data types that can appear at the same attribute path. DataPancake creates all 7 polymorphic versions for an attribute when the attribute is first discovered, regardless of which type was found.


The 7 Polymorphic Versions

DataPancake creates these 7 versions for every attribute:

  • str — String values

  • int — Integer values

  • float — Floating-point numbers

  • bool — Boolean values

  • object — Nested objects

  • array_primitive — Arrays of primitives (strings, numbers, booleans)

  • array_object — Arrays of objects
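To make the seven categories concrete, here is a minimal sketch of how a parsed JSON value could be mapped to one of them. The function name `classify_value` and the rules for empty arrays, mixed arrays, and nulls are assumptions for illustration; DataPancake's actual detection logic is not described on this page.

```python
from typing import Any, Optional


def classify_value(value: Any) -> Optional[str]:
    """Map a parsed JSON value to one of the 7 version type suffixes."""
    # bool must be checked before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "str"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # Assumption: an array containing any object counts as array_object;
        # everything else (including an empty array) counts as array_primitive.
        if any(isinstance(item, dict) for item in value):
            return "array_object"
        return "array_primitive"
    # Assumption: nulls and other values map to no version in this sketch.
    return None


print(classify_value("condo"))         # str
print(classify_value([{"id": 1}]))     # array_object
```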



Proactive Version Creation

When an attribute is first discovered:

  1. All 7 polymorphic version records are created in core.datasource_attribute_polymorphic_version

  2. Each version is created with VERSION_STATUS = 'inactive'

  3. Only versions matching the discovered data type are set to VERSION_STATUS = 'active'

  4. Inactive versions remain ready for activation in future scans

Why: This enables seamless schema evolution. When a new data type appears, its version already exists and can be activated without creating new records.
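The steps above can be sketched in a few lines of Python. This is an illustration only: the function `create_version_rows` and the in-memory dictionaries are hypothetical stand-ins for rows in core.datasource_attribute_polymorphic_version, and only the POLYMORPHIC_ATTRIBUTE_NAME and VERSION_STATUS columns are modeled.

```python
from typing import Dict, List

VERSION_TYPES = (
    "str", "int", "float", "bool",
    "object", "array_primitive", "array_object",
)


def create_version_rows(attribute_name: str, discovered_type: str) -> List[Dict[str, str]]:
    """Create all 7 version rows, then activate the discovered type."""
    if discovered_type not in VERSION_TYPES:
        raise ValueError(f"unknown data type: {discovered_type!r}")

    # Steps 1-2: one row per type, every row starts as 'inactive'.
    rows_by_type = {
        version_type: {
            "POLYMORPHIC_ATTRIBUTE_NAME": f"{attribute_name}_{version_type}",
            "VERSION_STATUS": "inactive",
        }
        for version_type in VERSION_TYPES
    }

    # Step 3: only the version matching the discovered type becomes 'active'.
    rows_by_type[discovered_type]["VERSION_STATUS"] = "active"

    # Step 4: the remaining rows stay 'inactive' for future scans.
    return list(rows_by_type.values())


for row in create_version_rows("price", "int"):
    print(row["POLYMORPHIC_ATTRIBUTE_NAME"], row["VERSION_STATUS"])
```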


Naming Convention

Each version's name is held in POLYMORPHIC_ATTRIBUTE_NAME and follows these patterns:

  • Primitive types: {attribute_name}_{type} (e.g., status_str, price_int, rating_float, is_active_bool)

  • Object: {attribute_name}_object (e.g., address_object)

  • Arrays: {attribute_name}_array_{array_type} (e.g., tags_array_primitive, orders_array_object)
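As a small illustration of the patterns above, all three reduce to the attribute name joined to one of the seven type suffixes with an underscore. The helper `polymorphic_attribute_name` is a hypothetical name used only for this sketch.

```python
VERSION_TYPES = (
    "str", "int", "float", "bool",
    "object", "array_primitive", "array_object",
)


def polymorphic_attribute_name(attribute_name: str, version_type: str) -> str:
    """Build a version name such as status_str or tags_array_primitive."""
    if version_type not in VERSION_TYPES:
        raise ValueError(f"unknown version type: {version_type!r}")
    return f"{attribute_name}_{version_type}"


print(polymorphic_attribute_name("status", "str"))            # status_str
print(polymorphic_attribute_name("address", "object"))        # address_object
print(polymorphic_attribute_name("tags", "array_primitive"))  # tags_array_primitive
```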


Example: Polymorphic Version Creation

When DataPancake first discovers the attribute path property_type as a string, the following happens:

1. Initial discovery

  • DataPancake creates the attribute record for property_type.

  • Proactively creates all 7 polymorphic versions:

    • property_type_str → activated (matches discovered type)

    • property_type_int → inactive

    • property_type_float → inactive

    • property_type_bool → inactive

    • property_type_object → inactive

    • property_type_array_primitive → inactive

    • property_type_array_object → inactive

2. Later: data becomes polymorphic

If a new record contains property_type as an object:

  • DataPancake detects the new data type during scanning.

  • Activates the existing property_type_object version (it was already created).

  • Updates the version status from 'inactive' to 'active' and sets the version status date.

Result: Both versions are now active:

  • property_type_str (active)

  • property_type_object (active)
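The two-stage example can be replayed with a short simulation. This is a sketch under stated assumptions: the dictionaries stand in for version records, the helper names `new_versions` and `activate` are hypothetical, and leaving the status date empty until first activation is a choice made for this example, not documented behavior.

```python
from datetime import datetime, timezone

VERSION_TYPES = (
    "str", "int", "float", "bool",
    "object", "array_primitive", "array_object",
)


def new_versions(attribute_name):
    """Create all 7 versions up front, every one inactive."""
    return {
        f"{attribute_name}_{version_type}": {"status": "inactive", "status_date": None}
        for version_type in VERSION_TYPES
    }


def activate(versions, attribute_name, discovered_type):
    """Flip an existing version to active; no new record is created."""
    version = versions[f"{attribute_name}_{discovered_type}"]
    if version["status"] != "active":
        version["status"] = "active"
        version["status_date"] = datetime.now(timezone.utc)


versions = new_versions("property_type")
activate(versions, "property_type", "str")     # step 1: initial discovery
activate(versions, "property_type", "object")  # step 2: a later scan finds an object

active = sorted(name for name, v in versions.items() if v["status"] == "active")
print(active)  # ['property_type_object', 'property_type_str']
```

The record count stays at 7 throughout; only status fields change between scans.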


Version Status

VERSION_STATUS values:

  • 'active' — Data type found in data; included in code generation

  • 'inactive' — Version exists but not yet found; ready for activation

Status transitions:

  • Activated when corresponding data type is discovered in scans

  • May become inactive if data type disappears (schema evolution)

  • Only active versions are included in SQL code generation


Why Proactive Creation?

Semi-structured data often has polymorphic paths (same path, different types across records). By creating all 7 versions upfront:

  • Schema evolution is seamless: no new records are needed, just activation

  • Consistent metadata structure across all attributes

  • No reactive version creation to maintain when a new data type appears


Managing Versions

Review versions:

  • Check all 7 versions for each attribute in the UI

  • Monitor VERSION_STATUS_DATE to track schema evolution

  • Use INCLUDE_IN_CODE_GEN to control which active versions generate columns

Handling polymorphism:

  • Use transformation expressions to handle type variations

  • Set INCLUDE_IN_CODE_GEN = FALSE for versions you don't need

  • Monitor newly activated versions after scans
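The interaction between VERSION_STATUS and INCLUDE_IN_CODE_GEN can be pictured as a simple filter: a version yields a column only when it is active and its flag is set. The function `versions_for_code_gen` and the sample rows below are hypothetical; the flag values shown are chosen for illustration and are not documented defaults.

```python
from typing import Dict, List


def versions_for_code_gen(versions: List[Dict]) -> List[str]:
    """Return names of versions that are active and flagged for code generation."""
    return [
        v["POLYMORPHIC_ATTRIBUTE_NAME"]
        for v in versions
        if v["VERSION_STATUS"] == "active" and v["INCLUDE_IN_CODE_GEN"]
    ]


sample_versions = [
    {"POLYMORPHIC_ATTRIBUTE_NAME": "property_type_str",
     "VERSION_STATUS": "active", "INCLUDE_IN_CODE_GEN": True},
    # Active, but excluded by the user because the column is not needed.
    {"POLYMORPHIC_ATTRIBUTE_NAME": "property_type_object",
     "VERSION_STATUS": "active", "INCLUDE_IN_CODE_GEN": False},
    # Inactive versions never generate columns, whatever the flag says.
    {"POLYMORPHIC_ATTRIBUTE_NAME": "property_type_int",
     "VERSION_STATUS": "inactive", "INCLUDE_IN_CODE_GEN": True},
]

print(versions_for_code_gen(sample_versions))  # ['property_type_str']
```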
