Polymorphic Versions
How DataPancake proactively creates all 7 polymorphic versions for every attribute upfront, then activates only the versions that match discovered data types.
What Are Polymorphic Versions?
Polymorphic versions represent different data type variations of the same attribute path. DataPancake creates all 7 polymorphic versions for an attribute as soon as the attribute is first discovered, regardless of which type was actually found in the data.
The 7 Polymorphic Variations
DataPancake recognizes that any attribute path can potentially appear in 7 different ways across different records:
str (String): Text values
int (Integer): Whole numbers
float (Decimal): Floating-point numbers
bool (Boolean): True/false values
object: Nested objects/structures
array_primitive: Arrays containing primitive values (strings, numbers, booleans)
array_object: Arrays containing objects/structures
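As a rough illustration of how a parsed JSON value could be sorted into one of these seven variations, here is a minimal Python sketch. The function name `classify_value` and the handling of empty or mixed arrays are assumptions made for this example; they are not taken from DataPancake's scanner.

```python
from typing import Any

# The seven type labels listed above, in the same order.
POLYMORPHIC_TYPES = (
    "str", "int", "float", "bool",
    "object", "array_primitive", "array_object",
)


def classify_value(value: Any) -> str:
    """Map a parsed JSON value to one of the seven labels (illustrative only)."""
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "str"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # Assumption: an array containing at least one object counts as
        # array_object; anything else, including an empty array, counts
        # as array_primitive.
        if any(isinstance(item, dict) for item in value):
            return "array_object"
        return "array_primitive"
    # JSON null (None) and other values are outside the seven variations.
    raise ValueError(f"unsupported value: {value!r}")
```

For instance, `classify_value({"city": "Austin"})` returns `"object"` and `classify_value(["a", "b"])` returns `"array_primitive"`.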
Proactive Version Creation
How it works:
When an attribute is first discovered, DataPancake immediately creates all 7 polymorphic version records for that attribute.
Each version is initially created with status = 'inactive'.
Only the version(s) that match the actually discovered data type are activated (status = 'active').
Inactive versions remain in the system, ready to be activated if that data type appears in future scans.
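The steps above can be sketched in a few lines of Python. This is a simplified model, not DataPancake's implementation: the `PolymorphicVersion` class, the `create_versions` function, and the in-memory list are hypothetical, and the type labels follow the suffixes used in the version names on this page.

```python
from dataclasses import dataclass
from typing import List

# Type labels, using the suffixes that appear in the version names.
VERSION_TYPES = (
    "string", "int", "float", "bool",
    "object", "array_primitive", "array_object",
)


@dataclass
class PolymorphicVersion:
    """Hypothetical record for one polymorphic version of an attribute."""
    attribute: str
    version_type: str
    status: str = "inactive"  # every version starts out inactive


def create_versions(attribute: str, discovered_type: str) -> List[PolymorphicVersion]:
    """Create all 7 versions, then activate the one matching the discovered type."""
    if discovered_type not in VERSION_TYPES:
        raise ValueError(f"unknown type: {discovered_type}")
    versions = [PolymorphicVersion(attribute, t) for t in VERSION_TYPES]
    for version in versions:
        if version.version_type == discovered_type:
            version.status = "active"
    return versions


versions = create_versions("status", "string")
active_types = [v.version_type for v in versions if v.status == "active"]
```

After this runs, `versions` holds seven records and `active_types` contains only `"string"`.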
Rationale
Future-proofing: If an attribute becomes polymorphic later (e.g., status changes from string to number), the version already exists and can be activated.
Consistent structure: All attributes have the same potential polymorphic structure, making the system predictable.
Schema drift handling: When schema changes occur, new polymorphic versions can be activated without structural changes.
Zero technical debt: The system is always ready for any polymorphic variation.
How Polymorphic Versions Are Named
Each polymorphic version gets a unique name based on its data type:
Primitive types: {attribute_name}_{type}
Examples: status_string, price_int, rating_float, is_active_bool
Object type: {attribute_name}_object
Example: address_object
Array types: {attribute_name}_array_{array_type}
Examples: tags_array_primitive, orders_array_object
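The three naming patterns can be expressed as a small helper. The function `version_name` and the constant names below are hypothetical and exist only to illustrate the patterns; the suffixes are the ones shown in the examples above.

```python
PRIMITIVE_TYPES = ("string", "int", "float", "bool")
ARRAY_TYPES = ("primitive", "object")


def version_name(attribute_name: str, version_type: str) -> str:
    """Build a polymorphic version name from an attribute name and a type label."""
    if version_type in PRIMITIVE_TYPES:
        # Pattern: {attribute_name}_{type}
        return f"{attribute_name}_{version_type}"
    if version_type == "object":
        # Pattern: {attribute_name}_object
        return f"{attribute_name}_object"
    if version_type.startswith("array_"):
        array_type = version_type[len("array_"):]
        if array_type in ARRAY_TYPES:
            # Pattern: {attribute_name}_array_{array_type}
            return f"{attribute_name}_array_{array_type}"
    raise ValueError(f"unknown version type: {version_type}")
```

For example, `version_name("price", "int")` gives `price_int` and `version_name("orders", "array_object")` gives `orders_array_object`.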
Example: Polymorphic Version Creation
When DataPancake first discovers an attribute path property_type as a string:
What actually happens:
Initial discovery
DataPancake creates the attribute record for property_type.
Proactively creates all 7 polymorphic versions:
property_type_string → activated (matches discovered type)
property_type_int → inactive
property_type_float → inactive
property_type_bool → inactive
property_type_object → inactive
property_type_array_primitive → inactive
property_type_array_object → inactive
Later: data becomes polymorphic
If a new record contains property_type as an object:
DataPancake detects the new data type during scanning.
Activates the existing property_type_object version (it was already created).
Updates the version status from 'inactive' to 'active' and sets the version status date.
Result: Both versions are now active:
property_type_string (active)
property_type_object (active)
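The property_type walkthrough can be reproduced with a small in-memory model. Everything here is an assumption for illustration: the dictionary layout, the `new_attribute` and `activate` helpers, and the `status_date` field name are hypothetical, and setting a status date on the initial activation is a guess rather than documented behavior.

```python
from datetime import datetime, timezone

VERSION_TYPES = (
    "string", "int", "float", "bool",
    "object", "array_primitive", "array_object",
)


def activate(versions: dict, name: str, discovered_type: str) -> None:
    """Flip an existing version to active and record when that happened."""
    version = versions[f"{name}_{discovered_type}"]
    if version["status"] != "active":
        version["status"] = "active"
        version["status_date"] = datetime.now(timezone.utc)


def new_attribute(name: str, discovered_type: str) -> dict:
    """Create all 7 versions as inactive, then activate the discovered one."""
    versions = {
        f"{name}_{t}": {"status": "inactive", "status_date": None}
        for t in VERSION_TYPES
    }
    activate(versions, name, discovered_type)
    return versions


# Initial discovery: property_type appears as a string.
versions = new_attribute("property_type", "string")

# Later scan: property_type appears as an object; no new version is created.
activate(versions, "property_type", "object")

active = sorted(n for n, v in versions.items() if v["status"] == "active")
print(active)  # ['property_type_object', 'property_type_string']
```

The second call to `activate` only changes the status of a record that already exists, which is the point of creating all versions upfront.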
Version Status
Each polymorphic version has a status:
Active — This data type has been found in the data and is included in code generation
Inactive — This data type hasn't been found yet, but the version exists and is ready to be activated
Status management:
Versions are activated when their corresponding data type is discovered.
Versions can become inactive if that data type disappears from the data (schema evolution).
Only active versions are included in SQL code generation.
Inactive versions remain in the system for potential future activation.
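The status rules above could be modeled as follows. This sketch assumes statuses are reconciled against the set of types seen in the most recent scan, which may not match how or when DataPancake actually deactivates a version; `reconcile_statuses` and `versions_for_codegen` are hypothetical names.

```python
from typing import Dict, List, Set


def reconcile_statuses(versions: Dict[str, str], observed: Set[str]) -> Dict[str, str]:
    """Return updated statuses: active if the type was observed, else inactive."""
    return {
        vtype: ("active" if vtype in observed else "inactive")
        for vtype in versions
    }


def versions_for_codegen(versions: Dict[str, str]) -> List[str]:
    """Only active versions are passed on to SQL code generation."""
    return [vtype for vtype, status in versions.items() if status == "active"]


statuses = {"string": "active", "int": "inactive", "object": "active"}

# Latest scan: the object form has disappeared and an integer form has appeared.
statuses = reconcile_statuses(statuses, observed={"string", "int"})
print(versions_for_codegen(statuses))  # ['string', 'int']
```

The object version is not deleted in this model; it stays in the dictionary as inactive and can be activated again later.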
Why Polymorphic Versions Exist
Semi-structured data is flexible, and the same path can contain:
Different primitive types: status might be a string in some records and a number in others
Objects vs. primitives: address might be a string in some records and an object in others
Arrays vs. primitives: tags might be a string in some records and an array in others
Different array types: An array might contain objects in some cases and primitives in others
By proactively creating all possible versions, DataPancake ensures:
No surprises: The system is always ready for any polymorphic variation
Smooth schema evolution: New variations can be activated without structural changes
Consistent metadata: All attributes follow the same polymorphic structure
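To make the kinds of variation listed above concrete, the following sketch scans two made-up records and reports which top-level keys appear with more than one type. The records, the `observed_types` helper, and the use of Python type names (str, int, dict, list) instead of DataPancake's seven labels are all assumptions for this example.

```python
from collections import defaultdict
from typing import Any, Dict, List, Set


def observed_types(records: List[Dict[str, Any]]) -> Dict[str, Set[str]]:
    """Collect the set of Python type names seen for each top-level key."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for record in records:
        for key, value in record.items():
            seen[key].add(type(value).__name__)
    return dict(seen)


# Made-up sample data: the same paths carry different types across records.
records = [
    {"id": 1, "status": "open", "address": "1 Main St", "tags": "new"},
    {"id": 2, "status": 2, "address": {"city": "Austin"}, "tags": ["new", "sale"]},
]

types_by_key = observed_types(records)
polymorphic = sorted(k for k, v in types_by_key.items() if len(v) > 1)
print(polymorphic)  # ['address', 'status', 'tags']
```

Here `id` is an integer in both records, so it is the only key that is not polymorphic.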
Managing Polymorphic Versions
Reviewing Versions
Handling Polymorphism
Use transformation expressions to handle type variations.
Consider data quality implications.
Document polymorphic behavior.
Version Management Best Practices
Keep active only the versions you need.
Monitor version status dates for schema changes.
Review inactive versions periodically.
Understand that inactive versions are ready for future activation.