# Polymorphic Versions

## What Are Polymorphic Versions?

Polymorphic versions represent the different data types that can appear at the same attribute path. DataPancake creates **all 7 polymorphic versions** for every attribute when the attribute is first discovered, regardless of which type was discovered.

***

## The 7 Polymorphic Versions

DataPancake creates these 7 versions for every attribute:

* `str` — String values
* `int` — Integer values
* `float` — Floating-point numbers
* `bool` — Boolean values
* `object` — Nested objects
* `array_primitive` — Arrays of primitives (strings, numbers, booleans)
* `array_object` — Arrays of objects
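
As a rough illustration of how a parsed JSON value could map onto these 7 types, here is a minimal Python sketch. The `classify` function is hypothetical and not part of DataPancake. This page does not say how mixed arrays, empty arrays, or nulls are handled, so the rules used for them below are assumptions.

```python
import json

def classify(value):
    """Hypothetical helper: map a parsed JSON value to one of the 7 version types."""
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "str"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # Assumption: any object element makes the array an array_object;
        # everything else (including an empty array) is array_primitive.
        if any(isinstance(item, dict) for item in value):
            return "array_object"
        return "array_primitive"
    # Assumption: null carries no type information.
    return None

record = json.loads('{"status": "open", "price": 12, "tags": ["a", "b"]}')
types = {key: classify(val) for key, val in record.items()}
```

Here `types` maps each top-level key in the sample record to the version type that its value would fall under.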

***

## Proactive Version Creation

**When an attribute is first discovered:**

1. All 7 polymorphic version records are created in `core.datasource_attribute_polymorphic_version`
2. Each version starts with `VERSION_STATUS = 'inactive'`
3. The versions matching the discovered data type are then set to `VERSION_STATUS = 'active'`
4. Inactive versions remain ready for activation in future scans

**Why:** This enables seamless schema evolution. When a new data type appears, the version already exists and can be activated without creating new records.
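
The creation steps above can be sketched in Python as follows. This is an in-memory illustration under stated assumptions, not DataPancake's implementation. The `create_versions` helper and the dictionary layout are hypothetical. Only the column names `POLYMORPHIC_ATTRIBUTE_NAME` and `VERSION_STATUS` come from this page.

```python
VERSION_TYPES = ("str", "int", "float", "bool",
                 "object", "array_primitive", "array_object")

def create_versions(attribute_name, discovered_type):
    """Hypothetical helper: build all 7 version records for a new attribute."""
    if discovered_type not in VERSION_TYPES:
        raise ValueError(f"unknown type: {discovered_type}")
    records = []
    for version_type in VERSION_TYPES:
        # Every version is created; only the discovered type ends up active.
        status = "active" if version_type == discovered_type else "inactive"
        records.append({
            "POLYMORPHIC_ATTRIBUTE_NAME": f"{attribute_name}_{version_type}",
            "VERSION_STATUS": status,
        })
    return records

records = create_versions("price", "int")
active = [r["POLYMORPHIC_ATTRIBUTE_NAME"]
          for r in records if r["VERSION_STATUS"] == "active"]
```

After this call, `records` holds seven entries and `active` contains only `price_int`.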

***

## Naming Convention

Polymorphic versions are named in `POLYMORPHIC_ATTRIBUTE_NAME`:

* Primitive types: `{attribute_name}_{type}` (e.g., `status_str`, `price_int`, `rating_float`, `is_active_bool`)
* Object: `{attribute_name}_object` (e.g., `address_object`)
* Arrays: `{attribute_name}_array_{array_type}` (e.g., `tags_array_primitive`, `orders_array_object`)
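
The three naming patterns above can be expressed as one small function. This is a sketch for illustration only. The `polymorphic_name` helper and its parameters are hypothetical and do not correspond to any DataPancake API.

```python
PRIMITIVE_TYPES = ("str", "int", "float", "bool")
ARRAY_TYPES = ("primitive", "object")

def polymorphic_name(attribute_name, kind, array_type=None):
    """Hypothetical helper that mirrors the three naming patterns."""
    if kind in PRIMITIVE_TYPES:
        return f"{attribute_name}_{kind}"
    if kind == "object":
        return f"{attribute_name}_object"
    if kind == "array" and array_type in ARRAY_TYPES:
        return f"{attribute_name}_array_{array_type}"
    raise ValueError(f"unsupported kind: {kind!r} / {array_type!r}")

names = [
    polymorphic_name("status", "str"),
    polymorphic_name("address", "object"),
    polymorphic_name("tags", "array", "primitive"),
    polymorphic_name("orders", "array", "object"),
]
```

The resulting `names` list matches the examples given in the bullets: `status_str`, `address_object`, `tags_array_primitive`, and `orders_array_object`.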

***

## Example: Polymorphic Version Creation

When DataPancake first discovers the attribute path `property_type` as a string, the following happens:

{% stepper %}
{% step %}

#### Initial discovery

* DataPancake creates the attribute record for `property_type`.
* Proactively creates all 7 polymorphic versions:
  * `property_type_str` → **activated** (matches discovered type)
  * `property_type_int` → inactive
  * `property_type_float` → inactive
  * `property_type_bool` → inactive
  * `property_type_object` → inactive
  * `property_type_array_primitive` → inactive
  * `property_type_array_object` → inactive

{% endstep %}

{% step %}

#### Later: data becomes polymorphic

If a new record contains `property_type` as an object:

```json
{"property_type": {"usage": "Commercial", "sq_ft": 20000}}
```

* DataPancake detects the new data type during scanning.
* Activates the existing `property_type_object` version (it was already created).
* Updates the version status from `'inactive'` to `'active'` and sets the version status date.

Result: Both versions are now active:

* `property_type_str` (active)
* `property_type_object` (active)

{% endstep %}
{% endstepper %}
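
The two steps in this example can be simulated with a short Python sketch. It is a simplified model, not DataPancake's scanner. The `discover` function is hypothetical, the timestamps are made up, and leaving `VERSION_STATUS_DATE` empty for versions that were never activated is an assumption.

```python
from datetime import datetime, timezone

VERSION_TYPES = ("str", "int", "float", "bool",
                 "object", "array_primitive", "array_object")

def discover(versions, attribute_name, seen_type, now):
    """Hypothetical scan step: create versions on first sight, then activate."""
    if not versions:
        # First discovery: create all 7 records as inactive.
        for version_type in VERSION_TYPES:
            versions[f"{attribute_name}_{version_type}"] = {
                "VERSION_STATUS": "inactive",
                "VERSION_STATUS_DATE": None,  # assumption: unset until activated
            }
    record = versions[f"{attribute_name}_{seen_type}"]
    if record["VERSION_STATUS"] == "inactive":
        # Activation updates the existing record; no new record is created.
        record["VERSION_STATUS"] = "active"
        record["VERSION_STATUS_DATE"] = now
    return versions

versions = {}
# Illustrative timestamps only.
discover(versions, "property_type", "str", datetime(2024, 1, 1, tzinfo=timezone.utc))
discover(versions, "property_type", "object", datetime(2024, 6, 1, tzinfo=timezone.utc))

active = sorted(name for name, rec in versions.items()
                if rec["VERSION_STATUS"] == "active")
```

After both calls there are still exactly seven records, and `active` lists `property_type_object` and `property_type_str`.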

***

## Version Status

**`VERSION_STATUS` values:**

* `'active'` — Data type found in data; included in code generation
* `'inactive'` — Version exists but not yet found; ready for activation

**Status transitions:**

* A version is activated when its data type is discovered in a scan
* A version may become inactive if its data type disappears (schema evolution)
* Only `active` versions are included in SQL code generation

***

## Why Proactive Creation?

Semi-structured data often has polymorphic paths (same path, different types across records). By creating all 7 versions upfront:

* Schema evolution is seamless—no new records needed, just activation
* Consistent metadata structure across all attributes
* Zero technical debt from reactive version creation

***

## Managing Versions

**Review versions:**

* Check all 7 versions for each attribute in the UI
* Monitor `VERSION_STATUS_DATE` to track schema evolution
* Use `INCLUDE_IN_CODE_GEN` to control which active versions generate columns

**Handling polymorphism:**

* Use transformation expressions to handle type variations
* Set `INCLUDE_IN_CODE_GEN = FALSE` for versions you don't need
* Monitor newly activated versions after scans
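
The interaction between `VERSION_STATUS` and `INCLUDE_IN_CODE_GEN` described above can be sketched as a simple filter. The sample records are invented for illustration, and `versions_for_code_gen` is a hypothetical helper rather than a DataPancake function.

```python
def versions_for_code_gen(records):
    """Hypothetical helper: keep versions that are active and flagged for code generation."""
    return [
        rec["POLYMORPHIC_ATTRIBUTE_NAME"]
        for rec in records
        if rec["VERSION_STATUS"] == "active" and rec["INCLUDE_IN_CODE_GEN"]
    ]

# Invented sample data; the flag values shown are not documented defaults.
records = [
    {"POLYMORPHIC_ATTRIBUTE_NAME": "property_type_str",
     "VERSION_STATUS": "active", "INCLUDE_IN_CODE_GEN": True},
    {"POLYMORPHIC_ATTRIBUTE_NAME": "property_type_object",
     "VERSION_STATUS": "active", "INCLUDE_IN_CODE_GEN": False},
    {"POLYMORPHIC_ATTRIBUTE_NAME": "property_type_int",
     "VERSION_STATUS": "inactive", "INCLUDE_IN_CODE_GEN": True},
]

selected = versions_for_code_gen(records)
```

In this sample only `property_type_str` is selected: the object version is active but excluded by its flag, and the int version is inactive.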


