# Attribute Discovery Process

## Discovery Process

1. **Recursively traverse records** - Every nested level is visited
2. **Track attribute paths** - Complete paths from root to leaf (e.g., `customer.contact.email`)
3. **Identify data types** - The source data type is inferred for every occurrence of a path
4. **Create all 7 polymorphic versions** - All 7 are created proactively the first time an attribute is discovered
5. **Activate matching versions** - Only the versions that match a discovered type are set to `VERSION_STATUS = 'active'`
6. **Create attribute records** - Records created in `core.datasource_attribute` and `core.datasource_attribute_polymorphic_version`
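
Steps 1 to 3 can be sketched as a small recursive walk over a parsed record. This is only an illustration under stated assumptions, not the actual implementation: the function names `infer_type` and `discover` are hypothetical, every type label other than `str` and `object` is an assumed name, and the `[]` suffix for array elements is borrowed from the path notation used in the example further down.

```python
from collections import defaultdict


def infer_type(value):
    """Map a parsed JSON value to a source type label (labels are assumed)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "str"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    raise TypeError(f"unsupported value: {value!r}")


def discover(record, prefix="", found=None):
    """Visit every nested level and collect the types seen at each full path."""
    if found is None:
        found = defaultdict(set)
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        found[path].add(infer_type(value))
        if isinstance(value, dict):
            discover(value, path, found)
        elif isinstance(value, list):
            # Descend into object elements; primitive elements have no children.
            for item in value:
                if isinstance(item, dict):
                    discover(item, f"{path}[]", found)
    return found


# The same path seen with two different types across two records.
found = defaultdict(set)
discover({"customer": {"address": "123 Main St"}}, found=found)
discover({"customer": {"address": {"street": "456 Oak Ave"}}}, found=found)
print(sorted(found["customer.address"]))  # ['object', 'str']
```

Collecting a set of types per path, rather than a single type, is what lets a later step decide which polymorphic versions to activate.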

***

## What Gets Discovered

* **All attribute paths** - Complete paths from root to leaf (e.g., `customer.contact.email`)
* **Nested objects** - Every level of object nesting
* **Nested arrays** - Both object arrays (`ARRAY_TYPE = 'object'`) and primitive arrays (`ARRAY_TYPE = 'primitive'`)
* **Embedded JSON** - JSON stored inside a string value is flagged with `HAS_EMBEDDED_CONTENT = TRUE` and parsed recursively
* **Polymorphic variations** - All data type variations for the same path (handled via polymorphic versions)
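
The array and embedded-JSON cases in the list above could be detected along the following lines. This is a hedged sketch: `classify_array` and `parse_embedded` are hypothetical helpers, and the handling of empty or mixed arrays is an assumption, since this document does not specify it.

```python
import json


def classify_array(values):
    """Return the ARRAY_TYPE label for a list.

    Assumption: any dict element makes the array 'object';
    empty arrays fall back to 'primitive'.
    """
    return "object" if any(isinstance(v, dict) for v in values) else "primitive"


def parse_embedded(value):
    """Return the parsed content if a string holds a JSON object or array, else None."""
    if not isinstance(value, str):
        return None
    text = value.strip()
    # Cheap pre-check so plain strings such as addresses skip the parser.
    if not text or text[0] not in "{[":
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


metadata = parse_embedded('{"source":"web","tags":["vip"]}')
has_embedded_content = metadata is not None
print(has_embedded_content)              # True
print(classify_array(metadata["tags"]))  # primitive
```

A string that fails to parse is simply treated as an ordinary `str` value, so malformed content does not interrupt discovery in this sketch.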

***

## Example: Polymorphic Discovery

**Record 1:**

```json
{
  "customer_id": "C001",
  "customer": {
    "address": "123 Main St, Anytown, ST 12345",
    "metadata": "{\"source\":\"web\",\"tags\":[\"vip\"]}"
  }
}
```

**Record 2:**

```json
{
  "customer_id": "C002",
  "customer": {
    "address": {"street": "456 Oak Ave", "city": "Springfield"},
    "metadata": "{\"source\":\"mobile\",\"tags\":[\"new\"]}"
  }
}
```

**Discovery Results:**

* `customer.address`:
  * Record 1: `str` → all 7 versions are created and `address_str` is activated
  * Record 2: `object` → `address_object` is activated (the version already exists from Record 1, so nothing new is created)
* `customer.metadata`:
  * `HAS_EMBEDDED_CONTENT = TRUE`
  * Recursively parsed to discover `metadata.source` and `metadata.tags[]` (a primitive array)
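
The create-then-activate behaviour for `customer.address` can be illustrated with a small in-memory model. Everything here is an assumption made for illustration: the document names only the `str` and `object` versions, so the other five entries in `ASSUMED_TYPES` are placeholders, the `'inactive'` status value is a guess (only `'active'` appears in this document), and `register_observation` is a hypothetical helper rather than a real API.

```python
# Hypothetical type list: only "str" and "object" are confirmed by the example.
ASSUMED_TYPES = ("str", "int", "float", "bool", "object", "array", "null")


def register_observation(versions, path, observed_type):
    """Create all versions on first sight of a path, then activate the match."""
    leaf = path.rsplit(".", 1)[-1]
    if path not in versions:
        # First discovery: every version is created up front, none active yet.
        versions[path] = {f"{leaf}_{t}": "inactive" for t in ASSUMED_TYPES}
    # Activation only flips the status of an already existing version.
    versions[path][f"{leaf}_{observed_type}"] = "active"
    return versions


versions = {}
register_observation(versions, "customer.address", "str")     # Record 1
register_observation(versions, "customer.address", "object")  # Record 2

active = sorted(
    name for name, status in versions["customer.address"].items()
    if status == "active"
)
print(len(versions["customer.address"]))  # 7
print(active)                             # ['address_object', 'address_str']
```

Because the versions are created up front, processing Record 2 only changes a status value and never adds a new version.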
