Attribute Discovery Process
How DataPancake discovers attributes during scanning, including polymorphic detection and recursive parsing of stringified JSON.
How Attributes Are Discovered
Important: DataPancake does not wait to detect polymorphism. It accounts for all 7 possible polymorphic variations up front: when an attribute is first discovered, all 7 versions are created immediately, and only the matching version(s) are activated. The system is therefore ready for any polymorphic variation that appears in later records.
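The create-everything-then-activate idea can be sketched in a few lines of Python. This is an illustration only, not DataPancake's implementation: the class AttributeVersions and its methods are hypothetical, and of the seven type suffixes only "string" and "object" appear in this page (as address_string and address_object); the other five are placeholders.

```python
# Hypothetical type suffixes. Only "string" and "object" are named in the
# documentation; the remaining five are placeholders for this sketch.
VERSION_TYPES = ("string", "object", "array", "int", "float", "boolean", "null")


class AttributeVersions:
    """Holds one inactive version per type, created when the attribute is first seen."""

    def __init__(self, name: str) -> None:
        self.name = name
        # All versions exist from the start; none is active yet.
        self.active = {f"{name}_{t}": False for t in VERSION_TYPES}

    def activate(self, type_name: str) -> None:
        """Switch on the existing version for an observed type."""
        key = f"{self.name}_{type_name}"
        if key not in self.active:
            raise ValueError(f"unknown type: {type_name}")
        self.active[key] = True

    def active_versions(self) -> list:
        """Return the names of the versions that have been activated."""
        return [key for key, on in self.active.items() if on]


address = AttributeVersions("address")
address.activate("string")   # a record where address is a string
address.activate("object")   # a later record where address is an object
print(address.active_versions())
```

Seeing a new type for the same path only flips a flag on a version that already exists, so the number of versions stays at seven however many variations show up.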
What Gets Discovered
All attribute paths: complete paths from root to leaf (e.g., customer.contact.email)
Nested objects: every level of object nesting
Nested arrays: both object arrays and primitive arrays
Embedded JSON: JSON stored as strings (stringified/escaped JSON)
Polymorphic variations: all data type variations for the same path
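As a rough illustration of path discovery, the sketch below walks a parsed record and collects every path together with the set of types seen at that path. The function names, the "[]" notation for array elements, and the two sample records are assumptions made for this example; they are not taken from DataPancake.

```python
from typing import Any, Dict, Optional, Set


def json_type(value: Any) -> str:
    """Map a value produced by json.loads to a JSON type name."""
    if isinstance(value, bool):  # bool is a subclass of int, so test it first
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    return "null"


def discover_paths(
    value: Any, path: str = "", found: Optional[Dict[str, Set[str]]] = None
) -> Dict[str, Set[str]]:
    """Record each attribute path and the types observed at it."""
    if found is None:
        found = {}
    if path:  # the root itself has no path
        found.setdefault(path, set()).add(json_type(value))
    if isinstance(value, dict):
        for key, child in value.items():
            discover_paths(child, f"{path}.{key}" if path else key, found)
    elif isinstance(value, list):
        for item in value:
            discover_paths(item, f"{path}[]", found)
    return found


# Invented sample records: address is a string in one and an object in the other.
records = [
    {"customer": {"address": "123 Main St"}, "orders": [{"total": 100.5}]},
    {"customer": {"address": {"city": "Anytown"}}},
]
found: Dict[str, Set[str]] = {}
for record in records:
    discover_paths(record, found=found)
print(sorted(found["customer.address"]))  # ['object', 'string']
```

Keeping a set of types per path is what makes polymorphism visible: a path with more than one entry has appeared with more than one data type.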
Example Discovery
Given multiple records with polymorphic data and stringified JSON:
Record 1:
{
  "customer_id": "C001",
  "customer": {
    "name": "John Doe",
    "address": "123 Main St, Anytown, ST 12345",
    "contact": {
      "email": "[email protected]",
      "phone": "555-1234"
    },
    "metadata": "{\"source\":\"web\",\"campaign\":\"summer2024\",\"tags\":[\"vip\",\"premium\"]}"
  },
  "orders": [
    {"order_id": "O001", "total": 100.50},
    {"order_id": "O002", "total": 250.75}
  ]
}
Record 2:
What DataPancake Discovers:
Standard Attributes:
customer_id (string)
customer (object)
customer.name (string)
customer.contact (object)
customer.contact.email (string)
customer.contact.phone (string)
orders (array)
orders[0].order_id (string)
orders[0].total (number)
Polymorphic Address:
customer.address appears as:
Record 1: string → address_string activated
Record 2: object → address_object activated (existing version, no new record needed)
Stringified JSON:
customer.metadata contains stringified JSON. DataPancake:
Detects embedded content (HAS_EMBEDDED_CONTENT = TRUE)
Recursively parses to discover: metadata.source, metadata.campaign, metadata.tags[]
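The detect-then-recurse behavior for stringified JSON could look roughly like the following. It is a minimal sketch under stated assumptions: parse_embedded and expand_embedded are hypothetical helpers, and the boolean they return only loosely mirrors the HAS_EMBEDDED_CONTENT flag. The sketch uses only Python's standard json module.

```python
import json
from typing import Any, Optional, Tuple


def parse_embedded(value: Any) -> Optional[Any]:
    """Return the parsed object or array if value is a string holding JSON, else None."""
    if not isinstance(value, str):
        return None
    text = value.strip()
    # Only treat strings that look like a JSON object or array as embedded content.
    if not text or text[0] not in "{[":
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


def expand_embedded(value: Any) -> Tuple[Any, bool]:
    """Replace stringified JSON with its parsed form, recursively.

    Returns the expanded value and whether any embedded content was found.
    """
    parsed = parse_embedded(value)
    if parsed is not None:
        expanded, _ = expand_embedded(parsed)  # embedded JSON may itself contain more
        return expanded, True
    if isinstance(value, dict):
        out, flag = {}, False
        for key, child in value.items():
            out[key], child_flag = expand_embedded(child)
            flag = flag or child_flag
        return out, flag
    if isinstance(value, list):
        results = [expand_embedded(item) for item in value]
        return [r[0] for r in results], any(r[1] for r in results)
    return value, False


metadata = '{"source":"web","campaign":"summer2024","tags":["vip","premium"]}'
expanded, has_embedded = expand_embedded({"metadata": metadata})
print(has_embedded)                  # True
print(expanded["metadata"]["tags"])  # ['vip', 'premium']
```

Once the string is expanded, the nested keys can be discovered like any other object, which is how paths such as metadata.source and metadata.tags[] become visible.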
Result: Both address_string and address_object are active. All 7 versions were created up front, so polymorphism is handled by activating existing versions rather than creating new ones.