# Attribute Consolidation

## Overview

When [Schema Transformations](https://docs.datapancake.com/core-concepts/data-sources/schema-transformations) consolidate attribute paths during scanning (e.g., `customerInfo` → `customer`), the discovered path changes but source data retains the original path. Attribute Consolidation provides SQL expressions to extract values from the original paths during Dynamic Table code generation.

**Supported:** Only for `Discovered` and `Schema` attribute types. Virtual attributes do not support consolidation.

**Required:** Both `SCHEMA_INSERT_REGULAR_EXPRESSION_SEARCH` and `SCHEMA_INSERT_SQL_EXPRESSION` must be non-empty for consolidation to be applied.

***

## Configuration Fields

### Schema Consolidation Search

**Field:** `SCHEMA_INSERT_REGULAR_EXPRESSION_SEARCH`

**Behavior:** Uses Python `find()` (string search, not regex) to locate the first occurrence of the search value in the attribute path. Despite the field name, this is a literal string match, not a regular expression.

**Requirements:**

* Must be a literal string that appears in the consolidated attribute path
* Matches the portion that was replaced during schema consolidation
* Case-sensitive

**Example:** If schema transformation replaced `customerInfo` with `customer`, use `Info` to find where consolidation occurred.

### Schema Consolidation SQL Expression

**Field:** `SCHEMA_INSERT_SQL_EXPRESSION`

**Behavior:** SQL expression inserted into `GET_PATH` with `CONCAT` at the location where the search value was found. Evaluates to a string representing the original path segment.

**Requirements:**

* Valid SQL expression that evaluates to a string
* Typically a literal string (e.g., `'customerInfo'`)
* Can use dynamic SQL expressions for conditional paths

**Example:** `'customerInfo'` provides the original path segment that was consolidated.

***

## How It Works

### Code Generation Process

1. **Search:** Locates first occurrence of `SCHEMA_INSERT_REGULAR_EXPRESSION_SEARCH` in the attribute path using string search
2. **Insert:** Constructs `GET_PATH` with `CONCAT`, inserting `SCHEMA_INSERT_SQL_EXPRESSION` at the match location
3. **Generate:** Produces SQL that extracts from the original source path

### Generated SQL Pattern

**Without Consolidation:**

```sql
json_data:"data":"customer"::VARCHAR
```

**With Consolidation:**

```sql
GET_PATH(json_data, CONCAT('"data"."', 'customerInfo', '."'))::VARCHAR
```

The SQL expression is inserted where the search value was found, reconstructing the original path for extraction.

**Note:** Works with both standard JSON paths and embedded JSON paths (paths containing `embedded_json`).

***

## Relationship to Schema Transformations

Attribute Consolidation works with [Schema Transformations](https://docs.datapancake.com/core-concepts/data-sources/schema-transformations) configured at the data source level:

**Schema Transformation (Data Source Level):**

* **Type:** `consolidate schema`
* **Search Expression:** Regex pattern to find paths (e.g., `customerInfo`)
* **Replace Expression:** Replacement pattern (e.g., `customer`)
* **Result:** Paths like `data.customerInfo` become `data.customer` during discovery

**Attribute Consolidation (Attribute Level):**

* **Search:** String to locate in consolidated path (e.g., `Info`)
* **SQL Expression:** Original path segment (e.g., `'customerInfo'`)
* **Result:** Code generation extracts from `data.customerInfo` in source data

**Coordination:** The search value should match the portion that was replaced. The SQL expression should provide the original path segment.

**Example Alignment:**

* Schema Transformation: Search `customerInfo`, Replace `customer` → consolidates `data.customerInfo` → `data.customer`
* Attribute Consolidation: Search `Info`, SQL Expression `'customerInfo'` → extracts from original `data.customerInfo`

***

## Interaction with Other Settings

**Order of Application:**

1. Attribute Consolidation extracts value from original path
2. Transformation Expression (if configured) transforms the extracted value
3. Null Value Expression handles nulls (if transformation type is not "SQL Expression")

**Notes:**

* Null Value Expression and Transformation Expression reference the extracted value using `{attribute_name}`
* Foreign keys reference consolidated attribute paths, not original paths
* DateTime format conversion applies after extraction
* `INCLUDE_IN_CODE_GEN` must be enabled for consolidation to be applied

***

## Examples

### Example 1: Simple String Replacement

**Schema Transformation:**

* Search: `customerInfo`, Replace: `customer`
* Consolidates: `data.customerInfo.email` → `data.customer.email`

**Attribute Consolidation:**

* Search: `Info`
* SQL Expression: `'customerInfo'`

**Generated SQL:**

```sql
GET_PATH(json_data, CONCAT('"data"."', 'customerInfo', '."email"'))::VARCHAR AS customer_email
```

### Example 2: Dynamic Path Construction

**Schema Transformation:**

* Removes `Info` suffix from multiple attributes

**Attribute Consolidation:**

* Search: `Info`
* SQL Expression: `CONCAT(SPLIT_PART('{attribute_name}', '.', 2), 'Info')`

**Result:** Dynamically reconstructs original path based on consolidated path.

### Example 3: Conditional Path Selection

**Attribute Consolidation:**

* Search: `Info`
* SQL Expression: `COALESCE('customerInfo', 'customer')`

**Result:** Tries original path first, falls back to consolidated path if not found.

***

## Troubleshooting

**Consolidation not applied:**

* Verify both fields are non-empty
* Check search value exists in the consolidated path (case-sensitive)
* Ensure attribute type is `Discovered` or `Schema` (not `Virtual`)

**NULL values in materialized tables:**

* Verify original path exists in source data
* Test SQL expression independently
* Review schema transformation configuration

**SQL syntax errors:**

* Verify SQL expression syntax is valid
* Check proper quoting/escaping in SQL expression
* Review generated SQL in code output

***

## Summary

Attribute Consolidation bridges the gap between consolidated discovery paths and original source data paths. Configure [Schema Transformations](https://docs.datapancake.com/core-concepts/data-sources/schema-transformations) first, then set attribute consolidation to match. The search uses string matching (not regex) to locate where consolidation occurred, and the SQL expression provides the original path segment for extraction.

**For more information:** [Schema Transformations](https://docs.datapancake.com/core-concepts/data-sources/schema-transformations) | [Attributes: Metadata](https://docs.datapancake.com/core-concepts/attribute-metadata)
