# Schema Transformations

Configuration for transforming source schema during materialization process.

## Overview

Schema Transformations (also called Schema Consolidations) allow you to modify the source schema during the materialization process. Transformations are applied during scanning and code generation, affecting how attributes are discovered and how SQL code is generated.

***

## Transformation Types

The type of transformation to apply to the schema.

### Consolidate Schema

**Purpose:** Consolidate multiple attribute name variations into a single name using regular expressions.

**Configuration:**

* **Search Expression** - Regular expression to find attribute names
* **Replace Expression** - Replacement pattern for matched names
* **Attribute Path Search** - Optional path filter to limit transformation scope

### Convert Object to Array

**Purpose:** Convert object attributes to array attributes (transform single objects into arrays of objects).

**Configuration:**

* **Source Data Type** - `object` (required)
* **Transform Data Type** - `array` (result)
* **Attribute Path Search** - Path to object attribute to transform

### Transform Objects to Array

**Purpose:** Transform multiple object attributes into a single array (consolidate related objects into array structure).

**Configuration:**

* **Source Data Type** - `object`
* **Transform Data Type** - `array`
* **Attribute Path Search** - Path pattern to match objects

### Transform String to Array

**Purpose:** Convert string attributes containing delimited values into arrays.

**Configuration:**

* **Source Data Type** - `str` (required)
* **Transform Data Type** - `array` (result)
* **String List Type** - Parsing method (primitive, key\_value, fixed\_width)
* Additional parameters based on String List Type

***

## String List Types

When transforming strings to arrays, specify how the string should be parsed.

### Primitive

**Purpose:** Simple delimited list of primitive values.

**Configuration:**

* **Delimiter Value** - Character(s) separating values (e.g., `,`, `|`, `;`)

**Example:**

* Input: `"apple,banana,cherry"`, Delimiter: `,`, Output: `["apple", "banana", "cherry"]`

### Key Value

**Purpose:** String containing key-value pairs parsed into array of objects.

**Configuration:**

* **Key Value Separator** - Separator between key and value (e.g., `=`, `:`)
* **Delimiter Value** - Separator between pairs (e.g., `,`, `;`)

**Example:**

* Input: `"name=John,age=30,city=NYC"`, Key Value Separator: `=`, Delimiter: `,`, Output: `[{"name": "John"}, {"age": "30"}, {"city": "NYC"}]`

### Fixed Width

**Purpose:** String with fixed-width fields parsed based on character positions.

**Configuration:**

* **Field Widths** - Comma-separated list of field widths (e.g., `10,5,15`)

**Example:**

* Input: `"John 30NYC "`, Field Widths: `10,5,15`, Output: Parsed into array based on positions

***

## Regular Expression Patterns

Regular expressions are used for schema consolidation and attribute path filtering.

### Search Expression

**Purpose:** Regular expression pattern to match attribute names (used in Consolidate Schema transformations).

**Examples:**

* `^customer_?id$` - Matches `customer_id`, `customerid`
* `.*[Tt]imestamp.*` - Matches any attribute with "timestamp" or "Timestamp"
* `^[a-z]+_[a-z]+$` - Matches snake\_case attributes

### Replace Expression

**Purpose:** Replacement pattern for matched attribute names (used with Search Expression in Consolidate Schema).

**Examples:**

* Search: `^customer_?id$`, Replace: `customer_id`
* Search: `.*[Tt]imestamp.*`, Replace: `timestamp`
* Search: `(.*)_(.*)`, Replace: `$1_$2` (standardize format)

### Attribute Path Search

**Purpose:** Optional path filter to limit transformation scope (applies transformation only to matching attribute paths).

**Examples:**

* `customer.*` - Apply to all customer.\* attributes
* `.*\.items\[.*\]` - Apply to items array elements
* `root\.metadata\..*` - Apply to metadata.\* attributes

***

## Transformation Configuration

### Consolidation Name

**Purpose:** Descriptive name for the transformation (required field for all transformations).

### Source Data Type

**Purpose:** Original data type before transformation (required for type transformations).

**Options:** `str`, `int`, `float`, `bool`, `object`, `array`

**Usage:**

* Required for Convert Object to Array
* Required for Transform Objects to Array
* Required for Transform String to Array

### Transform Data Type

**Purpose:** Target data type after transformation (required for type transformations).

**Options:** `str`, `int`, `float`, `bool`, `object`, `array`

**Usage:** Specifies result type for transformations (must be compatible with transformation type).

### Additional Parameters

**Wrapper Value:** Characters wrapping values in strings (used for string parsing).

**Delimiter Value:** Character(s) separating values (used in Primitive and Key Value string parsing). Common: `,`, `|`, `;`, `\t`

**Key Value Separator:** Separator between keys and values (used in Key Value string parsing). Common: `=`, `:`, `->`

**Field Widths:** Comma-separated list of field widths (used in Fixed Width string parsing). Example: `10,5,15` for three fields

***

## Managing Transformations

**Adding Transformations:**

* Navigate to data source management
* Scroll to Schema Transformation section
* Click `+` symbol to add new row
* Configure transformation settings
* Click "Save Schema Transformations"

**Editing Transformations:**

* Find transformation in grid
* Edit fields directly in grid
* Changes are saved when "Save Schema Transformations" clicked
* System validates changes before saving

**Deleting Transformations:**

* Move mouse to leftmost column
* Checkbox appears for row selection
* Check box for transformation to delete
* Press Delete key
* Click "Save Schema Transformations" to confirm

{% hint style="warning" %}
Important: Transformations are applied in the order they appear in the grid. Order matters when multiple transformations affect the same attributes.
{% endhint %}

***

## Summary

**Transformation Types:**

* Consolidate Schema (name consolidation)
* Convert Object to Array
* Transform Objects to Array
* Transform String to Array

**String List Types:**

* Primitive (simple delimited)
* Key Value (key-value pairs)
* Fixed Width (position-based)

**Configuration:**

* Regular expression patterns
* Data type specifications
* Parsing parameters
* Attribute path filters

**Important:** Transformations are applied in the order they appear in the grid. Order matters when multiple transformations affect the same attributes.

For schema filtering, see [Schema Filters](https://docs.datapancake.com/core-concepts/data-sources/schema-filters). For source object configuration, see [Source Object Settings](https://docs.datapancake.com/core-concepts/data-sources/source-object-settings).
