Attribute Consolidation

Use Attribute Consolidation to unify or standardize attributes from multiple data sources. This helps streamline paths, reduce duplication, and ensure consistent naming.

Overview

Attribute Consolidation allows you to configure how individual attributes extract values from source data when schema consolidations have been applied at the data source level. When schema transformations consolidate attribute paths during scanning (e.g., customerInfocustomer), the discovered attribute path changes, but the actual data still contains the original path. Attribute Consolidation provides the SQL expressions needed to extract values from the original consolidated paths during Dynamic Table code generation.

Attribute Consolidation works in conjunction with Schema Transformations configured at the data source level. While schema transformations modify paths during discovery, attribute consolidation ensures code generation can correctly extract values from the original source paths.


How It Works

Two-Level Consolidation Process

Level 1: Schema Consolidation (Data Source Level)

  • Applied during scanning/discovery

  • Modifies attribute paths using regular expressions

  • Example: data.customerInfodata.customer (removes Info)

Level 2: Attribute Consolidation (Attribute Level)

  • Applied during code generation

  • Provides SQL to extract from original paths

  • Example: Finds customerInfo in original data to get the value

The Problem It Solves

When schema consolidation transforms paths during discovery:

  • Discovered Path: data.customer (consolidated)

  • Actual Data Path: data.customerInfo (original)

  • Code Generation Needs: SQL to extract from data.customerInfo

Attribute Consolidation bridges this gap by providing the SQL expression to reference the original path.


Consolidation Settings

Field: SCHEMA_INSERT_REGULAR_EXPRESSION_SEARCH

Purpose:

  • Regular expression pattern that identifies the portion of the path that was replaced during schema consolidation

  • Used to locate where the consolidation occurred in the attribute path

  • Works with the SQL Expression to construct the extraction path

Requirements:

  • Must match the part of the path that was replaced by schema consolidation

  • Should be the literal string or regex pattern that was removed/replaced

  • Used during code generation to build GET_PATH expressions

Examples:

Simple String Match:

  • If schema consolidation replaced customerInfo with customer

  • Search: Info

  • Matches the Info portion that was removed

Regex Pattern:

  • If schema consolidation replaced various patterns

  • Search: Info|Details|Data

  • Matches any of these patterns

Path Segment:

  • If consolidation affected a specific path segment

  • Search: \.customerInfo\.

  • Matches the full segment including delimiters

Best Practice: Use the exact string or pattern that was replaced during schema consolidation. Test the pattern to ensure it matches correctly.

Schema Consolidation SQL Expression

Field: SCHEMA_INSERT_SQL_EXPRESSION

Purpose:

  • SQL expression that provides the original path segment or value to use in GET_PATH

  • Used to construct the path for extracting values from the original consolidated structure

  • Inserted into GET_PATH CONCAT expression during code generation

Requirements:

  • Must be valid SQL that evaluates to a string

  • Typically a literal string representing the original path segment

  • Can be a SQL expression if dynamic path construction is needed

Examples:

Simple Literal:

  • If original path had customerInfo

  • Expression: 'customerInfo'

  • Directly provides the original segment

Dynamic Expression:

  • If path varies based on conditions

  • Expression: CASE WHEN condition THEN 'customerInfo' ELSE 'customer' END

  • Provides conditional path selection

Pattern-Based:

  • If extracting from multiple possible paths

  • Expression: COALESCE('customerInfo', 'customerDetails', 'customer')

  • Tries multiple path variations

Best Practice: Start with simple literal strings. Use expressions only when paths vary dynamically.


How Consolidation is Applied

During Code Generation

When Attribute Consolidation is configured, the code generation process:

  1. Identifies Consolidation Point

    • Searches the attribute path for the SCHEMA_INSERT_REGULAR_EXPRESSION_SEARCH pattern

    • Locates where consolidation occurred in the path

  2. Constructs GET_PATH Expression

    • Uses GET_PATH with CONCAT to build the extraction path

    • Inserts the SCHEMA_INSERT_SQL_EXPRESSION at the consolidation point

    • Constructs path referencing original data structure

  3. Generates SQL

    • Produces SQL that extracts from original consolidated paths

    • Ensures values are correctly retrieved despite path consolidation

Generated SQL Pattern

Without Consolidation:

With Consolidation:

The consolidation SQL expression ('customerInfo') is inserted where the search pattern matched, allowing extraction from the original path.


Relationship to Schema Transformations

Schema Transformation Configuration

Attribute Consolidation works with Schema Transformations configured at the data source level:

Schema Transformation (Data Source):

  • Transformation Type: consolidate schema

  • Search Expression: customerInfo (finds paths to consolidate)

  • Replace Expression: customer (replaces with consolidated name)

  • Result: Paths like data.customerInfo become data.customer during discovery

Attribute Consolidation (Attribute):

  • Search: Info (finds where consolidation occurred)

  • SQL Expression: 'customerInfo' (provides original path segment)

  • Result: Code generation extracts from data.customerInfo in source data

Coordination Required

Matching Patterns:

  • Schema transformation search/replace must align with attribute consolidation search

  • Attribute consolidation search should match what was replaced

  • SQL expression should provide the original path segment

Example Alignment:

Schema Transformation:

  • Search: customerInfo

  • Replace: customer

  • Consolidates: data.customerInfodata.customer

Attribute Consolidation:

  • Search: Info (or customerInfo depending on implementation)

  • SQL Expression: 'customerInfo'

  • Extracts from: Original data.customerInfo path

Best Practice: Configure schema transformations first, then set attribute consolidation to match. Document the relationship between the two.


Use Cases

Unifying Multiple Data Sources

Scenario:

  • Multiple data sources have similar attributes with different naming

  • Source A: data.customerInfo

  • Source B: data.customerDetails

  • Source C: data.customer

Solution:

  1. Schema Transformation: Consolidate all to data.customer

  2. Attribute Consolidation: Configure per attribute to extract from original paths

  3. Result: Unified schema with correct value extraction

Handling Schema Evolution

Scenario:

  • Legacy data uses customerInfo, new data uses customer

  • Want to consolidate to customer for consistency

  • Need to extract from both original paths

Solution:

  1. Schema Transformation: Consolidate customerInfocustomer

  2. Attribute Consolidation: Use SQL expression to handle both paths

  3. SQL Expression: COALESCE('customerInfo', 'customer')

Normalizing Naming Conventions

Scenario:

  • Inconsistent naming across attributes

  • customerInfo, orderInfo, productInfo

  • Want to standardize to customer, order, product

Solution:

  1. Schema Transformation: Remove Info suffix across all

  2. Attribute Consolidation: Configure per attribute with original names

  3. Result: Normalized schema with correct extraction


Interaction with Other Attribute Settings

Null Value Expression

Relationship:

  • Null Value Expression applies after value extraction

  • Consolidation affects the path used for extraction

  • Null expression should reference the consolidated attribute name

Example:

  • Consolidated Path: data.customer

  • Null Expression: COALESCE({attribute_name}, 'Unknown')

  • Result: Uses consolidated path, null handling applies after extraction

Best Practice: Null expressions work with consolidated paths. Ensure they reference the correct consolidated attribute name.

Transformation Type and Expression

Relationship:

  • Transformation Expression applies after value extraction

  • Consolidation provides the extraction path

  • Transformation works on the extracted value

Order of Application:

  1. Attribute Consolidation extracts value from original path

  2. Transformation Expression transforms the extracted value

  3. Null Value Expression handles nulls (if not SQL Expression transformation)

Example:

  • Consolidation: Extracts from data.customerInfo

  • Transformation: UPPER({attribute_name})

  • Result: Value extracted, then uppercased

Best Practice: Transformation expressions work with consolidated values. Use {attribute_name} to reference the extracted value.

Foreign Keys

Impact:

  • Foreign keys reference attribute paths

  • Consolidation changes attribute paths

  • Foreign key definitions may need updates

Considerations:

  • Verify foreign key paths match consolidated paths

  • Update foreign key definitions if paths changed

  • Test relationships after consolidation

Best Practice: Review and update foreign key definitions after applying consolidations. Ensure relationships reference consolidated paths.

Use DateTime Format

Relationship:

  • DateTime format applies to extracted values

  • Consolidation affects extraction path

  • Format conversion happens after extraction

Example:

  • Consolidation: Extracts from data.customerInfo.createdDate

  • DateTime Format: Applied to extracted date value

  • Result: Date extracted from consolidated path, then formatted

Best Practice: DateTime formats work with consolidated extraction. Ensure date paths are correctly consolidated.

Attributes Grid Integration

Include In Code Gen:

  • Controls whether consolidated attribute appears in generated code

  • Consolidation affects how attribute is extracted

  • Must be enabled for consolidation to be applied

Alias:

  • Alias provides final column name

  • Consolidation affects extraction path, not column name

  • Use alias to clarify consolidated attribute purpose

Best Practice: Enable "Include In Code Gen" for attributes using consolidation. Use aliases to document consolidated attributes.


Configuration Workflow

Step 1: Configure Schema Transformation

  1. Navigate to Data Source management

  2. Configure Schema Transformation with type "consolidate schema"

  3. Set Search Expression to find paths to consolidate

  4. Set Replace Expression to define consolidated path

  5. Save Schema Transformation

Step 2: Perform Scan

  1. Run scan with schema transformation active

  2. Verify attributes are discovered with consolidated paths

  3. Review discovered attributes for correct consolidation

Step 3: Configure Attribute Consolidation

  1. Navigate to Pipeline Designer (Attributes management)

  2. Select attribute that was consolidated

  3. Configure Schema Consolidation Search:

    • Enter pattern matching what was replaced

    • Test pattern matches correctly

  4. Configure Schema Consolidation SQL Expression:

    • Enter SQL providing original path segment

    • Typically a literal string

  5. Save attribute configuration

Step 4: Generate Code

  1. Generate SQL code for Dynamic Tables

  2. Review generated SQL for correct GET_PATH expressions

  3. Verify consolidation expressions are applied

  4. Test generated code with sample data


Examples

Example 1: Simple String Replacement

Scenario:

  • Schema consolidation: customerInfocustomer

  • Original path: data.customerInfo.email

  • Consolidated path: data.customer.email

Attribute Consolidation Configuration:

  • Search: Info

  • SQL Expression: 'customerInfo'

Generated SQL:

Result: Extracts from original data.customerInfo.email path despite consolidated discovery.

Example 2: Multiple Path Variations

Scenario:

  • Schema consolidation: Various *Info → base name

  • Original paths: data.customerInfo, data.orderInfo, data.productInfo

  • Consolidated paths: data.customer, data.order, data.product

Attribute Consolidation Configuration:

  • Search: Info

  • SQL Expression: CONCAT(SPLIT_PART('{attribute_name}', '.', 2), 'Info')

Generated SQL (for customer):

Result: Dynamically constructs original path based on consolidated path.

Example 3: Conditional Path Selection

Scenario:

  • Schema consolidation: customerInfocustomer

  • Some records have customerInfo, others have customer

  • Need to handle both

Attribute Consolidation Configuration:

  • Search: Info

  • SQL Expression: COALESCE('customerInfo', 'customer')

Generated SQL:

Result: Tries customerInfo first, falls back to customer if not found.


Best Practices

Pattern Matching

Test Regular Expressions:

  • Validate search patterns with sample paths

  • Use regex testing tools before saving

  • Ensure patterns match intended paths only

  • Avoid overly broad patterns that match unintended paths

Specific Patterns:

  • Use specific patterns to avoid unintended matches

  • Include path delimiters when needed (e.g., \.customerInfo\.)

  • Test with actual attribute paths from your data

SQL Expression Design

Start Simple:

  • Begin with literal string expressions

  • Use 'originalPath' format for most cases

  • Add complexity only when needed

Dynamic Expressions:

  • Use SQL expressions when paths vary

  • Test expressions with sample data

  • Document complex expression logic

Documentation

Document Consolidation Logic:

  • Record why consolidation was applied

  • Document search patterns and SQL expressions

  • Maintain mapping of original to consolidated paths

  • Update documentation when consolidations change

Testing

Test Before Production:

  • Test consolidation with sample data

  • Verify extraction works correctly

  • Review generated SQL code

  • Validate with actual data queries

Incremental Approach:

  • Start with one attribute

  • Test and verify before expanding

  • Gradually apply to more attributes

  • Monitor for issues

Performance Considerations

Efficient Patterns:

  • Use simple string matching when possible

  • Avoid complex regex patterns in high-volume scenarios

  • Consider performance impact of dynamic SQL expressions

  • Test with production-scale data volumes

Optimization:

  • Consolidate only where necessary

  • Use targeted patterns, not broad matches

  • Monitor code generation performance

  • Optimize expressions for efficiency


Troubleshooting

Values Not Extracting Correctly

Symptoms:

  • Generated SQL doesn't extract values

  • NULL values in materialized tables

  • Incorrect data in consolidated columns

Solutions:

  • Verify search pattern matches consolidated path segment

  • Check SQL expression provides correct original path

  • Test GET_PATH expression manually

  • Review schema transformation configuration

  • Verify original path exists in source data

Pattern Not Matching

Symptoms:

  • Consolidation not applied in generated code

  • Original path extraction used instead

  • No GET_PATH expression generated

Solutions:

  • Verify search pattern syntax is correct

  • Test pattern with actual attribute paths

  • Check for case sensitivity issues

  • Ensure pattern matches what was replaced

  • Review schema transformation replace expression

Incorrect SQL Generation

Symptoms:

  • GET_PATH expression malformed

  • SQL syntax errors in generated code

  • Expression doesn't evaluate correctly

Solutions:

  • Verify SQL expression syntax

  • Test expression independently

  • Check for proper quoting/escaping

  • Review CONCAT expression structure

  • Validate with sample data


Summary

Purpose:

  • Extract values from original paths after schema consolidation

  • Bridge gap between consolidated discovery and original data structure

  • Enable correct value extraction during code generation

Configuration:

  • Schema Consolidation Search (regex pattern)

  • Schema Consolidation SQL Expression (SQL for original path)

Relationship:

  • Works with Schema Transformations at data source level

  • Applied during Dynamic Table code generation

  • Uses GET_PATH with CONCAT to construct extraction paths

Best Practice: Configure schema transformations first, then set attribute consolidation to match. Test thoroughly and document consolidation logic.

For information on schema transformations, see Schema Transformations. For attribute metadata, see Attributes: Metadata.

Last updated

Was this helpful?