Release Notes

DataPancake™ V1.46 Release Notes

Value-Based Pricing Model for Billing

New features have been added to the DataPancake offering. The billing model for DataPancake has been updated to create monthly billable events based on the features used and the number of attributes contained in a particular data source.

Features offered are:

  • Schema Discovery (formerly known as Schema Summary) - Free

  • Attribute Metadata Management (formerly known as Schema Analysis) - Paid

  • SQL Code Generation (formerly known as Extract, Relate, and Flatten) - Paid

  • Data Dictionary Builder - Paid

  • Security Policy Integration - Paid

  • Cortex Analyst Semantic Model Code Generation - Paid

Feature                        | Price per Attribute
Attribute Metadata Management  | $0.50
SQL Code Generation            | $0.90
Data Dictionary Builder        | $0.30
Security Policy Integration    | $0.75
Semantic Model Code Generation | $0.50

Data Source Overview

New columns have been added to the data source overview grid to indicate which services have been enabled for each data source.

Scans in Process will now show a more accurate estimate of the amount of time it will take to complete a scan based on the baseline scan settings. The baseline scan settings can be reset in the Data Source screen. The next scan completed will recompute the average number of records processed based on the number of threads used during the scan. Each subsequent scan will use the baseline scan settings to estimate the time needed to complete the scan.

Navigation

A new left-hand navigation menu has been added to better organize the available features in the application.

Quick Start Script Builders

Three new script builders have been created:

  • Individual Data Sources

  • Multiple Data Sources

  • Schema Drift Alert

Data Source

DataPancake Services

Users will now have the option to enable specific services for each data source. Services replace the option to select a specific Product Tier.

Column Data Type

Users will now have the option to select a source column data type. Currently the only option is Variant. In a future release users will have the option to select String.

Sample Schema Data

Users can supply a single document to represent the schema for a specific data source. This sample document can be in either JSON or XML format. For example, users can create a sample XML document from an existing XSD file or a sample JSON document from an existing Avro schema. Once the sample has been supplied, users can configure a scan to create attribute metadata from the sample instead of through data discovery.

Output Object Settings

Users can choose the type of output object. Current options include Dynamic Table or Table.

Dynamic Table Metadata

Users can now configure a data source to generate the code necessary to track Dynamic Table metadata, including the insert and last-updated datetimes. This allows users to see when a Dynamic Table record was inserted or modified. The generated code will create a new table to store the metadata, a merge statement to initialize the table, a stream for the root-level Dynamic Table to track changes, and a task to keep the table updated on the user's configured schedule.
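
The pattern of the generated objects is sketched below. This is an illustration only; the object names (orders_dt, orders_dt_meta, my_wh), the key column, and the schedule are placeholders, not the exact code DataPancake produces.

```sql
-- Metadata table tracking insert and last-update datetimes for the root Dynamic Table.
CREATE TABLE IF NOT EXISTS orders_dt_meta (
    record_id       VARCHAR,
    inserted_at     TIMESTAMP_NTZ,
    last_updated_at TIMESTAMP_NTZ
);

-- Merge statement to initialize the metadata table from the current Dynamic Table contents.
MERGE INTO orders_dt_meta m
USING (SELECT record_id FROM orders_dt) s
    ON m.record_id = s.record_id
WHEN NOT MATCHED THEN
    INSERT (record_id, inserted_at, last_updated_at)
    VALUES (s.record_id, CURRENT_TIMESTAMP(), CURRENT_TIMESTAMP());

-- Stream on the root-level Dynamic Table to track changes.
CREATE STREAM IF NOT EXISTS orders_dt_stream ON DYNAMIC TABLE orders_dt;

-- Task that applies the stream changes on the user's configured schedule.
CREATE TASK IF NOT EXISTS orders_dt_meta_task
    WAREHOUSE = my_wh
    SCHEDULE  = '60 MINUTE'
WHEN SYSTEM$STREAM_HAS_DATA('ORDERS_DT_STREAM')
AS
MERGE INTO orders_dt_meta m
USING orders_dt_stream s
    ON m.record_id = s.record_id
WHEN MATCHED THEN
    UPDATE SET m.last_updated_at = CURRENT_TIMESTAMP()
WHEN NOT MATCHED THEN
    INSERT (record_id, inserted_at, last_updated_at)
    VALUES (s.record_id, CURRENT_TIMESTAMP(), CURRENT_TIMESTAMP());
```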

Baseline Scan Settings

Users can see the average number of records processed for a specific thread count, calculated from prior scans. These numbers can be recalculated by resetting the values to 0; the next scan will update the calculations.

Required Fields

When creating or updating a data source, the user can now see which fields are required if a specific piece of information has not been supplied.

Scan Configuration

Users can now configure the type of attribute creation a scan will use. The Discover attribute creation type will generate attribute metadata based on a discovery of the data located in the database object configured in the data source. The Schema attribute creation type will generate attribute metadata based on the single sample document configured for the data source. Multiple scan configurations can be created for a data source using both attribute creation types to allow a user to compare a schema against the data in a data source.

Scan Data Source

No changes

Data Source Attributes

Users can now configure whether an attribute is unique, is a primary key, or contains only enum values.

Users can now modify the status of an attribute from active to inactive. Inactive attributes are not included in the code generation process. Users can now also see the difference between attributes discovered in the data source vs attributes created from the data source’s sample document.

The Data Governance tab has been renamed to “Semantic Layer - Security Policy”.

Arrays

Users can now configure relationship information for each nested array including:

  • Relationship Name

  • Relationship Description

  • Relationship Type

  • Join Type

This relationship information will be used as part of the semantic model code generation process.

Data Source SQL Code Generation

If a data source has been configured to create Dynamic Table metadata, code will be generated to create the database objects needed to maintain metadata tracking the insert and last-update datetimes of each row in the root-level Dynamic Table.

Data Dictionary Builder

The data dictionary/glossary/synonym builder uses Cortex AI to assist users in building a data dictionary for each data source. Descriptions can be created for the data source, descriptions and synonyms can be created for each nested array, and descriptions, synonyms, and sample values can be created for each attribute. Users can choose which LLM they want to use to generate responses.

Semantic Model Code Generation

Users can create semantic model YAML files for use with Cortex Analyst. Information generated using the data dictionary builder will be included in the generation of all attribute descriptions and synonyms.

Users can choose which attributes to include and can configure additional properties such as:

  • W Question Category (Dimension, Time Dimension, Facts, Metric, Filter)

  • Sample Value (multiple comma separated values)

  • Enum Values (designates that the attribute contains only enum values)

  • Semantic Model Description (in addition to the glossary definition)

  • Semantic Model Expression

  • Cortex Search Service Name

  • Cortex Search Database

  • Cortex Search Schema

  • Cortex Search Column

Additional sections generated include:

  • Relationships

  • Verified Queries

  • Custom Instructions

Worksheet Commands

No Changes

DataPancake V1.38 Release Notes

Manage Data Source

Default Refresh_Mode to Incremental

Defaulted the Refresh_Mode to Incremental for all new data sources with a product tier of Extract, Relate, and Flatten.

Support for New Data Formats

Added support for new data formats including Avro, Parquet, ORC, and XML.

Case Sensitivity Checkbox

Added new checkbox to configure whether generated dynamic tables and views enable case sensitivity for column names.

Kafka Deduplication Expression

For data sources that are flattening data from a Kafka topic, the unique identifier has been replaced with a deduplication expression, allowing users to configure the SQL used to deduplicate Kafka messages.

Manage Scan Configuration

No Changes

Scan Data Source

Datetime Formats

Bug Fix: Updated datetime formats for time zone offsets. Previously, if a string included a plus sign followed by the offset value, or the time was followed by a “Z” representing UTC, the format string used in the generated code would be incorrectly identified. This issue has been resolved.
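
As a hedged illustration (the timestamps and format strings below are examples, not taken from the release), both shapes now resolve to working conversions:

```sql
-- Offset written with a plus sign: an explicit format using TZH:TZM parses it correctly.
SELECT TRY_TO_TIMESTAMP_TZ('2024-03-01T10:15:30+05:00', 'YYYY-MM-DD"T"HH24:MI:SSTZH:TZM');

-- Trailing "Z" for UTC, shown here with Snowflake's AUTO inference.
SELECT TRY_TO_TIMESTAMP_TZ('2024-03-01T10:15:30Z');
```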

Support for XML in Variant Columns

Pancake now supports the scanning and discovery of XML data stored in a Variant Snowflake column.

Data Source Attributes

Foreign Keys in Row Access Policy

In the foreign keys configuration the user now has the option to include the foreign key in the Row Access Policy.

Virtual Attributes with Custom SQL

Virtual Attributes: The ability to create a virtual attribute with a custom SQL expression is now available. SQL expressions can also refer to other flattened columns and can be added as foreign keys to nested arrays.

Data Source SQL Generation

Dynamic Tables for XML in Variant Columns

DataPancake now supports the generation of dynamic tables to extract, relate, and flatten XML data stored in a Variant column.

Kafka Metadata Flattening

If the ‘Include Stream Message Metadata’ checkbox is checked, Kafka metadata will be flattened and included in the root dynamic table as well as all nested dynamic tables for each array. This data will be available in all views generated as part of the data governance semantic layer.

Case Sensitivity for Column Names

If the ‘Use Case Sensitivity’ checkbox is checked, each column generated for dynamic tables and views will be enclosed in double quotes. This allows for special characters in column names.
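
For illustration (the column names are made up), the generated SQL would emit quoted identifiers such as:

```sql
SELECT
    src:orderId::VARCHAR       AS "orderId",          -- mixed case preserved
    src:"line-item.count"::INT AS "line-item.count"   -- special characters allowed
FROM raw_orders;
```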

Bug Fix: Exclude Unchecked Arrays

Bug Fix: If the ‘Include in Code Gen’ checkbox is unchecked for any array the corresponding view will no longer be generated.

Custom Deduplication SQL for Kafka

For data sources that are flattening a Kafka topic, the deduplication SQL will now use the new data source property called ‘Deduplication SQL Expression’. This allows the user to tailor the SQL to their specific needs, including the choice between the RANK() or ROW_NUMBER() functions.
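
A minimal sketch of such an expression, assuming the standard Kafka connector columns RECORD_CONTENT and RECORD_METADATA and a made-up primary key path order_id:

```sql
-- Keep only the most recent message per key; RANK() could be swapped in for ROW_NUMBER().
SELECT *
FROM kafka_landing_table
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY record_content:order_id
    ORDER BY record_metadata:CreateTime::NUMBER DESC
) = 1;
```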

Root Table Select Prefix

The data source value configured for the ‘Root Dynamic Table Select Prefix’ will now be available in the root level view in the Data Governance Semantic Layer. All columns configured in this value will need to be aliased with an ‘as <alias_name>’.
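
For example (all names here are hypothetical), a prefix value of t.load_timestamp as load_timestamp, t.source_file as source_file would surface in the generated root-level statement roughly as:

```sql
SELECT
    t.load_timestamp  AS load_timestamp,   -- from the Root Dynamic Table Select Prefix
    t.source_file     AS source_file,      -- from the Root Dynamic Table Select Prefix
    t.src:id::VARCHAR AS id                -- flattened attribute columns follow
FROM raw_table t;
```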

Foreign Keys in Semantic Layer Views

Foreign keys configured for arrays will now be added to the Row Access Policy definitions in the Data Governance Semantic Layer views if the user has configured the foreign key to be included in the Row Access Policy.

Virtual Attributes in Code Generation

Virtual attributes are now included in the code generation of dynamic tables and views. Virtual attributes are defined after the discovered attributes so that virtual attribute SQL expressions can refer to flattened columns created from discovered attributes.
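
As a rough sketch (names are illustrative), a virtual attribute's expression can lean on the flattened columns defined before it, since Snowflake allows a select-list expression to reference an alias defined earlier in the same list:

```sql
SELECT
    src:first_name::VARCHAR        AS first_name,   -- discovered attribute
    src:last_name::VARCHAR         AS last_name,    -- discovered attribute
    first_name || ' ' || last_name AS full_name     -- virtual attribute with a custom SQL expression
FROM raw_table;
```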

Other Notes

Rebranding

Pancake has been rebranded as DataPancake to better articulate its purpose.

Pancake V1.32 Release Notes

Data Source Overview

Filter default update to Active

Defaults the search to active but the user can still filter for statuses of inactive, deleted, or all. Fixed an issue that prevented the user from filtering on the record status and one of the other filters.

Manage Data Source

Dynamic table deployment database and schema name

Allows for dynamic tables to be deployed to a different database and schema than where the raw data is stored.

Dynamic Table row level policy name

Allows for the configuration of a row level policy for the root dynamic table. This will be used in the view code generation process which is coming in the next release.

Root Table Select Prefix

Allows for the selection of other columns in the table which contains the variant data source column. This can be a comma separated list of columns.

Schema Consolidation (Search/Replace)

Allows for the configuration of schema consolidation when the number of attributes in the source data is too large (5K+). A search-and-replace regex value allows the discovered schema to be reduced. This feature is used in conjunction with the new attribute metadata, Consolidation Search and Consolidation Insert SQL.

Schema Filter

Allows for the configuration of schema filtering. This feature will not prevent schema from being scanned and discovered but it will prevent discovered schema from being included in the code generation process. The filter is created through a regex value.

Manage Scan Configuration

No Updates

Scan Data Source

Configurable schema consolidation

The scan process will consolidate the discovered schema based on the data source consolidation configuration. This consolidation process will reduce the number of attributes based on the search and replace regex values configured by the user.

Configurable schema filtering

The scan process will designate the attribute’s record status as ‘inactive’ for any attributes that meet one or more of the filter criteria. All inactive attributes will be excluded from the code generation process.

Enhanced Datetime inference

Additional ISO formats are now recognized successfully.

Data Source Attributes

Attribute level masking policy

Allows for the configuration of a masking policy for each attribute. This data will be used in the view code generation process. The user will also have the ability to configure the masking policy parameters if the masking policy is conditional.

Attribute row level policy inclusion

Allows for the configuration of the row level policy. The user can determine which columns should be included in the row level policy configured with the data source or the array.

Attribute Path Insert Search/Replace

Allows for the configuration of a search and insert value that works together with data source consolidation. The search value will find a value in the path, and a SQL expression will be inserted immediately after it to create a concatenated value used in conjunction with the GET_PATH function. This feature enables the generated dynamic table SQL statement to access the actual JSON path that was previously consolidated.
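
A hedged sketch of the idea, assuming paths such as products.<dynamic key>.price were consolidated during the scan; the search value ('products.'), the insert expression (k.value::STRING || '.'), and the FLATTEN over OBJECT_KEYS are all hypothetical here:

```sql
-- Rebuild the real JSON path at query time and read it with GET_PATH
-- (assumes GET_PATH accepts the computed path string, as the feature description implies).
SELECT
    GET_PATH(src, 'products.' || k.value::STRING || '.price') AS product_price
FROM raw_table,
     LATERAL FLATTEN(input => OBJECT_KEYS(src:products)) k;
```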

Arrays Grid (with include in code gen and row level policy name)

Allows for the configuration of array level settings including a row level policy name, a dynamic table alias name, and a check box to determine if the array is included in the code generation process.

Attribute cluster inclusion

Allows for the configuration of the dynamic table cluster clause. If the column is configured to be included in the cluster it will be added to the dynamic table cluster statement for the dynamic table the column belongs to.
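
For illustration (object names and parameters are placeholders), a column flagged for cluster inclusion lands in the CLUSTER BY clause of the generated Dynamic Table:

```sql
CREATE OR REPLACE DYNAMIC TABLE orders_dt
    TARGET_LAG = '1 hour'
    WAREHOUSE  = my_wh
    CLUSTER BY (order_date)
AS
SELECT
    src:order_id::VARCHAR AS order_id,
    src:order_date::DATE  AS order_date
FROM raw_orders;
```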

Attribute Record Status

If attributes have been filtered they will be configured with an attribute status of ‘inactive’. All other attributes will have a record status of ‘active’. The attribute filter on the Data Attributes screen now supports the ability to filter on this status.

Dynamic Table SQL Generation

Primitive Array - Multiple Data Types

The code generation process now supports primitive arrays that contain multiple data types. The primitive attribute can also now support transformations.

Auto alias creation for datetime attributes

If the polymorphic version of an attribute is determined to be a date/time/datetime value, an alias will be created automatically to append the Snowflake data type in lieu of “_str”.

Schema Consolidation Search/Insert

The code generation process will use the attribute’s consolidation search and insert values to access the json path that was consolidated during the scan process based on the consolidations created for the data source.

Dynamic table create name

Array dynamic tables will use the table alias name, if one is provided.

Dynamic table names will use the deployment database and schema if one was provided.

If a dynamic table name includes special characters the entire name will be enclosed in double quotes.

Root table selected columns prefix

The code generation process will include the select prefix configured with the data source as part of the root level dynamic table select statement.

Filtered attributes are not included in the code generation process

The code generation process will exclude any attributes with an attribute_record_status equal to ‘inactive’.

Generate Array dynamic tables based on Include in Code Gen

The code generation process will exclude dynamic tables for arrays that have been configured to be excluded.

Worksheet Commands

No Updates

Pancake V1.30 Release Notes

Data Source Overview

Filter default update to Active

Defaults the search to active but the user can still filter for statuses of inactive, deleted, or all. Fixed an issue that prevented the user from filtering on the record status and one of the other filters.

Manage Data Source

Dynamic table deployment database and schema name

Allows for dynamic tables to be deployed to a different database and schema than where the raw data is stored.

Dynamic Table row level policy name

Allows for the configuration of a row level policy for the root dynamic table. This will be used in the view code generation process which is coming in the next release.

Root Table Select Prefix

Allows for the selection of other columns in the table which contains the variant data source column. This can be a comma separated list of columns.

Schema Consolidation (Search/Replace)

Allows for the configuration of schema consolidation when the number of attributes in the source data is too large (5K+). A search-and-replace regex value allows the discovered schema to be reduced. This feature is used in conjunction with the new attribute metadata, Consolidation Search and Consolidation Insert SQL.

Schema Filter

Allows for the configuration of schema filtering. This feature will not prevent schema from being scanned and discovered but it will prevent discovered schema from being included in the code generation process. The filter is created through a regex value.

Manage Scan Configuration

No Updates

Scan Data Source

Configurable schema consolidation

Configurable schema filtering

Bug Fix - Update to Datetime inference

Bug Fix - Issue with multiple procedure calls

Data Source Attributes

Attribute level masking policy

Allows for the configuration of a masking policy for each attribute. This data will be used in the view code generation process. The user will also have the ability to configure the masking policy parameters if the masking policy is conditional.

Attribute row level policy inclusion

Allows for the configuration of the row level policy. The user can determine which columns should be included in the row level policy configured with the data source or the array.

Attribute Path Insert Search/Replace

Allows for the configuration of a search and insert value that works together with data source consolidation. The search value will find a value in the path, and a SQL expression will be inserted immediately after it to create a concatenated value used in conjunction with the GET_PATH function. This feature enables the generated dynamic table SQL statement to access the actual JSON path that was previously consolidated.

Arrays Grid (with include in code gen and row level policy name)

Allows for the configuration of array level settings including a row level policy name, a dynamic table alias name, and a check box to determine if the array is included in the code generation process.

Attribute cluster inclusion

Allows for the configuration of the dynamic table cluster clause. If the column is configured to be included in the cluster it will be added to the dynamic table cluster statement for the dynamic table the column belongs to.

Attribute Record Status

If attributes have been filtered they will be configured with an attribute status of ‘inactive’. All other attributes will have a record status of ‘active’. The attribute filter on the Data Attributes screen now supports the ability to filter on this status.

Dynamic Table SQL Generation

Primitive Array - Multiple Data Types

The code generation process now supports primitive arrays that contain multiple data types. The primitive attribute can also now support transformations.

Auto alias creation for datetime attributes

If the polymorphic version of an attribute is determined to be a date/time/datetime value, an alias will be created automatically to append the Snowflake data type in lieu of “_str”.

Schema Consolidation Search/Insert

The code generation process will use the attribute’s consolidation search and insert values to access the json path that was consolidated during the scan process based on the consolidations created for the data source.

Dynamic table create name

Array dynamic tables will now use the table alias name, if one was provided, and will also use the deployment database and schema if one was provided.

Root table selected columns prefix

The code generation process will include the select prefix configured with the data source as part of the root level dynamic table select statement.

Filtered attributes are not included in the code generation process

The code generation process will exclude any attributes with an attribute_record_status equal to ‘inactive’.

Generate Array dynamic tables based on Include in Code Gen

Creation of cluster by clause

Create views for each dynamic table with column level masking policies

Creation of Row policy clause for root dynamic table

Worksheet Commands

No Updates

Pancake V1.29 Release Notes

Data Source Overview

No Updates

Manage Data Source

No Updates

Manage Scan Configuration

Configuration Settings Calculator

Allows a user to gather diagnostic information about the data source to retrieve data used in the Calculator’s variables. Based on the warehouse size chosen, the calculator will produce values for the Number of Procedure Calls, Record Count Per Procedure Call, and Record Count Per Thread Worker Process that can be used for the required configuration settings. The calculator helps the user avoid processing too much information at one time and either running out of memory or exceeding Snowflake’s maximum timeout of 60 minutes per procedure call.

Record Count Per Thread Worker Process - Position Moved

The “Record Count Per Thread Worker Process” has been moved below the “Record Count Per Procedure Call” textbox.

Bug Fix - Checkbox Value Reset

Fixed a bug where sometimes the checkbox values for Enable Schedule and Auto Code Generate were not getting reset after the creation of a new scan configuration.

Scan Data Source

Default Value for Precision and Scale

During the discovery of an attribute’s float polymorphic version, the metadata will use a default precision of 38 and a default scale of 10. The user can still modify these values post scan.

Datetime Inference for ISO date update

ISO datetime values with time zones and seconds, where the offset is represented with a “+”, will now be recognized and set to use the TIMESTAMP_TZ Snowflake data type.

Enhanced error handling for large datasets

Out of memory errors can occur when using a warehouse size that is too small for the amount of data being processed. The new error handling will report these types of messages and make it easier to determine when to use a larger warehouse size.

Data Source Attributes

New attribute metadata fields to determine whether a custom datetime format should be used

Allows for the configuration of whether a custom datetime format should be used as part of the SQL code generation process. If the user disables the datetime format, the standard TRY_TO_? function will be used in the Dynamic Table column definition without specifying a datetime format, resulting in the use of AUTO inference. More information about AUTO datetime inference is available in the Snowflake documentation: https://docs.snowflake.com/en/sql-reference/data-types-datetime#date-and-time-formats

Datetime Format now supports multiple formats

Allows for a user to configure the datetime format of an attribute by providing a comma separated list of datetime formats when an attribute in a data source may have values that use multiple datetime formats.

Rearrangement of grid columns

Columns have been rearranged in the editable attributes grid to make it easier to access the columns that can be updated.

Updated Attribute Filter

The filter now supports filtering on the source data type or the Snowflake data type.

Dynamic Table SQL Generation

Numeric Value Flattening Code for Invalid Characters

If an attribute has embedded or stringified JSON that has a polymorphic version using an int or float data type, the generated SQL will now convert the variant value to VARCHAR and then use a TRY_TO_? function to determine if the value can be converted to the appropriate data type. Some values are recognized as a Snowflake INTEGER or DECIMAL by the TYPEOF function but cannot be converted to the respective Snowflake data type.
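
A hedged sketch of the generated pattern (the path and the precision/scale are illustrative):

```sql
-- Cast the variant value to VARCHAR first, then attempt the numeric conversion;
-- values that TYPEOF reports as numeric but that cannot be converted yield NULL.
SELECT
    TRY_TO_NUMBER(TO_VARCHAR(GET_PATH(src, 'payload.amount')), 38, 10) AS amount
FROM raw_table;
```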

Bug Fix - Embedded JSON with “null” values

The generated code will now attempt to parse the embedded or stringified JSON using the TRY_PARSE_JSON function. If the embedded JSON contains the value “null”, the resulting value for that column will be NULL and will not cause a Dynamic Table deployment execution error.
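
Roughly, the generated code now follows this shape (the column and path names are examples):

```sql
-- A literal "null" string parses to NULL instead of failing the Dynamic Table deployment.
SELECT
    GET_PATH(TRY_PARSE_JSON(src:details::VARCHAR), 'shipping.carrier')::VARCHAR AS shipping_carrier
FROM raw_table;
```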

Multiple datetime formats for a single column

The code generation for a string using a datetime inference containing multiple formats will use the COALESCE function combined with the TRY_TO_? function, using the first format that produces a non-null value. If all datetime formats result in a NULL value, the value returned for that specific row will be NULL.
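
A minimal sketch of the pattern for a column configured with two formats (the formats and names are examples):

```sql
SELECT
    COALESCE(
        TRY_TO_TIMESTAMP_NTZ(src:created_at::VARCHAR, 'YYYY-MM-DD"T"HH24:MI:SS'),
        TRY_TO_TIMESTAMP_NTZ(src:created_at::VARCHAR, 'MM/DD/YYYY HH24:MI:SS')
    ) AS created_at_timestamp
FROM raw_table;
```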

User configurable use of a custom datetime format

If the user disables the use of the datetime format, the standard TRY_TO_? function will be used without specifying a datetime format, resulting in the use of the AUTO datetime inference.

Worksheet Commands

No Updates

Pancake V1.27 Release Notes

Dynamic Table SQL Generation

New Attribute Polymorphic Version Field - Null Value Expression

Allows for the configuration of a custom SQL expression to be used in place of the NULL keyword if the value in a row is null or does not match the expected polymorphic column's data type.

New Attribute Polymorphic Version Field - Polymorphic Column Name Alias

Allows for the configuration of an alias for a polymorphic column name which will be used in the construction of the Dynamic Table Create Statement instead of the default polymorphic attribute name.

New Attribute Polymorphic Version Fields - Transformation Type and Transformation Expression

These new fields allow the user to override the default generated code for a specific attribute polymorphic version with a custom SQL expression. Users can use "{attribute_name}" as a textual placeholder in the expression for the json attribute data. This placeholder will be replaced in the code gen process with the appropriate reference to the attribute's path even if the attribute represents embedded or stringified JSON.
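
For example, a hypothetical Transformation Expression of UPPER(TRIM({attribute_name}::VARCHAR)) might be expanded by code generation into something like this (the path is made up):

```sql
SELECT
    UPPER(TRIM(src:customer:name::VARCHAR)) AS customer_name
FROM raw_table;
```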

New Data Source Field - Deduplicate Messages

This new field in the Source Stream Settings of the Data Source screen allows users to determine if the generated Dynamic Table SQL code should add the necessary SQL to deduplicate messages streamed into Snowflake from Kafka and only include the most recent version of a record based on the primary key and the created datetime of the message. This deduplication process requires the user to provide the primary key(s) of the message's JSON schema. If this checkbox is not checked all records will be included when the Dynamic Table is deployed.

New Data Source Field - Nested Dynamic Table Optional Parameters

This new field allows users to configure optional parameters, such as Refresh_Mode, to be included in the Create Dynamic Table statement for all dynamic tables created for nested arrays. The available optional parameters are documented at https://docs.snowflake.com/en/sql-reference/sql/create-dynamic-table#optional-parameters. This field is separate from the renamed Root Dynamic Table Optional Parameters field, which stores optional parameters to be included in the generation of the root-level dynamic table.

Scanning and Discovery

New Scan Configuration Field - Record Count Per Thread Process

This new field allows for the configuration of the number of records passed to each thread worker process. The size of the warehouse determines the amount of memory available to the scanning process. Each thread can process a batch of records in parallel with other threads. The projected memory utilization for the scanning process can be calculated by multiplying the average size of the JSON value by the number in the "Record Count Per Thread Process" field and by the number of threads being used.
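
For example, with an average JSON value size of 50 KB, 1,000 records per thread, and 8 threads, the projected memory utilization would be roughly 50 KB × 1,000 × 8 ≈ 400 MB.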

Scanning Process - Memory Usage Enhancement

After processing the data for each thread, the memory used to store the data is released. This significantly improves performance and memory usage for larger datasets.

Scanning Process - Error Handling Enhancement

If the scanning process is interrupted due to an “Out of memory” error, an exception will be recorded and the status of the scan will be set to "Failed".

Scanning Process - Datetime Format Discovery for ISO Datetimes values with nanoseconds

In the previous version of Pancake, an ISO datetime with nanoseconds was not correctly identified as a datetime value. This issue has been resolved and these types of datetime strings will now be recognized with the correct Snowflake data type and datetime format.

Data Source Set Up

Data Source Screen Enhancement - Dropdowns for Database, Schema, and Objects

Replaced the textboxes for Database, Schema, and Database Object with dropdowns that show which objects Pancake has access to. Only the objects for which the user has granted the appropriate privileges to Pancake will be visible.
