DataPancake™ Docs
  • DataPancake™ Documentation
  • Overview
    • What is DataPancake™?
    • Understanding the Challenges of Semi-Structured Data
  • Getting Started
    • Quick Start Guides
      • Real Estate JSON Quick Start
      • FDA_UDI Data Quick Start
      • Internal Data Quick Start
      • Alerts
      • Page
    • Application Overview
      • Data Source Overview
      • Manage Data Source
      • Manage Scan Configuration
      • Scan Data Source
      • Data Source Attributes
      • Dynamic Table SQL
      • Manage Warehouse
  • How to Guides
    • How to Purchase & Install DataPancake
    • How to Manage Access to DataPancake
    • How to Grant DataPancake Account & Cortex AI Privileges
    • How to Add Warehouses to DataPancake
    • How to Create a Single DataPancake Data Source (Script Builder)
  • Topics
    • Warehouses
    • Data Sources
    • Scan Configurations
    • Attributes
    • Scan Processing
    • Views
    • Attribute Consolidation
    • Pricing
  • DEVELOPMENT
    • Release Notes
Powered by GitBook
On this page
  • Data Source Settings
  • Product Tier
  • Source Stream Settings
  • Data Object Settings
  • Updating a Data Source
  • Dynamic Table Settings
  • Schema Consolidation
  1. Getting Started
  2. Application Overview

Manage Data Source

Last updated 7 months ago

From the Manage Data Source page, users can view and modify information about data sources in Pancake.

Note that, while you can add a data source to Pancake through the Manage Data Source Page, you will still need to grant usage through a worksheet on every data source you wish to make available to the app with statements similar to the following:

GRANT USAGE ON DATABASE <database_name> TO APPLICATION PANCAKE;
GRANT USAGE ON SCHEMA <database_name>.<schema_name> TO APPLICATION PANCAKE;

If you are editing an existing data source, make a selection from the "Select a data source to edit" dropdown. If you are adding a new data source, begin filling out the Data Source Settings section.

Data Source Settings

When adding data sources from a worksheet, Pancake uses a default naming convention consisting of <database>.<schema>.<json_object>, users are able to edit the name in the Data Source Name field.

The Status field indicates the status of the data source object in Pancake, not the status of the data source in your environement. It can be set to Active, Inactive, or Deleted. If you wish to "remove" a data source from Pancake be sure to set this to Deleted.

Users can add whatever tags they like for a data source in the Tags field.

Connection Status indicates the current state of data source availability.

Product Tier

The Product Tier section will only display the Change Product Tier button initially, meaning the currently selected data source is on the Schema Summary tier. Click the Change Product Tier button to upgrade the data source to Schema Summary or Dynamic Table Generation.

Note that Schema Analysis and Dynamic Table Generation are not selectable unless you have upgraded the app though the Pancake listing in the Snowflake Marketplace.

Source Stream Settings

If the data source your are adding is a streaming source, select the platform from the Source Stream Platform dropdown.

Use the checkbox to decide if you want to deduplicated messages coming in.

The Unique Identifier Attribute is the root level unique attribute(s) for the JSON object. This field is required if your data source is on the Extract, Relate, and Flatten product tier. Pancake provides an example of a source stream identifier attribute.

Source Stream Platform:

If the data in this object is coming from a streaming platform select a stream platform from the dropdown list.

Deduplicate Messages:

If checked, the generated SQL for the root dynamic table will include window function to only return the most recent message for each primary key value.

Unique Identifier Attribute:

The root level unique attribute(s) for the JSON object. The value is used in the construction of the root level Dynamic Table and is required if the selected product tier is Extract, Relate, and Flatten.

Column Name:

The column name for the VARIANT column in the Snowflake object (table, view, or external table).

Data Object Settings

In the Data Object Setting section you must set the data source object type, format type, and input the database name, schema name, object name, and column name before you can add the source to Pancake.

If you have granted usage to Pancake, available databases, schemas, and objects will be available in the dropdown menus.

Updating a Data Source

After selecting an existing data source from the dropdown, all the fields on this page will be filled out. These fields are user editable, but be aware that altering data source or object settings could impact the connection status of the data source.

Dynamic Table Settings

The Dynamic Table Settings section allows you to configure details that will be included in the Dynamic Table SQL generation step for data sources on the Extract, Relate, and Flatten product tier.

Dynamic Table Prefix Name

The prefix name used in the dynamic table SQL generation process . Required only if you plan to generate dynamic tables for this data source. This prefix value will be used as the root dynamic table name and as a prefix for all other nested Dynamic Tables created for array attributes. You do not need to include an underscore at the end of this name. An underscore will be added automatically for all nested Dynamic Tables.

Dynamic Table Warehouse Name

The name of warehouse that will be assigned to the dynamic tables which is a part of the CREATE dynamic table statement. This is not the warehouse you use to execute the CREATE dynamic table statements.

Deployment Database Name

The name of the database the Dynamic Tables will be deployed to.

Deployment Schema Name

The name of the schema the Dynamic Tables will be deployed to.

Root Dynamic Table Row Level Policy Name

The name of the row level policy used for the root level dynamic table. Choose which root level attributes you want to include in the row level policy through the Data Attributes screen.

Root Dynamic Table Select Prefix

The select clause to include in the root dynamic table select statement. This is used to include other non-variant columns from the datasource's table in the root level dynamic table.

Dynamic Table Target Lag Unit Type

The frequency interval for the dynamic table refresh. This value will be included in the CREATE dynamic table statement.

Dynamic Table Target Lag Quantity

The frequency interval quantity for the dynamic table refresh. This value will be included in the CREATE dynamic table statement.

Root Dynamic Table Optional Parameters

Optional parameters for the root level dynamic table. This value will be included in the CREATE dynamic table statement. See https://docs.snowflake.com/en/sql-reference/sql/create-dynamic-table for additional information.

Nested Dynamic Table Optional Parameters

Optional parameters for the flattened dynamic tables created from nested arrays. This value will be included in the CREATE dynamic table statement. See https://docs.snowflake.com/en/sql-reference/sql/create-dynamic-table for additional information.

Schema Consolidation

Schema Filter

You can set the naming scheme for the deployed, tables, set row level policies, target lag and unit time, and can include any optional parameters using the .

appropriate Snowflake syntax