Data Source Overview

The Data Source Overview page in Pancake gives you a high-level view of every data source you have added to the app, along with summary information for the data sources you have scanned. The core features of this page are the Data Sources table and its filter.

If you have just installed Pancake and are launching it for the first time, you will see a small Readme icon in the upper right corner. The Readme contains several Quick Start scripts designed to configure the app for use. These scripts cover the global permissions and warehouse provisioning that Pancake requires to function.

The Data Sources table contains some basic information including:

  • Name - the name the user gave the data source when creating it. It often reflects some combination of the database, schema, environment, and column, but it is not required to reference any of them.

  • Type - the Snowflake data object type such as Table, External Table, or View.

  • Object Name - the name of the Snowflake object which contains the JSON data.

The Data Source Overview page also contains key pieces of information about each data source added to Pancake. This level of detail is called a Schema Summary and is free for all users of Pancake. You can add an unlimited number of data sources, scan them, and get this level of detail about your data at no charge.

The Schema Summary includes:

  • Total Attributes - the total number of attributes discovered during scanning.

  • Polymorphic - the number of polymorphic attributes in the data source.

  • Arrays - the total number of arrays, inclusive of polymorphic attributes.

  • Objects - total number of objects in the data source.

  • Max Level - the maximum nesting depth of the arrays in the data source.

  • Complexity Score - a rough measure of how difficult a given data source would be to work with manually. It is designed to help users quickly understand the quality of their data and triage their work.
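
To make the counts above concrete, here is a minimal Python sketch that derives similar numbers from a handful of sample JSON documents. It is not Pancake's scanner, and the definitions it uses are assumptions: an attribute is any distinct path (array elements are written as `path[]`), an attribute is polymorphic when it holds more than one non-null type, and Max Level is the largest number of arrays nested along any path. The function names `type_name` and `summarize` are hypothetical. The Complexity Score is left out because its formula is not described here.

```python
from collections import defaultdict


def type_name(value):
    """Map a parsed JSON value to a coarse type label."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # bool must be tested before int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


def summarize(documents):
    """Count attributes, polymorphic attributes, arrays, objects, and array depth.

    Assumed definitions (not taken from Pancake): see the paragraph above.
    """
    types_by_path = defaultdict(set)
    max_level = 0

    def walk(value, path, array_depth):
        nonlocal max_level
        if isinstance(value, dict):
            for key, child in value.items():
                child_path = f"{path}.{key}" if path else key
                types_by_path[child_path].add(type_name(child))
                walk(child, child_path, array_depth)
        elif isinstance(value, list):
            max_level = max(max_level, array_depth + 1)
            child_path = f"{path}[]"
            for child in value:
                types_by_path[child_path].add(type_name(child))
                walk(child, child_path, array_depth + 1)

    for doc in documents:
        walk(doc, "", 0)

    return {
        "total_attributes": len(types_by_path),
        # Nulls are ignored so that optional values do not count as polymorphic.
        "polymorphic": sum(1 for t in types_by_path.values() if len(t - {"null"}) > 1),
        # An attribute that is an array in any document counts as an array.
        "arrays": sum(1 for t in types_by_path.values() if "array" in t),
        "objects": sum(1 for t in types_by_path.values() if "object" in t),
        "max_level": max_level,
    }


sample = [
    {"id": 1, "tags": ["a", "b"], "items": [{"sku": "x", "variants": [{"size": "M"}]}]},
    {"id": "2", "tags": "a", "items": []},
]
print(summarize(sample))
```

In this sample, `id` and `tags` are polymorphic because their type changes between the two documents, and `tags` still counts toward Arrays, which matches the "inclusive of polymorphic attributes" wording above.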

Other important information on this page includes:

  • Connection Status - indicates whether a data source has connected to Pancake successfully. A data source that cannot connect should fail when the user attempts to add it.

  • Last Scan - the date of the most recent scan of this data source. Note that this is the most recent scan using any Scan Configuration.

  • Column Name - the actual name of the VARIANT column in which the JSON data is stored.

  • Tags - any tags a user has added to the data source.

  • Product Tier - the current product tier for a given data source, which can be Schema Summary, Schema Analysis, or Dynamic Table Generation.

  • Status - indicates the current status of the data source, which can be Active, Inactive, or Deleted.
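
As a way to picture one row of the Data Sources table and the filter applied to it, the fields above can be modeled as a small record type. This is an illustrative sketch only: Pancake does not expose this as an API, and `DataSourceRow`, `filter_rows`, and the sample values are hypothetical. The enum values are the Product Tier and Status options listed above. Type is kept as a plain string because the list of object types given earlier may not be exhaustive.

```python
from dataclasses import dataclass, field
from enum import Enum


class ProductTier(Enum):
    SCHEMA_SUMMARY = "Schema Summary"
    SCHEMA_ANALYSIS = "Schema Analysis"
    DYNAMIC_TABLE_GENERATION = "Dynamic Table Generation"


class Status(Enum):
    ACTIVE = "Active"
    INACTIVE = "Inactive"
    DELETED = "Deleted"


@dataclass
class DataSourceRow:
    """Hypothetical model of one row in the Data Sources table."""
    name: str
    object_type: str   # e.g. "Table", "External Table", "View"
    object_name: str   # Snowflake object that contains the JSON data
    column_name: str   # VARIANT column that stores the JSON data
    product_tier: ProductTier
    status: Status
    tags: list = field(default_factory=list)


def filter_rows(rows, *, status=None, tier=None, tag=None):
    """Return the rows that match every criterion that was supplied."""
    return [
        row for row in rows
        if (status is None or row.status is status)
        and (tier is None or row.product_tier is tier)
        and (tag is None or tag in row.tags)
    ]


rows = [
    DataSourceRow("orders_prod", "Table", "ORDERS_RAW", "PAYLOAD",
                  ProductTier.SCHEMA_SUMMARY, Status.ACTIVE, ["prod"]),
    DataSourceRow("events_dev", "View", "EVENTS_V", "BODY",
                  ProductTier.SCHEMA_ANALYSIS, Status.INACTIVE, ["dev"]),
]
active = filter_rows(rows, status=Status.ACTIVE)
print([row.name for row in active])
```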
