Data Source Attributes
Last updated
Last updated
In Pancake, the Data Source Attributes page displays a table containing all the discovered attributes found during the scanning of a data source. The table has a number of columns, designed to give users depth of understanding about their data sources, as well as exposing a number of editable fields to validate or edit if necessary ahead of the Dynamic Table SQL generation process.
The Data Source Attribute page is also where users will identify and assign array foreign keys, which is necessary for creating truly relational data when the Dynamic Table SQL is generated. This page will explain each column in the Attribute table, and provide guidance and context about how to navigate the process of selecting foreign keys.
Note that if a data source is on the Schema Summary product tier, only 3 attributes will be visible. If you are not able to see the complete schema for your data source, ensure that the data source has been upgraded to the Schema Analysis product tier.
Performing a deep, recursive scan of every record in a JSON source is a core component of Pancake's functionality, the output of that process is a complete list of every attribute which has ever occurred at any level of nesting depth within that data source. This includes attributes which contain embedded or stringified JSON (when an entire JSON blob has been put in a field using double quotes), as well polymorphic data (when values in a single column have different data types).
The Attributes table contains the following columns:
Path - The name of the attribute inclusive of the entire path within the structure of the data source.
Name - The attribute name.
Polymorphic Name - The attribute name created by Pancake which appends the data type to the name.
Alias - Users can edit this field to create an alias for a given attribute which will be used when the Dynamic Table SQL code is generated as part of the CREATE DYNAMIC TABLE
statements.
Level - The nested depth of the attribute.
Data Type - The inferred attribute data type.
Has Embedded JSON - Tickbox used to indicate if the field contains embedded or stringified JSON.
Sample Value - For attributes which are columns that contain values, this field will contain a sample of the data. To ensure data security and privacy, columns which contain a string are defaulted to "string value."
Array Type - Indicates if the attribute is an object array or a primitive array.
Array Prim Data Type - For primitive arrays, indicates the data type.
Snowflake Data Type - Inferred Snowflake Data Type, this field is editable so users may change the inferred data type if e.g. they know the source has data they wish to flatten as a VARCHAR
rather than a NUMBER
, etc.
Precision - Inferred precision of numerical data based on Pancake's scan. Users may edit this field if they know the data will contain values with precision or scale outside of the inferred values.
Scale - Inferred scale of numerical data based on Pancake's scan. Users may edit this field if they know the data will contain values with precision or scale outside of the inferred values.
DateTime Format - Inferred datetime format(s) of an attribute. Users can edit this field if they know a given column has a specific datetime format other than what Pancake has inferred.
Null Value Expression - Any valid SQL expression. If this value is not supplied the default value is NULL. Use {attribute_name}
as a placeholder in your expression to refer to the polymorphic version of the attribute. This value is not applicable if you have selected SQL Expression as the transformation type.
Tranformation Type - Type of transformation applied to the Dynamic Table column.
Transformation Expression - SQL expression used for the DYNAMIC TABLE
column definition if the transformation type equals expression. Use {attribute_name}
as a placeholder in your expression to refer to the polymorphic version of the attribute.
Foreign Keys - All foreign keys currently set for that array.
Date Active - Date the attribute was first discovered or created.
Attribute Type - Attributes disovered by Pancake will be defined as "Discovered," custom columns created by users as part of onboarding data using Pancake will be defined as "Virtual."
Primary and foreign keys are an important concept in Pancake, and understanding their place in the process is what will allow you to create truly relational data.