Core + Data Dictionary Attributes - Bulk Update

Export the attribute metadata to a .csv file, edit in a spreadsheet like Google Sheets or Excel, then use to perform a bulk update on core & data dictionary attribute metadata in DataPancake.

Ensure you have completed Attribute Metadata Bulk Updates (Data Dictionary) - Setup

1. Open a new Snowflake worksheet

2. Run a select statement for DATAPANCAKE.CORE.VW_DATASOURCE_ATTRIBUTE_DATA_DICTIONARY

The minimum field requirements to select are attribute_id, polymorphic_version_id - for example:

select attribute_id, polymorphic_version_id
from datapancake.core.vw_datasource_attribute_data_dictionary
where version_status = 'active';

Select statement for data dictionary updates where source data type is an array

select
-- VIEW ONLY COLUMNS
DATASOURCE_ID,
ATTRIBUTE_ID,
POLYMORPHIC_VERSION_ID,
DATASOURCE_NAME,
DATASOURCE_TAG,
VERSION_NUMBER,
VERSION_STATUS_DATE,
PARENT_OBJECT,
PARENT_ARRAY,
ATTRIBUTE_PATH,
ATTRIBUTE_NAME,
ATTRIBUTE_LEVEL,
ATTRIBUTE_ORDER,
SOURCE_DATA_TYPE,
POLYMORPHIC_ATTRIBUTE_NAME,
ARRAY_TYPE,
ARRAY_PRIMITIVE_TYPE,
-- EDITABLE COLUMNS (Array-specific)
INCLUDE_ARRAY_IN_CODE_GEN,
ARRAY_RELATIONSHIP_NAME,
ARRAY_RELATIONSHIP_DESCRIPTION,
ARRAY_RELATIONSHIP_TYPE,
ARRAY_RELATIONSHIP_JOIN_TYPE,
SEMANTIC_LAYER_ALIAS_NAME,
ARRAY_REQUIRE_FLATTEN,
ATTRIBUTE_VERSION_DESCRIPTION_PROMPT,
ATTRIBUTE_VERSION_DESCRIPTION,
ATTRIBUTE_VERSION_SYNONYMS,
-- DERIVED COLUMNS FOR BULK CORTEX AI GENERATION
GENERATE_DESCRIPTION,
GENERATE_SYNONYMS,
GENERATE_SAMPLE_VALUES,
DESCRIPTION_WORD_LIMIT,
SYNONYM_LIMIT,
SAMPLE_VALUE_LIMIT,
MODEL_NAME,
-- AUDIT COLUMNS
ATTRIBUTE_LAST_UPDATED_DATETIME,
ATTRIBUTE_LAST_UPDATED_BY
from datapancake.core.vw_datasource_attribute_data_dictionary
where version_status = 'active'
and SOURCE_DATA_TYPE = 'array'
order by attribute_path;

Select statement for data dictionary updates where source data type is primitive

select
-- VIEW ONLY COLUMNS
DATASOURCE_ID,
ATTRIBUTE_ID,
POLYMORPHIC_VERSION_ID,
DATASOURCE_NAME,
DATASOURCE_TAG,
VERSION_NUMBER,
VERSION_STATUS_DATE,
PARENT_OBJECT,
PARENT_ARRAY,
ATTRIBUTE_PATH,
ATTRIBUTE_NAME,
ATTRIBUTE_LEVEL,
ATTRIBUTE_ORDER,
SOURCE_DATA_TYPE,
POLYMORPHIC_ATTRIBUTE_NAME,
SAMPLE_VALUE,
HAS_EMBEDDED_CONTENT,
DATA_PLATFORM_DATA_TYPE,
-- EDITABLE COLUMNS (Non-array attributes)
ATTRIBUTE_RECORD_STATUS,
INCLUDE_COLUMN_IN_SECURE_VIEW,
NUMERIC_PRECISION,
NUMERIC_SCALE,
USE_DATETIME_FORMAT,
DATETIME_FORMAT,
ATTRIBUTE_RECORD_STATUS,
IS_UNIQUE,
IS_PRIMARY_KEY,
CONTAINS_ENUM_VALUES,
RAW_TRANSFORMATION_TYPE,
RAW_TRANSFORMATION_EXPRESSION,
RAW_TRANSFORMATION_EXPRESSION_COMMENT,
RAW_NULL_VALUE_EXPRESSION,
SEMANTIC_LAYER_TRANSFORMATION_EXPRESSION,
SEMANTIC_LAYER_TRANSFORMATION_EXPRESSION_COMMENT,
RAW_ALIAS_NAME,
SEMANTIC_LAYER_ALIAS_NAME,
CONSOLIDATION_REGEX_SEARCH,
CONSOLIDATION_SQL_EXPRESSION,
ATTRIBUTE_VERSION_DESCRIPTION_PROMPT,
ATTRIBUTE_VERSION_DESCRIPTION,
ATTRIBUTE_VERSION_SYNONYMS,
ATTRIBUTE_VERSION_SAMPLE_VALUES,
-- DERIVED COLUMNS FOR BULK CORTEX AI GENERATION
FALSE AS GENERATE_DESCRIPTION,
FALSE AS GENERATE_SYNONYMS,
FALSE AS GENERATE_SAMPLE_VALUES,
100 AS DESCRIPTION_WORD_LIMIT,
5 AS SYNONYM_LIMIT,
3 AS SAMPLE_VALUE_LIMIT,
'<ENTER MODEL NAME HERE>' AS MODEL_NAME,
-- AUDIT COLUMNS
ATTRIBUTE_LAST_UPDATED_DATETIME,
ATTRIBUTE_LAST_UPDATED_BY
from datapancake.core.vw_datasource_attribute_data_dictionary
where version_status = 'active'
and SOURCE_DATA_TYPE != 'array'
order by attribute_path;

You can choose which editable fields you wish to include in your select statement.

3. Verify the query successfully returns a list of attribute metadata

4. Download the results as a .csv file

5. Open the .csv file in either Excel or Google Sheets

6. Edit any editable columns as needed

Bulk Cortex AI Data Dictionary Generation

Use TRUE for these attributes and DataPancake will generate new data dictionary values with the following:

GENERATE_DESCRIPTION, GENERATE_SYNONYMS, and GENERATE_SAMPLE_VALUES You can also modify the DESCRIPTION_WORD_LIMIT, SYNONYM_LIMIT, SAMPLE_VALUE_LIMIT, MODEL_NAME, and PROMPT columns to further improve the LLM values returned.

7. Save (Excel) or download (Google Sheets) the file

8. Load the .csv into your stage

9. Run the bulk update stored procedure

When the stored procedure completes, the attribute metadata file will be moved to the archive folder.

call <database>.<schema>.sp_update_datapancake_attribute_metadata_dictionary(
    '@<stage_name>/<staged_filename>.csv',
    '@<stage_name>/<archive_folder>/',
    '<snowflake_username>'
);

10. Verify the integration completed successfully by running the following query

select * from datapancake.core.vw_integration_process
where user_name = 'BWOLFF'
order by integration_process_date desc;

PreviousCore + Security Attributes - Bulk Update NextDatasources - Bulk Update

Last updated 3 months ago

Was this helpful?

Good morning

Ensure you have completed Attribute Metadata Bulk Updates (Data Dictionary) - Setup

1. Open a new Snowflake worksheet

2. Run a select statement for DATAPANCAKE.CORE.VW_DATASOURCE_ATTRIBUTE_DATA_DICTIONARY

Select statement for data dictionary updates where source data type is an array

Select statement for data dictionary updates where source data type is primitive

3. Verify the query successfully returns a list of attribute metadata

4. Download the results as a .csv file

5. Open the .csv file in either Excel or Google Sheets

6. Edit any editable columns as needed

7. Save (Excel) or download (Google Sheets) the file

8. Load the .csv into your stage

9. Run the bulk update stored procedure

10. Verify the integration completed successfully by running the following query