Datasources - Bulk Update
Select and export the datasource metadata to a .csv file, edit in a spreadsheet like Google Sheets or Excel, then perform a bulk update in DataPancake.
Ensure you have completed the Initialization Guide.
1. Open a new Snowflake worksheet

2. Run a select statement for DATAPANCAKE.CORE.VW_DATASOURCES
SELECT
-- VIEW ONLY COLUMNS
DATASOURCE_ID,
DATASOURCE_TYPE,
DATA_FORMAT,
DATASOURCE_OBJECT_TYPE,
DATASOURCE_OBJECT_ATTRIBUTE_DATA_TYPE,
DATASOURCE_DATABASE,
DATASOURCE_SCHEMA,
DATASOURCE_OBJECT_NAME,
DATASOURCE_OBJECT_ATTRIBUTE_NAME,
LAST_SCAN,
ATTRIBUTE_COUNT,
ATTRIBUTE_MAX_LEVEL,
OBJECT_COUNT,
ARRAY_COUNT,
POLYMORPHIC_ATTRIBUTE_COUNT,
EMBEDDED_CONTENT_ATTRIBUTE_COUNT,
COMPLEXITY_SCORE,
DATASOURCE_STATUS,
CONNECTION_STATUS,
VIRTUAL_ATTRIBUTE_COUNT,
SCHEMA_ATTRIBUTE_COUNT,
DISCOVERED_ATTRIBUTE_COUNT,
-- PRODUCT TIER FLAGS
PRODUCT_TIER_ATTRIBUTE_METADATA_MANAGEMENT,
PRODUCT_TIER_SQL_CODE_GENERATION,
PRODUCT_TIER_DATA_DICTIONARY_BUILDER,
PRODUCT_TIER_SEMANTIC_MODEL_CODE_GENERATION,
PRODUCT_TIER_SECURITY_POLICY_INTEGRATION,
-- EDITABLE COLUMNS FOR BULK UPDATE
DATASOURCE_NAME,
DATASOURCE_TAG,
-- OUTPUT CONFIGURATION FIELDS
OUTPUT_OBJECT_TYPE,
OUTPUT_OBJECT_PREFIX_NAME,
OUTPUT_OBJECT_COMPUTE_RESOURCE_NAME,
OUTPUT_OBJECT_INTERVAL_QUANTITY,
OUTPUT_OBJECT_INTERVAL_UNIT_TYPE,
OUTPUT_OBJECT_OPTIONAL_PARAMETERS,
OUTPUT_OBJECT_OPTIONAL_PARAMETERS_ARRAY,
OUTPUT_OBJECT_DEDUPLICATE,
OUTPUT_OBJECT_SECURITY_ROW_LEVEL_POLICY_NAME_SEMANTIC_LAYER,
OUTPUT_OBJECT_DATABASE_NAME,
OUTPUT_OBJECT_SCHEMA_NAME,
OUTPUT_OBJECT_ROOT_SELECT_PREFIX,
OUTPUT_OBJECT_CASE_SENSITIVE,
-- SCAN CONFIGURATION FIELDS
BASE_SCAN_COMPUTE_RESOURCE_THREAD_COUNT,
BASE_SCAN_RECORDS_PER_MINUTE,
-- METADATA TABLE CONFIGURATION FIELDS
OUTPUT_OBJECT_CREATE_METADATA_TABLE,
OUTPUT_OBJECT_METADATA_TABLE_ALLOW_UPDATE,
OUTPUT_OBJECT_METADATA_TABLE_ALLOW_DELETE,
OUTPUT_OBJECT_METADATA_TABLE_COMPUTE_RESOURCE_NAME,
OUTPUT_OBJECT_METADATA_TABLE_CRON_SCHEDULE,
OUTPUT_OBJECT_METADATA_TABLE_CRON_TIMEZONE,
OUTPUT_OBJECT_METADATA_TABLE_DATABASE_NAME,
OUTPUT_OBJECT_METADATA_TABLE_SCHEMA_NAME,
-- DATA DICTIONARY
DATASOURCE_DESCRIPTION_PROMPT,
DATASOURCE_DESCRIPTION,
-- DERIVED COLUMNS FOR BULK UPDATE
FALSE AS GENERATE_CODE,
FALSE AS GENERATE_DATASOURCE_DESCRIPTION,
'mixtral-8x7b' AS MODEL_NAME,
250 AS WORD_LIMIT
FROM DATAPANCAKE.CORE.VW_DATASOURCES
ORDER BY DATASOURCE_NAME;
3. Verify the query successfully returns a list of datasource metadata
4. Download the results as a .csv file

5. Open the .csv file in either Excel or Google Sheets

6. Edit any editable columns as needed

Bulk Cortex AI Data Source Description Generation
Set GENERATE_CODE to TRUE and DataPancake will automatically generate code for the datasource.
Set GENERATE_DATASOURCE_DESCRIPTION to TRUE and DataPancake will automatically generate a description from the datasource metadata.
You can also modify the MODEL_NAME and WORD_LIMIT columns to tune the values returned by the LLM.
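For example, a row in the exported .csv might be edited like this before re-upload (the column names come from the query above; the datasource name, tag, and model values shown are purely hypothetical):

```csv
DATASOURCE_NAME,DATASOURCE_TAG,GENERATE_CODE,GENERATE_DATASOURCE_DESCRIPTION,MODEL_NAME,WORD_LIMIT
ORDERS_RAW,finance,TRUE,TRUE,mistral-large,100
```

Here GENERATE_DATASOURCE_DESCRIPTION is flipped to TRUE so a description is generated on update, and WORD_LIMIT is lowered to keep the generated description short.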
7. Save (Excel) or download (Google Sheets) the file

8. Load the .csv into your stage
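One way to stage the file is with the PUT command from a SnowSQL session (PUT cannot be run from the Snowflake web UI). This is a sketch assuming a hypothetical internal named stage and local file path; substitute your own:

```sql
-- Create an internal named stage if you do not already have one
-- (stage name here is hypothetical).
CREATE STAGE IF NOT EXISTS DATAPANCAKE_UPLOAD_STAGE;

-- Upload the edited .csv from your local machine (path is hypothetical).
-- AUTO_COMPRESS = FALSE keeps the staged file as plain .csv rather than
-- .csv.gz, so the filename matches what you pass to the stored procedure.
PUT file:///tmp/datasources.csv @DATAPANCAKE_UPLOAD_STAGE
  AUTO_COMPRESS = FALSE
  OVERWRITE = TRUE;
```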




9. Run the bulk update stored procedure
When the stored procedure completes, the datasource metadata file will be moved to the archive folder.
call <database>.<schema>.sp_update_datapancake_datasource_metadata_core(
'@<stage_name>/<staged_filename>.csv',
'@<stage_name>/<archive_folder>/',
'<snowflake_username>'
);
10. Verify the integration completed successfully by running the following query
select * from datapancake.core.vw_integration_process
where user_name = '<snowflake_username>'
order by integration_process_date desc;