How to Create Multiple DataPancake Data Sources (Script Builder)
Build a script to create and initiate scans for multiple data sources in DataPancake.
Please ensure the prerequisites have been completed before creating your first data source.
The default is DATAPANCAKE
To scan all records, set the limit to 0.
Add additional filters to the where clause to limit the number of rows returned.
You can also choose which tables are included in step 2 by selecting the code blocks in the individual result rows for the tables you want to create data sources for.
Examples of additional columns that can be added to the where clause are columns.table_name and columns.column_name.
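As an illustration, the extra predicates could be assembled programmatically before being pasted into the where clause. This is a minimal sketch in Python, not part of DataPancake: the helper names build_extra_filters and quote_literal, their parameters, and the sample table and column names are hypothetical, and only columns.table_name and columns.column_name come from the text above. It assumes the existing where clause can be extended with AND predicates.

```python
def quote_literal(value: str) -> str:
    """Wrap a value in single quotes, doubling any embedded quotes for SQL."""
    return "'" + value.replace("'", "''") + "'"


def build_extra_filters(table_pattern=None, column_names=None) -> str:
    """Return AND predicates to append to an existing where clause.

    table_pattern: optional LIKE pattern matched against columns.table_name.
    column_names:  optional list of exact names matched against
                   columns.column_name.
    Both arguments are hypothetical inputs for this sketch.
    """
    predicates = []
    if table_pattern:
        predicates.append(
            f"columns.table_name LIKE {quote_literal(table_pattern)}"
        )
    if column_names:
        names = ", ".join(quote_literal(name) for name in column_names)
        predicates.append(f"columns.column_name IN ({names})")
    # Each predicate is prefixed with AND so the result can be appended as-is.
    return "".join(f" AND {predicate}" for predicate in predicates)


if __name__ == "__main__":
    # Hypothetical example: only tables starting with ORDERS, two columns.
    print(build_extra_filters("ORDERS%", ["PAYLOAD", "RAW_JSON"]))
```

Filtering on table or column names in this way is one option for keeping tables of a single data format type in the result, provided your naming scheme reflects the format.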
If the source tables to be scanned store a mix of different semi-structured data format types (such as JSON and XML), then the where clause will need to be modified to return only the tables storing data that match the data format type that was specified.
If the where clause is not used to further limit the results, all VARIANT columns will be returned. In this case, select only the code blocks for tables that match the data format type specified.
If a code block is copied and used in the step 2 script for a table that stores JSON data, and XML has been specified for the data format type, the step 2 script will produce an error indicating an invalid data format type and the data source will not be created for that table.
When executing a script for multiple scans, take into consideration the size of the source table and the number of rows being scanned.
If the option to initiate the scan has been selected, then the specified warehouse will call the DataPancake procedure (which creates data sources and starts scans) synchronously for all tables that you choose to add to the step 2 script.
Optionally, change the warehouse name specified in the select statement to change the scan warehouse. Assigning different warehouses to different sets of procedure calls in the step 2 script allows the work to be distributed in parallel.
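One way to plan which warehouse handles which tables is a simple round-robin split. The sketch below is only a planning aid under stated assumptions: the function assign_warehouses and the table and warehouse names are hypothetical, and it does not call DataPancake or Snowflake. It produces the groups only; you would still set the warehouse name in the select statement for each group.

```python
from collections import defaultdict


def assign_warehouses(tables, warehouses):
    """Distribute table names across warehouse names in round-robin order.

    Returns a dict mapping each warehouse that received work to its tables.
    """
    if not warehouses:
        raise ValueError("at least one warehouse name is required")
    plan = defaultdict(list)
    for index, table in enumerate(tables):
        # The modulo cycles through the warehouses in the order given.
        warehouse = warehouses[index % len(warehouses)]
        plan[warehouse].append(table)
    return dict(plan)


if __name__ == "__main__":
    # Hypothetical table and warehouse names, for illustration only.
    plan = assign_warehouses(
        ["ORDERS", "EVENTS", "CUSTOMERS"],
        ["SCAN_WH_1", "SCAN_WH_2"],
    )
    for warehouse, assigned in plan.items():
        print(warehouse, assigned)
```

Round-robin ignores table size. If a few tables are much larger than the rest, it may be better to group them by hand so that no single warehouse receives all of the large scans.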
Use the blue dropdown arrow at the top right and click "Run All".
Any errors that occur will be detailed in the call result column.
For example, if you try to scan XML data sources as JSON, you will see a detailed error message such as "The data source connection failed... Invalid JSON Object".
Fix and rerun any failed statements as needed.
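If the results are exported for review, a short script can list the tables whose statements need to be rerun. This is a rough sketch under loud assumptions: the function failed_tables is hypothetical, the results are assumed to be available as pairs of table name and call result text, and a failure is detected only by the word "failed", as in the example error message above. The success text and table names in the sample data are invented for illustration.

```python
def failed_tables(results):
    """Return the table names whose call result text mentions a failure.

    results: iterable of (table_name, call_result) pairs. This shape is an
    assumption of the sketch, not a documented DataPancake format.
    """
    return [
        table
        for table, message in results
        if "failed" in message.lower()
    ]


if __name__ == "__main__":
    # Sample data: the success text is a placeholder; the failure text
    # follows the example error message quoted in this guide.
    sample = [
        ("ORDERS", "placeholder success text"),
        ("EVENTS_XML", "The data source connection failed... Invalid JSON Object"),
    ]
    print(failed_tables(sample))
```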