JSON Tutorial

This detailed tutorial walks you through creating and configuring a JSON DataPancake pipeline using an example data set.

How to use this tutorial:

Reference the screenshots and code blocks for the input values you should use as you follow along.

You can open the linked guides in new tabs to easily return to this tutorial for the next steps in the process.

1

Deploy Data Objects & Load Data

Create the table and load the data.

Copy and deploy the UDFs and security policies.

2

Create the Data Source

Use the Script Builder to generate the initial source.

Make sure to check "run scan"; this discovers the raw schema and attributes.
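To build intuition for what a scan surfaces, here is a minimal Python sketch that walks a few JSON records and lists each attribute path with the value types seen for it. It is not DataPancake's scan implementation; the function name `discover_attributes`, the `[]` path notation for arrays, and the sample records are assumptions made for illustration.

```python
import json
from collections import defaultdict


def discover_attributes(records):
    """Map each attribute path to the set of Python type names seen for it."""
    seen = defaultdict(set)

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            # Array elements share one path, marked with "[]" (assumed notation).
            for item in value:
                walk(item, f"{path}[]")
        else:
            seen[path].add(type(value).__name__)

    for record in records:
        walk(record, "")
    return dict(seen)


# Hypothetical sample records, one JSON document per line.
raw = [
    '{"id": 1, "price": 9.5, "tags": ["a", "b"]}',
    '{"id": "2", "price": 10, "customer": {"name": "Ada"}}',
]
attributes = discover_attributes(json.loads(line) for line in raw)
for path, types in sorted(attributes.items()):
    print(path, sorted(types))
```

Attributes that show more than one type, such as `id` and `price` here, are the polymorphic cases that later steps deal with.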

3

Finalize the Schema & Rescan

Add schema transformations.

Re-scan the data source with reset attributes = true.

Note: After manual metadata edits, reset attributes is no longer available.

4

Define Virtual Attributes

Create the new virtual attributes.

This is where you can add surrogate primary keys, create calculated fields, and add ordering attributes for dedupe logic.
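The three kinds of virtual attribute mentioned above can be illustrated in plain Python. This is a conceptual sketch only, not the expressions DataPancake generates; the field names (`source`, `order_id`, `updated_at`), the hashing scheme, and the helper names are all hypothetical.

```python
import hashlib


def surrogate_key(*parts):
    """Build a deterministic key from natural-key parts (hypothetical scheme)."""
    joined = "|".join(str(p) for p in parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]


def add_virtual_attributes(record):
    """Return a copy of the record with a surrogate key and a calculated field."""
    enriched = dict(record)
    enriched["order_sk"] = surrogate_key(record["source"], record["order_id"])
    enriched["total"] = record["quantity"] * record["unit_price"]
    return enriched


def dedupe_latest(records, key, order_by):
    """Keep one record per key: the one with the highest ordering attribute."""
    latest = {}
    for rec in records:
        current = latest.get(rec[key])
        if current is None or rec[order_by] > current[order_by]:
            latest[rec[key]] = rec
    return list(latest.values())


# Hypothetical sample data: the first two rows are versions of the same order.
orders = [
    {"source": "web", "order_id": 7, "quantity": 2, "unit_price": 5.0,
     "updated_at": "2024-01-01T10:00:00"},
    {"source": "web", "order_id": 7, "quantity": 3, "unit_price": 5.0,
     "updated_at": "2024-01-02T09:30:00"},
    {"source": "app", "order_id": 7, "quantity": 1, "unit_price": 4.0,
     "updated_at": "2024-01-01T08:00:00"},
]
enriched = [add_virtual_attributes(o) for o in orders]
deduped = dedupe_latest(enriched, key="order_sk", order_by="updated_at")
```

The ordering attribute (`updated_at` in this sketch) decides which duplicate survives, which is why it needs to exist before dedupe logic can be applied.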

5

Configure Column Materialization

Add column materialization rules in the dynamic table layer and the secure view layer.

6

Apply Column-Level Schema Transformations

Apply column-level schema transformations.

You can also normalize data types and formats, and perform schema consolidation at the field level.
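As an illustration of what normalizing types and formats can mean at the field level, the Python sketch below converts a few date spellings to ISO 8601 and a few truthy/falsy spellings to booleans. The accepted formats and the function names are assumptions chosen for the example, not DataPancake settings.

```python
from datetime import datetime

# Assumed input formats for this example; adjust to the formats in your data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def normalize_date(value):
    """Return an ISO 8601 date string, or None if no assumed format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_bool(value):
    """Map common true/false spellings to a bool, or None if unrecognized."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in {"true", "t", "yes", "y", "1"}:
        return True
    if text in {"false", "f", "no", "n", "0"}:
        return False
    return None
```

Returning `None` for unrecognized values, rather than raising, keeps bad inputs visible for later data quality checks.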

7

Merge Polymorphic Versions

Merge string, float, and integer variants into unified attributes.
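Conceptually, merging polymorphic versions means coercing every variant of a value to one target type. The Python sketch below shows the idea for a hypothetical `amount` attribute that arrives as a string, a float, or an integer; it is not the SQL that DataPancake generates, and `unify_numeric` is a name invented for this example.

```python
from decimal import Decimal, InvalidOperation


def unify_numeric(value):
    """Coerce a str, float, or int variant to Decimal; return None if not numeric."""
    if value is None or isinstance(value, bool):
        return None
    try:
        result = Decimal(str(value).strip())
    except InvalidOperation:
        return None
    # Reject special values such as "NaN" or "Infinity", which Decimal accepts.
    return result if result.is_finite() else None


# Hypothetical records where the same attribute has three different types.
records = [{"amount": "12.50"}, {"amount": 12.5}, {"amount": 12}, {"amount": "n/a"}]
unified = [unify_numeric(r["amount"]) for r in records]
```

In this sketch the string `"12.50"` and the float `12.5` end up as equal values, and the unparseable `"n/a"` becomes `None` instead of failing the whole batch.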

9

Configure Array Relationships [Coming Soon]

Define parent/child relationships.

Note: This is required for semantic model generation.
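Because this feature is not yet available, the following Python sketch only illustrates the general idea of a parent/child relationship between a record and one of its nested arrays. The function name, the `item_index` column, and the sample data are hypothetical, and the sketch assumes every array element is a JSON object.

```python
def flatten_array(parents, parent_key, array_field):
    """Split each parent into one parent row plus one child row per array element."""
    parent_rows, child_rows = [], []
    for parent in parents:
        # The parent row keeps every field except the nested array.
        parent_rows.append({k: v for k, v in parent.items() if k != array_field})
        for index, item in enumerate(parent.get(array_field) or []):
            # Each child row carries the parent's key so the rows can be rejoined.
            child_rows.append({parent_key: parent[parent_key], "item_index": index, **item})
    return parent_rows, child_rows


# Hypothetical sample data.
orders = [
    {"order_id": 1, "customer": "Ada",
     "items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}]},
    {"order_id": 2, "customer": "Lin", "items": []},
]
parent_rows, child_rows = flatten_array(orders, "order_id", "items")
```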

10

Add Security Policies

Configure security policies such as row-level access rules and attribute-level security tags.
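The effect of these two kinds of policy can be pictured with a small Python sketch: a row-level rule decides which rows a role sees, and an attribute-level tag decides which values are masked. The role names, the region mapping, and the tag set are invented for this example and do not reflect how DataPancake or the database enforces policies.

```python
# Hypothetical configuration for illustration only.
ROLE_REGIONS = {"analyst_emea": {"EMEA"}, "admin": {"EMEA", "AMER"}}
SENSITIVE = {"email"}          # attributes tagged as sensitive
UNMASKED_ROLES = {"admin"}     # roles allowed to see sensitive values


def visible_rows(rows, role):
    """Apply a row-level rule, then mask tagged attributes for restricted roles."""
    allowed = ROLE_REGIONS.get(role, set())
    result = []
    for row in rows:
        if row["region"] not in allowed:
            continue  # row-level rule: the role may not see this row at all
        if role in UNMASKED_ROLES:
            result.append(dict(row))
        else:
            result.append({k: ("***" if k in SENSITIVE else v) for k, v in row.items()})
    return result


rows = [
    {"id": 1, "region": "EMEA", "email": "ada@example.com"},
    {"id": 2, "region": "AMER", "email": "lin@example.com"},
]
analyst_view = visible_rows(rows, "analyst_emea")
admin_view = visible_rows(rows, "admin")
```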

11

Configure Foreign Keys

Add relationships between flattened entities by configuring foreign keys.
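A foreign key is only useful if every child value points at an existing parent. As a rough illustration, the Python sketch below finds child rows whose foreign key has no matching parent row; the entity and column names are hypothetical.

```python
def find_orphans(children, parents, foreign_key, primary_key):
    """Return child rows whose foreign key has no matching parent primary key."""
    parent_keys = {p[primary_key] for p in parents}
    return [c for c in children if c.get(foreign_key) not in parent_keys]


# Hypothetical flattened entities.
customers = [{"customer_id": 10}, {"customer_id": 11}]
orders = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": 99},  # no such customer
]
orphans = find_orphans(orders, customers, "customer_id", "customer_id")
```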

12

Apply Final Metadata Modifications

For example, you can add attributes to row access policies.

13

Generate Code

Generate the dynamic SQL statements using your latest configurations.

14

Deploy and Validate

Review and deploy the generated code.

Make sure to perform data quality checks and validate security enforcement.
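What counts as a data quality check depends on your data; as one possible starting point, the Python sketch below reports row count, duplicate keys, and missing required fields for a set of rows. The function name and the sample rows are assumptions, and in practice you would run equivalent checks against the deployed objects.

```python
from collections import Counter


def quality_report(rows, key, required):
    """Summarize row count, duplicate keys, and missing required fields."""
    key_counts = Counter(row.get(key) for row in rows)
    return {
        "row_count": len(rows),
        "duplicate_keys": sorted(
            k for k, n in key_counts.items() if k is not None and n > 1
        ),
        "missing_required": {
            field: sum(1 for row in rows if row.get(field) is None)
            for field in required
        },
    }


# Hypothetical output rows to validate.
rows = [
    {"order_id": 1, "total": 10.0},
    {"order_id": 1, "total": 12.0},
    {"order_id": 2, "total": None},
]
report = quality_report(rows, key="order_id", required=["order_id", "total"])
```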

Pipeline Maintenance

Repeat steps 3 through 13 as many times as needed.
