JSON Tutorial
This detailed tutorial will walk you through creating and configuring a JSON DataPancake pipeline using an example data set.
Deploy Data Objects & Load Data
Create the table and load the example data set, which can be downloaded here:
Additionally, copy and deploy the UDFs and security policies.
Note: The Enterprise version of DataPancake is required to use the security policies used in this tutorial.
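If you prefer to stage and load the example file yourself, a minimal sketch of the load step in Snowflake SQL might look like the following. The database, schema, table, and stage names are illustrative placeholders, not the names used by the tutorial download.

```sql
-- Landing table with a single VARIANT column to hold the raw JSON records
CREATE OR REPLACE TABLE tutorial_db.raw.json_events (raw_json VARIANT);

-- Internal stage for the downloaded example file
CREATE OR REPLACE STAGE tutorial_db.raw.json_stage
    FILE_FORMAT = (TYPE = JSON);

-- After uploading the downloaded file to the stage with PUT, load it
COPY INTO tutorial_db.raw.json_events
FROM @tutorial_db.raw.json_stage
FILE_FORMAT = (TYPE = JSON);
```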
Create the Data Source
Use the Script Builder to generate the initial data source.
Make sure to check "run scan"; this discovers the raw schema and attributes.
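As a rough illustration of what the scan discovers, you can enumerate the distinct paths and value types in the raw JSON yourself with a recursive FLATTEN. The table and column names below are the placeholders from the load sketch above, not DataPancake's own queries.

```sql
-- List every JSON path and the value types observed at that path
SELECT
    f.path          AS attribute_path,
    TYPEOF(f.value) AS value_type,
    COUNT(*)        AS occurrences
FROM tutorial_db.raw.json_events,
     LATERAL FLATTEN(input => raw_json, RECURSIVE => TRUE) f
GROUP BY f.path, TYPEOF(f.value)
ORDER BY attribute_path;
```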
Finalize the Schema & Rescan
Re-scan the data source with "reset attributes" set to true.
Note: After manual metadata edits, "reset attributes" is no longer available.
Define Virtual Attributes
Create the new virtual attributes.
This is where you can add surrogate primary keys, create calculated fields, and add ordering attributes for dedupe logic.
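For context, the kinds of expressions a virtual attribute typically wraps look like this in plain Snowflake SQL. The field names (order_id, line_items, updated_at) are hypothetical examples, not fields from the tutorial data set.

```sql
SELECT
    -- Surrogate primary key derived from the record contents
    MD5(raw_json:order_id::STRING)          AS order_sk,
    -- Calculated field
    ARRAY_SIZE(raw_json:line_items)         AS line_item_count,
    -- Ordering attribute used by the dedupe logic
    raw_json:updated_at::TIMESTAMP_NTZ      AS updated_at
FROM tutorial_db.raw.json_events
-- Keep only the latest version of each record
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY raw_json:order_id::STRING
    ORDER BY raw_json:updated_at::TIMESTAMP_NTZ DESC
) = 1;
```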
Configure Column Materialization
Add column materialization rules in the dynamic table layer and the secure view layer.
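Conceptually, the generated pipeline materializes selected attributes as typed columns in a dynamic table and then exposes them through a secure view. A hand-written sketch of those two layers is shown below; the object names, warehouse, and target lag are assumptions for illustration only.

```sql
-- Dynamic table layer: materialize selected attributes as typed columns
CREATE OR REPLACE DYNAMIC TABLE tutorial_db.curated.orders_dt
    TARGET_LAG = '1 hour'
    WAREHOUSE = tutorial_wh
AS
SELECT
    raw_json:order_id::STRING        AS order_id,
    raw_json:customer.region::STRING AS customer_region,
    raw_json:total::NUMBER(12,2)     AS order_total
FROM tutorial_db.raw.json_events;

-- Secure view layer: the surface consumers query, and where policies attach
CREATE OR REPLACE SECURE VIEW tutorial_db.curated.orders_v AS
SELECT order_id, customer_region, order_total
FROM tutorial_db.curated.orders_dt;
```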
Apply Column-Level Schema Transformations
Apply column-level schema transformations.
You can also normalize data types and formats, and perform schema consolidation at the field level.
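A couple of representative transformations, written by hand against hypothetical fields, might look like the following; DataPancake applies the equivalent logic through its configuration rather than hand-edited SQL.

```sql
SELECT
    -- Normalize mixed date formats to a single TIMESTAMP_NTZ
    COALESCE(
        TRY_TO_TIMESTAMP_NTZ(raw_json:created_at::STRING),
        TRY_TO_TIMESTAMP_NTZ(raw_json:created_at::STRING, 'MM/DD/YYYY HH24:MI:SS')
    )                                                     AS created_at,
    -- Consolidate two source spellings of the same field into one column
    COALESCE(raw_json:zip_code, raw_json:zipcode)::STRING AS zip_code
FROM tutorial_db.raw.json_events;
```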
Merge Polymorphic Versions
Merge string, float, and integer variants into unified attributes.
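When the scan finds the same attribute arriving as a string in some records and as a number in others, the merged attribute resolves to a single type. A hand-written equivalent, using a hypothetical amount field, is sketched below.

```sql
SELECT
    -- Works whether amount arrived as "19.99", 19.99, or 20
    TRY_TO_DECIMAL(raw_json:amount::STRING, 12, 2) AS amount
FROM tutorial_db.raw.json_events;
```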
Add Aliases
Add user-friendly alias names for columns and arrays where needed.
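In the generated SQL these aliases simply surface as column aliases; a minimal hand-written analogue, with hypothetical source field names:

```sql
SELECT
    raw_json:cust_nm::STRING AS customer_name,  -- friendlier than cust_nm
    raw_json:ord_dt::DATE    AS order_date
FROM tutorial_db.raw.json_events;
```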
Add Security Policies
Configure security policies such as row-level access rules and attribute-level security tags.
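Underneath, these map to standard Snowflake governance objects. A minimal sketch of a row access policy and a tag with an attached masking policy is shown below; the role names, mapping table, and tag are illustrative assumptions, and these features require Snowflake Enterprise Edition or higher.

```sql
-- Row-level access: only expose rows for regions mapped to the current role
CREATE OR REPLACE ROW ACCESS POLICY tutorial_db.curated.region_policy
AS (customer_region STRING) RETURNS BOOLEAN ->
    CURRENT_ROLE() = 'PIPELINE_ADMIN'
    OR EXISTS (
        SELECT 1
        FROM tutorial_db.curated.role_region_map m
        WHERE m.role_name = CURRENT_ROLE()
          AND m.region    = customer_region
    );

ALTER VIEW tutorial_db.curated.orders_v
    ADD ROW ACCESS POLICY tutorial_db.curated.region_policy ON (customer_region);

-- Attribute-level security: tag sensitive columns and mask them by tag
CREATE OR REPLACE TAG tutorial_db.curated.pii_tag;

CREATE OR REPLACE MASKING POLICY tutorial_db.curated.mask_string AS
    (val STRING) RETURNS STRING ->
    CASE WHEN CURRENT_ROLE() = 'PIPELINE_ADMIN' THEN val ELSE '***MASKED***' END;

ALTER TAG tutorial_db.curated.pii_tag SET MASKING POLICY tutorial_db.curated.mask_string;

ALTER VIEW tutorial_db.curated.orders_v
    ALTER COLUMN order_id SET TAG tutorial_db.curated.pii_tag = 'identifier';
```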
Configure Foreign Keys
Add relationships between flattened entities by configuring foreign keys.
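In Snowflake these relationships are declared as informational constraints (recorded in metadata but not enforced), which downstream tools can read. A hypothetical example between flattened parent and child tables:

```sql
-- Declare the parent key first, then the child relationship
ALTER TABLE tutorial_db.curated.orders
    ADD CONSTRAINT pk_orders PRIMARY KEY (order_id);

ALTER TABLE tutorial_db.curated.order_items
    ADD CONSTRAINT fk_order_items_order
    FOREIGN KEY (order_id) REFERENCES tutorial_db.curated.orders (order_id);
```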
Generate Code
Generate the dynamic SQL statements using your latest configurations.
Deploy and Validate
Review and deploy the generated code.
Make sure to perform data quality checks and validate security enforcement.
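A couple of simple checks you might run after deployment, written against the placeholder objects used in the earlier sketches:

```sql
-- Data quality: the curated layer should account for every raw record
SELECT
    (SELECT COUNT(*) FROM tutorial_db.raw.json_events)   AS raw_rows,
    (SELECT COUNT(*) FROM tutorial_db.curated.orders_dt) AS curated_rows;

-- Security enforcement: switch to a restricted role and confirm that
-- row filtering and masking behave as configured
USE ROLE analyst_restricted;
SELECT * FROM tutorial_db.curated.orders_v LIMIT 10;
```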
Pipeline Maintenance
Repeat steps 3-13 as many times as needed.