Large JSON File Data Loader
Load large JSON files with a DataPancake chunking UDTF.
1. Upload your JSON file to a stage in Snowflake
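If the file still needs to be uploaded, a SnowSQL PUT command can do it; the local path and the @json_stage stage name below are placeholders, and AUTO_COMPRESS = FALSE keeps the original .json file name so the later steps can reference it directly.
PUT file:///tmp/orders.json @json_stage AUTO_COMPRESS = FALSE;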
2. Open a new Snowflake worksheet
3. Execute the following statement to create a new table
Replace the database, schema, and table name placeholders.
CREATE OR REPLACE TABLE <database_name>.<schema_name>.<table_name>(
file_path STRING NOT NULL, -- path in stage where JSON was read
file_hash VARCHAR(64) NOT NULL, -- SHA-256 hex digest of the full file
chunk_index INTEGER NOT NULL, -- zero-based index of this chunk
chunk_count INTEGER NOT NULL, -- total number of chunks for this file
chunk_json VARIANT NOT NULL, -- parsed JSON fragment
primary_key STRING, -- primary key value extracted from each chunk (nullable)
processed_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP() -- load time
)
COMMENT = 'Stores JSON chunks emitted by DataPancake UDTF: FN_CHUNK_JSON_FROM_STAGED_FILE_UDTF';
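As a concrete sketch with hypothetical names (database MY_DB, schema RAW, table JSON_CHUNKS), the filled-in statement would read:
-- same columns as the template above, with the placeholders resolved
CREATE OR REPLACE TABLE MY_DB.RAW.JSON_CHUNKS (
file_path STRING NOT NULL,
file_hash VARCHAR(64) NOT NULL,
chunk_index INTEGER NOT NULL,
chunk_count INTEGER NOT NULL,
chunk_json VARIANT NOT NULL,
primary_key STRING,
processed_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP()
)
COMMENT = 'Stores JSON chunks emitted by DataPancake UDTF: FN_CHUNK_JSON_FROM_STAGED_FILE_UDTF';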
4. Execute the following statement to process the JSON file
Replace the database, schema, table, stage, file name, start path, chunk path, primary key path or expression, and max chunk size placeholders.
INSERT INTO <database_name>.<schema_name>.<table_name>
(file_path, file_hash, chunk_index, chunk_count, chunk_json, primary_key)
SELECT c.file_path, c.file_hash, CAST(c.chunk_index AS INTEGER), CAST(c.chunk_count AS INTEGER), PARSE_JSON(c.chunk_json), c.primary_key
FROM TABLE(
fn_chunk_json_from_staged_file_udtf(
BUILD_SCOPED_FILE_URL('@<stage_name>', '<file_name>'),
'@<stage_name>/<file_name>',
'<start_path_or_empty_string_for_root>',
'<chunk_path>',
'<primary_key_path_or_expression>',
CAST(<max_chunk_mb> AS FLOAT)
)
) AS c;
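As a worked example, suppose the staged file @json_stage/orders.json holds a document whose top-level orders array should be split into chunks, each element carrying an order_id field; every name here (stage, file, paths, table, 16 MB limit) is hypothetical, and the exact path syntax the UDTF expects should be confirmed against the DataPancake documentation.
INSERT INTO MY_DB.RAW.JSON_CHUNKS
(file_path, file_hash, chunk_index, chunk_count, chunk_json, primary_key)
SELECT c.file_path, c.file_hash, CAST(c.chunk_index AS INTEGER), CAST(c.chunk_count AS INTEGER), PARSE_JSON(c.chunk_json), c.primary_key
FROM TABLE(
fn_chunk_json_from_staged_file_udtf(
BUILD_SCOPED_FILE_URL('@json_stage', 'orders.json'),
'@json_stage/orders.json',
'', -- start path: empty string to begin at the root
'orders', -- chunk path (assumed to name the array to split)
'order_id', -- primary key path or expression (hypothetical field)
CAST(16 AS FLOAT) -- max chunk size in MB
)
) AS c;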
5. Verify the statements executed successfully
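Two quick checks, reusing the hypothetical names above: the first flags any file whose loaded row count disagrees with its recorded chunk_count, and the second spot-checks a few chunks (the :order_id path is again hypothetical).
-- chunk_count should match the rows actually loaded per file
SELECT file_path, chunk_count, COUNT(*) AS rows_loaded
FROM MY_DB.RAW.JSON_CHUNKS
GROUP BY file_path, chunk_count
HAVING COUNT(*) <> chunk_count;

-- sample a few chunks and the extracted key
SELECT chunk_index, primary_key, chunk_json:order_id::STRING AS order_id
FROM MY_DB.RAW.JSON_CHUNKS
ORDER BY chunk_index
LIMIT 5;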