Scan Configurations

An introduction to scan configurations: learn how DataPancake scans your data and how to control it.

Overview

Scan configurations control how DataPancake scans data sources. Each configuration defines compute resources, scanning strategy, scheduling, and attribute discovery method.

Configuration Sections

Quick Reference

Essential:

  • CONFIGURATION_NAME - Unique identifier

  • ATTRIBUTE_CREATE_TYPE - 'Discover' (production) or 'Schema' (prototyping)

  • Virtual Warehouse - Compute resource

  • SCAN_RECORD_LIMIT - Number of records (0 = unlimited)
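As an illustration only, the essential settings can be modeled as a small validated record. This is a minimal Python sketch, not DataPancake's API: the class name `ScanConfigSketch`, its field names, and the example warehouse names are hypothetical. Only the allowed `ATTRIBUTE_CREATE_TYPE` values and the "0 = unlimited" rule come from the list above.

```python
from dataclasses import dataclass

# The two values named in the quick reference above.
ATTRIBUTE_CREATE_TYPES = {"Discover", "Schema"}


@dataclass(frozen=True)
class ScanConfigSketch:
    """Hypothetical container for the essential settings (not a DataPancake class)."""

    configuration_name: str      # CONFIGURATION_NAME: unique identifier
    attribute_create_type: str   # ATTRIBUTE_CREATE_TYPE: 'Discover' or 'Schema'
    virtual_warehouse: str       # compute resource used for the scan
    scan_record_limit: int = 0   # SCAN_RECORD_LIMIT: 0 means unlimited

    def __post_init__(self) -> None:
        if not self.configuration_name.strip():
            raise ValueError("CONFIGURATION_NAME must not be empty")
        if self.attribute_create_type not in ATTRIBUTE_CREATE_TYPES:
            raise ValueError(
                f"ATTRIBUTE_CREATE_TYPE must be one of {sorted(ATTRIBUTE_CREATE_TYPES)}"
            )
        if self.scan_record_limit < 0:
            raise ValueError("SCAN_RECORD_LIMIT must be 0 (unlimited) or a positive number")

    @property
    def is_unlimited(self) -> bool:
        """True when the scan is not capped at a record count."""
        return self.scan_record_limit == 0


# Prototyping: schema-based attributes, capped at 1000 records (example values).
prototype = ScanConfigSketch("orders_prototype", "Schema", "SCAN_WH_XS", 1000)

# Production: discovered attributes, no record limit (example values).
production = ScanConfigSketch("orders_full", "Discover", "SCAN_WH_M")
```

Validating the values up front mirrors the quick reference: a misspelled create type or a negative record limit is rejected before any scan would be requested.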

Advanced:

  • Number of Threads - Parallel processing; defaults to the warehouse maximum and applies to semi-structured data only

  • PROCEDURE_INSTANCE_COUNT - Split scans across multiple calls (60-minute timeout)

  • MONITOR_CRON_SCHEDULE - Automated scanning with cron expressions
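One way to reason about PROCEDURE_INSTANCE_COUNT is to estimate how many calls a scan needs so that each call's share of the work fits within the 60-minute timeout. The Python sketch below is a back-of-the-envelope estimate, not DataPancake's own logic: it assumes the timeout applies to each call, that work divides evenly across calls, and that you have a rough estimate of total scan time. The function name and the `safety_margin` parameter are hypothetical.

```python
import math

TIMEOUT_MINUTES = 60  # timeout noted in the quick reference above


def min_instance_count(estimated_scan_minutes: float, safety_margin: float = 0.8) -> int:
    """Estimate the smallest number of calls whose equal shares fit the timeout.

    safety_margin is the fraction of the timeout each call is allowed to use
    (an assumption made for this sketch, not a DataPancake setting).
    """
    if estimated_scan_minutes <= 0:
        raise ValueError("estimated_scan_minutes must be positive")
    if not 0 < safety_margin <= 1:
        raise ValueError("safety_margin must be in (0, 1]")
    budget_per_call = TIMEOUT_MINUTES * safety_margin
    return max(1, math.ceil(estimated_scan_minutes / budget_per_call))


print(min_instance_count(30))   # 1: a 30-minute scan fits in a single call
print(min_instance_count(150))  # 4: 150 minutes split into 48-minute budgets
```

In practice the split may not be even, so treat the result as a lower bound and adjust it after observing real scan durations.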

See Scan Processing for scanning details. See Warehouses for warehouse selection.
