Baseline Scan Settings

Performance calibration metrics for estimating scan duration and progress.

Overview

Baseline Scan Settings store performance metrics from a previous scan and use them to estimate the duration of future scans and track their progress. The settings are populated automatically after the first successful scan of at least 10,000 records on an X-Small warehouse (or another specified warehouse). Scan configurations then use them to calculate estimated completion times and progress percentages.


Base Records Processed Per Minute

The number of records processed per minute during the baseline scan.

Purpose

  • Performance Metric - Measures scanning throughput

  • Estimation - Used to estimate future scan durations

  • Progress Tracking - Calculates scan progress percentage

  • Calibration - Baseline for performance comparisons

How It's Set

Automatic Population:

  • Set after first successful scan with 10,000+ records

  • Calculated from actual scan performance

  • Uses X-Small warehouse as baseline

  • Stored in data source configuration

Calculation:

  • Total records scanned / scan duration in minutes

  • Example: 10,000 records in 2 minutes = 5,000 records/minute
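The calculation above can be sketched in a few lines of Python. This is an illustration only: the function name `records_per_minute` and the choice to take the duration in seconds are assumptions, not part of the product.

```python
def records_per_minute(total_records: int, duration_seconds: float) -> float:
    """Throughput = total records scanned / scan duration in minutes.

    Hypothetical helper; the duration is taken in seconds and converted.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return total_records / (duration_seconds / 60.0)


# The example from the text: 10,000 records in 2 minutes.
print(records_per_minute(10_000, 120))  # 5000.0
```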

Usage

Scan Estimation:

  • Used by scan configurations to estimate duration

  • Formula: estimated_minutes = record_limit / records_per_minute

  • Provides users with expected scan time

Progress Tracking:

  • Calculates percentage complete during scans

  • Formula: progress = (records_processed / total_records) * 100

  • Updates in real-time during scan execution
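As a rough sketch, the two formulas above translate directly into code. The function names are hypothetical, and clamping progress to 100% and returning 0 for an empty scan are assumptions made here for safety, not documented behavior.

```python
def estimated_minutes(record_limit: int, records_per_minute: float) -> float:
    """estimated_minutes = record_limit / records_per_minute"""
    if records_per_minute <= 0:
        raise ValueError("records_per_minute must be positive")
    return record_limit / records_per_minute


def progress_percent(records_processed: int, total_records: int) -> float:
    """progress = (records_processed / total_records) * 100"""
    if total_records <= 0:
        return 0.0  # assumption: nothing to scan reports 0% rather than failing
    # assumption: cap at 100 in case more records are processed than expected
    return min(100.0, records_processed / total_records * 100)


# A 50,000-record scan against a baseline of 5,000 records/minute.
print(estimated_minutes(50_000, 5_000))   # 10.0
print(progress_percent(12_500, 50_000))   # 25.0
```

With a baseline of 5,000 records/minute, a scan limited to 50,000 records is estimated at 10 minutes, and 12,500 processed records corresponds to 25% progress.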


Base Thread Count

The number of threads used during the baseline scan.

Purpose

  • Resource Metric - Records compute resources used

  • Calibration - Baseline for thread count comparisons

  • Estimation - Used in performance calculations

  • Reference - Helps configure future scans

How It's Set

Automatic Population:

  • Set after first successful scan with 10,000+ records

  • Uses actual thread count from baseline scan

  • Typically 8 threads for X-Small warehouse

  • Stored in data source configuration

Default Value:

  • Defaults to 8 threads if not set

  • Can be manually adjusted if needed

  • Should match warehouse thread capacity

Usage

Performance Reference:

  • Reference for configuring scan thread counts

  • Helps determine optimal thread configuration

  • Used in performance estimation calculations


Setting Baseline Values

Automatic Setting

Requirements:

  • Scan with at least 10,000 records

  • Use X-Small warehouse (or specified warehouse)

  • Successful scan completion

When these conditions are met, the system calculates and stores the values automatically.

  1. Perform a baseline scan - Run a scan of 10,000+ records using the specified warehouse.

  2. System calculates records per minute - The system computes records per minute from the scan duration and total records scanned.

  3. System records thread count - The system records the actual thread count used during the baseline scan.

  4. Values stored in data source configuration - Calculated values (records/minute and thread count) are saved to the data source configuration and become available for future scan estimations.
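The automatic flow can be pictured with the following minimal sketch. Everything in it is hypothetical: `BaselineSettings`, `maybe_set_baseline`, and the field names are illustrative stand-ins, and the rule that an existing non-zero baseline is left untouched is an assumption drawn from the "first successful scan" wording.

```python
from dataclasses import dataclass

MIN_BASELINE_RECORDS = 10_000  # threshold stated in the requirements


@dataclass
class BaselineSettings:
    """Hypothetical stand-in for the values kept in the data source configuration."""
    base_records_per_minute: float = 0.0
    base_thread_count: int = 0


def maybe_set_baseline(settings: BaselineSettings, total_records: int,
                       duration_minutes: float, thread_count: int,
                       succeeded: bool) -> bool:
    """Store a baseline if the scan qualifies; return True when values were set."""
    already_set = settings.base_records_per_minute > 0  # assumption: only fill when unset
    if (already_set or not succeeded
            or total_records < MIN_BASELINE_RECORDS or duration_minutes <= 0):
        return False
    settings.base_records_per_minute = total_records / duration_minutes
    settings.base_thread_count = thread_count
    return True


settings = BaselineSettings()
maybe_set_baseline(settings, 10_000, 2.0, 8, succeeded=True)
print(settings)
```

In this sketch a scan below the threshold or a failed scan leaves the settings unchanged, so the next qualifying scan can still establish the baseline.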

Manual Setting

When to Manually Set:

  • Baseline values are inaccurate

  • Performance has changed significantly

  • Want to use different baseline

  • Testing different configurations

How to Set:

  • Edit data source configuration

  • Enter values in Baseline Scan Settings

  • Save data source

  • Values used for future estimations


Resetting Baseline Settings

Reset to Zero

Purpose:

  • Clear existing baseline values

  • Force recalculation on next scan

  • Fall back to default values until a new baseline is set

  • Start fresh baseline calibration

When to Reset:

  • Baseline values are inaccurate

  • Performance characteristics changed

  • Warehouse configuration changed

  • Want to recalibrate

How to Reset:

  • Click "Reset Baseline Scan Settings" button

  • Values set to 0

  • Next scan with 10,000+ records will set new baseline

  • System recalculates from new scan

Default Values

When Baseline is Zero:

  • System uses default values for estimation

  • Default: 10,000 records/minute

  • Default: 8 threads

  • Estimates may be less accurate

After Reset:

  • Next substantial scan sets new baseline

  • System uses actual performance metrics

  • More accurate estimations going forward
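The fallback described above might look like the sketch below. The defaults (10,000 records/minute and 8 threads) come from this page; the function name `effective_baseline` and the treatment of any non-positive value as "unset" are assumptions.

```python
DEFAULT_RECORDS_PER_MINUTE = 10_000  # default stated above
DEFAULT_THREAD_COUNT = 8             # default stated above


def effective_baseline(base_records_per_minute: float,
                       base_thread_count: int) -> tuple[float, int]:
    """Return the values to use for estimation, substituting defaults for 0."""
    rpm = (base_records_per_minute if base_records_per_minute > 0
           else DEFAULT_RECORDS_PER_MINUTE)
    threads = base_thread_count if base_thread_count > 0 else DEFAULT_THREAD_COUNT
    return rpm, threads


print(effective_baseline(0, 0))      # (10000, 8): baseline was reset
print(effective_baseline(5_000, 4))  # (5000, 4): measured baseline is used
```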


Impact on Scan Configurations

Estimation Accuracy

With Baseline:

  • Estimates based on actual performance

  • More accurate duration predictions

  • Better progress tracking

  • Calibrated to your data and warehouse

Without Baseline:

  • Uses default values

  • Less accurate estimations

  • May over/under estimate duration

  • Generic performance assumptions

Progress Tracking

With Baseline:

  • Real-time progress based on actual throughput

  • Accurate percentage calculations

  • Better user experience

  • Reliable progress indicators

Without Baseline:

  • Progress based on defaults

  • May be less accurate

  • Still functional but less precise



Troubleshooting

Inaccurate Estimates — Symptoms & Solutions

Symptoms:

  • Scan duration estimates differ substantially from actual scan times

  • Progress tracking seems incorrect

  • Baseline values seem wrong

Solutions:

  • Reset baseline settings

  • Perform new baseline scan

  • Verify warehouse configuration

  • Check whether data complexity has changed

Missing Baseline — Symptoms & Solutions

Symptoms:

  • Baseline values are 0

  • Estimates use defaults

  • Less accurate predictions

Solutions:

  • Perform scan with 10,000+ records

  • Let system set baseline automatically

  • Verify scan completed successfully

  • Check baseline values were set

Performance Changes — Symptoms & Solutions

Symptoms:

  • Baseline no longer accurate

  • Recent scans much faster/slower

  • Estimates consistently wrong

Solutions:

  • Reset baseline settings

  • Perform new baseline scan

  • Review warehouse configuration

  • Check for data structure changes


Summary

Core Settings:

  • Base Records Processed Per Minute (auto-set or manual)

  • Base Thread Count (auto-set or manual)

Automatic Setting:

  • Set after scan with 10,000+ records

  • Uses actual scan performance

  • Calibrated to your data and warehouse

Usage:

  • Scan duration estimation

  • Progress tracking

  • Performance reference

Note: Let the system set the baseline automatically after the first substantial scan (10,000+ records). Reset it only when the baseline is clearly inaccurate.

For scan configuration, see the Scan Configurations documentation. For a data source overview, see Overview.
