Basic Configuration Settings

Core settings for data source identification, status, and metadata.

Overview

Basic configuration settings define the fundamental properties of a data source: its name, type, operational status, and organizational tags. Name, type, and status are required for every data source, and tags are optional. Together, these settings determine how the data source appears and functions within DataPancake.


Data Source Name

A unique, descriptive identifier for the data source within your DataPancake instance.

Requirements

  • Required Field — Must be provided to save the data source

  • Unique — Cannot duplicate an existing data source name (case-insensitive)

  • Descriptive — Should clearly identify the data source's purpose and content

Examples:

  • PRODUCTION.MEDICAL_DEVICE.UDI_JSON

  • ANALYTICS.CUSTOMER_EVENTS

  • STAGING.API_RESPONSES_JSON

  • WAREHOUSE.ORDERS_TABLE

Validation:

  • Rejects a name that is empty or contains only whitespace

  • Compares the name against existing data source names, ignoring case

  • Returns an error if a data source with the same name already exists
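The validation rules above can be sketched in a few lines of Python. This is an illustration only, not DataPancake's implementation: the function name `validate_name`, the error messages, and the trimming of surrounding whitespace are all assumptions made for the example.

```python
from typing import Iterable


def validate_name(name: str, existing_names: Iterable[str]) -> str:
    """Return the accepted name, or raise ValueError if a rule is violated."""
    # Rule: the name cannot be empty or whitespace only.
    if not name or not name.strip():
        raise ValueError("Data source name is required.")

    # Assumption: surrounding whitespace is not significant, so trim it.
    candidate = name.strip()

    # Rule: uniqueness is checked with a case-insensitive comparison.
    taken = {existing.strip().casefold() for existing in existing_names}
    if candidate.casefold() in taken:
        raise ValueError(f"A data source named '{candidate}' already exists.")

    return candidate


existing = ["PRODUCTION.MEDICAL_DEVICE.UDI_JSON", "WAREHOUSE.ORDERS_TABLE"]
print(validate_name("ANALYTICS.CUSTOMER_EVENTS", existing))
```

With the same `existing` list, a call such as `validate_name("warehouse.orders_table", existing)` raises `ValueError`, because the comparison ignores case.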


Data Source Type

Classification of the data source as either Semi-Structured or Structured.

Options

Semi-Structured (Recommended for JSON/VARIANT data)

  • For semi-structured data stored in VARIANT or String columns

  • Enables full schema discovery and materialization

  • Supports nested objects and arrays

  • Requires format type and column specification

  • Enables dynamic table generation

Structured

  • For standard relational database objects

  • Basic schema discovery only

  • No materialization features

  • No format or column specification needed

  • Limited to flat relational structures

Selection Impact

The data source type determines:

  • Available Configuration Options — Semi-structured sources have more settings

  • Materialization Capabilities — Only semi-structured sources can generate dynamic tables

  • Schema Discovery Depth — Semi-structured sources discover nested structures

  • Required Fields — Semi-structured requires format type and column name

Best Practice: Choose Semi-Structured for data held in VARIANT or String columns so that nested schema discovery and dynamic table generation are available.

For detailed information on data source types, see Data Source Types.


Status

Operational state of the data source within DataPancake.

Options

Active (Default)

  • Data source is operational and available for scanning

  • Appears in data source lists and dropdowns

  • Can be used in scan configurations

  • Scans can be executed against this data source

  • Use for all production and active development data sources

Inactive

  • Data source is preserved but not available for scanning

  • Hidden from most data source selection lists

  • Configuration is retained for future use

  • Useful for temporarily disabling data sources

  • Can be reactivated by changing status back to Active

Deleted

  • Data source is marked for deletion

  • Configuration may be retained for audit purposes

  • Typically hidden from user interfaces

  • Use when data source is no longer needed

  • May be permanently removed in future cleanup operations

Impact on Operations:

  • Only Active data sources appear in scan configuration dropdowns

  • Only Active data sources can be scanned

  • Inactive data sources retain their configuration and scan history; Deleted data sources retain them unless a later cleanup operation permanently removes them

  • Status changes are logged for audit purposes
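The effect of status on scan availability can be modeled with a small filter. The sketch below is hypothetical: the `Status` enum, the `DataSource` class, and the `scannable` function are names invented for illustration and do not describe DataPancake's internals.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    ACTIVE = "Active"
    INACTIVE = "Inactive"
    DELETED = "Deleted"


@dataclass
class DataSource:
    name: str
    status: Status = Status.ACTIVE  # Active is the documented default


def scannable(sources):
    """Keep only Active data sources, the only ones that can be scanned."""
    return [source for source in sources if source.status is Status.ACTIVE]


sources = [
    DataSource("ANALYTICS.CUSTOMER_EVENTS"),
    DataSource("STAGING.API_RESPONSES_JSON", Status.INACTIVE),
    DataSource("WAREHOUSE.ORDERS_TABLE", Status.DELETED),
]
print([source.name for source in scannable(sources)])  # ['ANALYTICS.CUSTOMER_EVENTS']
```

Inactive and Deleted entries stay in the `sources` list, which mirrors how their configuration is kept even though they cannot be scanned.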


Data Source Tags

Optional metadata tags for organizing and categorizing data sources.

Purpose

Tags provide flexible organization and filtering capabilities:

  • Categorization — Group related data sources

  • Environment Identification — Tag as PROD, DEV, TEST, etc.

  • Department/Team — Identify data source ownership

  • Data Classification — Mark sensitive or public data

  • Custom Organization — Any organizational scheme

Usage

  • Optional Field — Not required to save data source

  • Free Text — Any text value accepted

  • No Validation — System does not enforce tag format

  • Search/Filter — Can be used for filtering in views and reports

Examples:

  • PROD, CUSTOMER_DATA, PII

  • DEV; TEST_DATA; API_RESPONSES

  • ANALYTICS_TEAM, PUBLIC, EVENTS
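Because tags are free text, any structure has to be applied by whoever reads them. The sketch below shows one way a team might split a tag string for its own filtering, accepting both commas and semicolons as in the examples above. The helper names `split_tags` and `has_tag`, and the case-insensitive match, are assumptions for illustration and not documented DataPancake behavior.

```python
import re


def split_tags(raw: str) -> list:
    """Split a free-text tag string on commas or semicolons, dropping blanks."""
    return [part.strip() for part in re.split(r"[,;]", raw) if part.strip()]


def has_tag(raw: str, tag: str) -> bool:
    """Check for a tag, ignoring case (an assumption, not a documented rule)."""
    wanted = tag.strip().casefold()
    return any(part.casefold() == wanted for part in split_tags(raw))


print(split_tags("DEV; TEST_DATA; API_RESPONSES"))  # ['DEV', 'TEST_DATA', 'API_RESPONSES']
print(has_tag("PROD, CUSTOMER_DATA, PII", "pii"))   # True
```

Settling on a single delimiter across a team keeps this kind of filtering predictable, since the system itself does not enforce a tag format.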


Connection Status

Read-only indicator showing whether DataPancake can successfully connect to the source object.

Status Values

Connected

  • DataPancake has successfully validated the connection to the source object

  • Required privileges are granted

  • Data source is ready for scanning

  • Connection is validated on save or during periodic checks

Connection Failed

  • DataPancake cannot access the source object

  • Required privileges may be missing

  • The object may not exist or may be inaccessible

  • The error message gives the specific reason for the failure

Connection Validation

Connection status is validated when:

  • The data source is first created

  • The data source settings are saved

  • Periodic background validation runs (if enabled)

Troubleshooting Connection Failures

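When a connection fails, DataPancake provides a specific error message and the SQL statements needed to resolve privilege issues.

For Standard Databases: grant USAGE on the database and schema, and SELECT on the source object.

For Shared Databases: grant IMPORTED PRIVILEGES on the shared database.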

Common Issues:

  • Missing USAGE privilege on database or schema

  • Missing SELECT privilege on object

  • Object does not exist or was renamed

  • Object is in a shared database requiring IMPORTED PRIVILEGES
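The privileges listed above map onto a handful of GRANT statements. The Python sketch below only builds the statement text. It assumes the source lives in Snowflake (as the VARIANT and IMPORTED PRIVILEGES terminology suggests), that DataPancake runs as an application named `DATAPANCAKE`, and that privileges are granted to that application. The application name, the function name `grant_statements`, and the exact statement shape are assumptions; the statements DataPancake itself displays on a failed connection are authoritative.

```python
def grant_statements(database, schema, object_name,
                     app_name="DATAPANCAKE",  # hypothetical application name
                     object_kind="TABLE",
                     shared=False):
    """Build GRANT statements covering the common privilege issues."""
    if shared:
        # Shared databases: privileges are granted at the database level.
        return [
            f"GRANT IMPORTED PRIVILEGES ON DATABASE {database} "
            f"TO APPLICATION {app_name};"
        ]
    # Standard databases: USAGE on the containers, SELECT on the object.
    return [
        f"GRANT USAGE ON DATABASE {database} TO APPLICATION {app_name};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO APPLICATION {app_name};",
        f"GRANT SELECT ON {object_kind} {database}.{schema}.{object_name} "
        f"TO APPLICATION {app_name};",
    ]


for statement in grant_statements("PRODUCTION", "MEDICAL_DEVICE", "UDI_JSON"):
    print(statement)
```

Identifiers are inserted without quoting, so names that need double quotes would have to be quoted by the caller. Pass `object_kind="VIEW"` for a view, or `shared=True` for an object in a shared database.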

For detailed information on privileges, see Adding Data Sources.


Field Requirements Summary

Required Fields

All data sources require:

  • Data Source Name — Unique identifier

  • Data Source Type — Semi-Structured or Structured

  • Status — Active, Inactive, or Deleted

  • Object Type — Table, View, External Table, etc.

  • Database — Source database name

  • Schema — Source schema name

  • Object Name — Source object name

Additional Required for Semi-Structured

  • Format Type — JSON, Avro, Parquet, ORC, or XML

  • Column Data Type — VARIANT or String

  • Column Name — Name of VARIANT/String column

Optional Fields

  • Data Source Tags — Organizational metadata

  • Schema Sample — Sample JSON/XML for faster discovery (semi-structured)
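The summary above amounts to a base set of required fields plus three more for semi-structured sources. A minimal sketch of that check follows. The dictionary keys (`name`, `object_type`, `format_type`, and so on) and the function `missing_fields` are hypothetical names chosen for the example, not DataPancake field identifiers.

```python
BASE_REQUIRED = [
    "name", "type", "status", "object_type",
    "database", "schema", "object_name",
]
SEMI_STRUCTURED_REQUIRED = ["format_type", "column_data_type", "column_name"]


def missing_fields(config: dict) -> list:
    """Return the required fields that are absent or blank in a configuration."""
    required = list(BASE_REQUIRED)
    # Semi-structured sources also need format and column details.
    if config.get("type") == "Semi-Structured":
        required += SEMI_STRUCTURED_REQUIRED
    return [
        field for field in required
        if not str(config.get(field) or "").strip()
    ]


config = {
    "name": "STAGING.API_RESPONSES_JSON",
    "type": "Semi-Structured",
    "status": "Active",
    "object_type": "Table",
    "database": "STAGING",
    "schema": "API",
    "object_name": "RESPONSES",
    "format_type": "JSON",
}
print(missing_fields(config))  # ['column_data_type', 'column_name']
```

Optional fields such as tags and the schema sample are deliberately left out of both lists, so their absence never blocks a save in this sketch.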


For information on adding data sources, see Adding Data Sources. For product tier configuration, see Product Tiers & Features.
