Basic Configuration Settings
Core settings for data source identification, status, and metadata.
Overview
Basic configuration settings define the fundamental properties of a data source: its name, type, operational status, and organizational tags. Name, type, and status are required for all data sources, while tags are optional. Together they determine how the data source appears and functions within DataPancake.
Data Source Name
A unique, descriptive identifier for the data source within your DataPancake instance.
Requirements
Required Field — Must be provided to save the data source
Unique — Cannot duplicate an existing data source name (case-insensitive)
Descriptive — Should clearly identify the data source's purpose and content
Examples:
PRODUCTION.MEDICAL_DEVICE.UDI_JSON
ANALYTICS.CUSTOMER_EVENTS
STAGING.API_RESPONSES_JSON
WAREHOUSE.ORDERS_TABLE
Validation:
Checks for uniqueness (case-insensitive comparison)
Returns error if duplicate name exists
Name cannot be empty or whitespace only
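The validation rules above can be sketched as a small check. This is a minimal illustration, not DataPancake's actual implementation: the function name `validate_name`, the `existing_names` argument, and the decision to trim surrounding whitespace before comparing are all assumptions made for the example.

```python
def validate_name(name: str, existing_names: list[str]) -> str:
    """Return the trimmed name, or raise ValueError if a rule is broken.

    Hypothetical helper: mirrors the documented rules (non-empty, unique,
    case-insensitive comparison). Trimming is an assumption.
    """
    # Rule: name cannot be empty or whitespace only.
    if name is None or not name.strip():
        raise ValueError("Data source name cannot be empty or whitespace only")

    candidate = name.strip()

    # Rule: uniqueness is checked with a case-insensitive comparison.
    taken = {existing.strip().casefold() for existing in existing_names}
    if candidate.casefold() in taken:
        raise ValueError(f"A data source named {candidate!r} already exists")

    return candidate
```

With this sketch, `validate_name("analytics.customer_events", ["ANALYTICS.CUSTOMER_EVENTS"])` raises an error because the two names differ only in case.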
Data Source Type
Classification of the data source as either Semi-Structured or Structured.
Options
Semi-Structured (Recommended for JSON/VARIANT data)
For data stored in VARIANT or String columns
Enables full schema discovery and materialization
Supports nested objects and arrays
Requires format type and column specification
Enables dynamic table generation
Structured
For standard relational database objects
Basic schema discovery only
No materialization features
No format or column specification needed
Limited to flat relational structures
Selection Impact
The data source type determines:
Available Configuration Options — Semi-structured sources have more settings
Materialization Capabilities — Only semi-structured sources can generate dynamic tables
Schema Discovery Depth — Semi-structured sources discover nested structures
Required Fields — Semi-structured requires format type and column name
Best Practice: Choose Semi-Structured for VARIANT column data to enable full DataPancake capabilities.
For detailed information on data source types, see Data Source Types.
Status
Operational state of the data source within DataPancake.
Options
Active (Default)
Data source is operational and available for scanning
Appears in data source lists and dropdowns
Can be used in scan configurations
Scans can be executed against this data source
Use for all production and active development data sources
Inactive
Data source is preserved but not available for scanning
Hidden from most data source selection lists
Configuration is retained for future use
Useful for temporarily disabling data sources
Can be reactivated by changing status back to Active
Deleted
Data source is marked for deletion
Configuration may be retained for audit purposes
Typically hidden from user interfaces
Use when data source is no longer needed
May be permanently removed in future cleanup operations
Impact on Operations:
Only Active data sources appear in scan configuration dropdowns
Only Active data sources can be scanned
Inactive/Deleted data sources retain their configuration and scan history
Status changes are logged for audit purposes
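The rule that only Active data sources can be scanned can be pictured as a simple filter. The sketch below is illustrative only: the `Status` enum, the `DataSource` class, and the `scannable` function are hypothetical names and do not come from DataPancake.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    ACTIVE = "Active"
    INACTIVE = "Inactive"
    DELETED = "Deleted"


@dataclass
class DataSource:
    name: str
    status: Status = Status.ACTIVE  # Active is the documented default


def scannable(sources: list[DataSource]) -> list[DataSource]:
    """Keep only sources that may appear in scan configuration dropdowns."""
    # Inactive and Deleted sources keep their configuration but are skipped.
    return [source for source in sources if source.status is Status.ACTIVE]


sources = [
    DataSource("ANALYTICS.CUSTOMER_EVENTS"),
    DataSource("STAGING.API_RESPONSES_JSON", Status.INACTIVE),
    DataSource("WAREHOUSE.ORDERS_TABLE", Status.DELETED),
]
active_names = [source.name for source in scannable(sources)]
```

Setting an Inactive source's status back to `Status.ACTIVE` makes it eligible again, which matches the reactivation behavior described above.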
Data Source Tags
Optional metadata tags for organizing and categorizing data sources.
Purpose
Tags provide flexible organization and filtering capabilities:
Categorization — Group related data sources
Environment Identification — Tag as PROD, DEV, TEST, etc.
Department/Team — Identify data source ownership
Data Classification — Mark sensitive or public data
Custom Organization — Any organizational scheme
Usage
Optional Field — Not required to save data source
Free Text — Any text value accepted
No Validation — System does not enforce tag format
Search/Filter — Can be used for filtering in views and reports
Examples:
PROD, CUSTOMER_DATA, PII
DEV; TEST_DATA; API_RESPONSES
ANALYTICS_TEAM, PUBLIC, EVENTS
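Because tags are free text with no enforced format, any filtering logic built on top of them has to pick its own convention. The sketch below assumes tags are separated by commas or semicolons, as in the examples above, and matches them case-insensitively. Both choices, and the helper names `split_tags` and `has_tag`, are assumptions for illustration rather than DataPancake behavior.

```python
import re


def split_tags(raw: str) -> list[str]:
    """Split a free-text tag string on commas or semicolons (assumed separators)."""
    return [part.strip() for part in re.split(r"[,;]", raw) if part.strip()]


def has_tag(raw: str, wanted: str) -> bool:
    """Return True if the tag string contains the wanted tag, ignoring case."""
    return wanted.casefold() in {tag.casefold() for tag in split_tags(raw)}
```

For example, `has_tag("PROD, CUSTOMER_DATA, PII", "pii")` is true under these assumptions, while `has_tag("DEV; TEST_DATA; API_RESPONSES", "PROD")` is false.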
Connection Status
Read-only indicator showing whether DataPancake can successfully connect to the source object.
Status Values
Connected
DataPancake has successfully validated connection to source object
Required privileges are granted
Data source is ready for scanning
Connection is validated on save or during periodic checks
Connection Failed
DataPancake cannot access the source object
Missing required privileges
Object may not exist or be inaccessible
Error message provides specific failure reason
Connection Validation
Connection status is validated when:
Data source is first created
Data source settings are saved
Periodic background validation runs (if enabled)
Field Requirements Summary
Required Fields
All data sources require:
Data Source Name — Unique identifier
Data Source Type — Semi-Structured or Structured
Status — Active, Inactive, or Deleted
Object Type — Table, View, External Table, etc.
Database — Source database name
Schema — Source schema name
Object Name — Source object name
Additional Required for Semi-Structured
Format Type — JSON, Avro, Parquet, ORC, or XML
Column Data Type — VARIANT or String
Column Name — Name of VARIANT/String column
Optional Fields
Data Source Tags — Organizational metadata
Schema Sample — Sample JSON/XML for faster discovery (semi-structured)
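The field requirements above can be summarized as a check that depends on the data source type. This is a sketch under stated assumptions: the dictionary keys such as `object_type` and `column_name` are hypothetical names chosen for the example, and DataPancake's own field identifiers may differ.

```python
# Hypothetical keys standing in for the fields listed above.
BASE_REQUIRED = (
    "name", "type", "status", "object_type",
    "database", "schema", "object_name",
)
SEMI_STRUCTURED_REQUIRED = ("format_type", "column_data_type", "column_name")


def missing_fields(config: dict) -> list[str]:
    """Return the required keys that are absent or blank in config."""
    required = list(BASE_REQUIRED)
    # Semi-structured sources also need format and column details.
    if config.get("type") == "Semi-Structured":
        required.extend(SEMI_STRUCTURED_REQUIRED)
    return [key for key in required if not str(config.get(key) or "").strip()]


example = {
    "name": "STAGING.API_RESPONSES_JSON",
    "type": "Semi-Structured",
    "status": "Active",
    "object_type": "Table",
    "database": "STAGING",
    "schema": "API",
    "object_name": "RESPONSES",
    "format_type": "JSON",
    "column_data_type": "VARIANT",
}
missing = missing_fields(example)  # column_name has not been set
```

Optional fields such as tags and the schema sample are deliberately left out of both tuples, so omitting them never produces an error.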
For information on adding data sources, see Adding Data Sources. For product tier configuration, see Product Tiers & Features.