# Basic Configuration Settings

Core settings for data source identification, status, and metadata.

## Overview

Basic configuration settings define the fundamental properties of a data source: its name, type, operational status, and organizational tags. Name, type, and status are required for every data source, while tags are optional. Together, these settings determine how the data source appears and functions within DataPancake.

***

## Data Source Name

A unique, descriptive identifier for the data source within your DataPancake instance.

### Requirements

* Required Field — Must be provided to save the data source
* Unique — Cannot duplicate an existing data source name (case-insensitive)
* Descriptive — Should clearly identify the data source's purpose and content

**Examples:**

* `PRODUCTION.MEDICAL_DEVICE.UDI_JSON`
* `ANALYTICS.CUSTOMER_EVENTS`
* `STAGING.API_RESPONSES_JSON`
* `WAREHOUSE.ORDERS_TABLE`

**Validation:**

* Checks for uniqueness (case-insensitive comparison)
* Returns an error if a duplicate name exists
* Name cannot be empty or whitespace only
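
The validation rules above can be sketched in a few lines of Python. This is an illustration only, not DataPancake's implementation: the function name `validate_name`, the error wording, and the choice to trim surrounding whitespace and compare with `casefold()` are all assumptions.

```python
from typing import Iterable, Optional


def validate_name(candidate: str, existing_names: Iterable[str]) -> Optional[str]:
    """Return an error message, or None when the name is acceptable.

    Hypothetical helper: mirrors the documented rules, not the product's code.
    """
    # Rule: the name cannot be empty or whitespace only.
    if not candidate or not candidate.strip():
        return "Data source name cannot be empty or whitespace only."

    # Rule: uniqueness is checked case-insensitively.
    # Assumption: surrounding whitespace is ignored before comparing.
    normalized = candidate.strip().casefold()
    taken = {name.strip().casefold() for name in existing_names}
    if normalized in taken:
        return "A data source with this name already exists."

    return None


existing = ["ANALYTICS.CUSTOMER_EVENTS", "WAREHOUSE.ORDERS_TABLE"]
print(validate_name("analytics.customer_events", existing))  # duplicate (differs only by case)
print(validate_name("STAGING.API_RESPONSES_JSON", existing))  # None
```

Normalizing both sides before comparing keeps the check symmetric, so `Orders` and `ORDERS` collide regardless of which was saved first.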

***

## Data Source Type

Classification of the data source as either Semi-Structured or Structured.

### Options

Semi-Structured (Recommended for JSON/VARIANT data)

* For data stored in VARIANT columns
* Enables full schema discovery and materialization
* Supports nested objects and arrays
* Requires format type and column specification
* Enables dynamic table generation

Structured

* For standard relational database objects
* Basic schema discovery only
* No materialization features
* No format or column specification needed
* Limited to flat relational structures

### Selection Impact

The data source type determines:

* Available Configuration Options — Semi-structured sources have more settings
* Materialization Capabilities — Only semi-structured sources can generate dynamic tables
* Schema Discovery Depth — Semi-structured sources discover nested structures
* Required Fields — Semi-structured requires format type and column name

Best Practice: Choose Semi-Structured for VARIANT column data to enable full DataPancake capabilities.
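
One way to picture the selection impact is as a lookup from data source type to capabilities. The sketch below only restates the list above in code; the class `TypeCapabilities`, its field names, and the `CAPABILITIES` table are hypothetical and do not correspond to a DataPancake API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeCapabilities:
    """Hypothetical summary of what each data source type enables."""
    nested_discovery: bool            # discovers nested objects and arrays
    materialization: bool             # can generate dynamic tables
    requires_format_and_column: bool  # format type and column name are mandatory


# Values taken from the Options and Selection Impact lists in this section.
CAPABILITIES = {
    "Semi-Structured": TypeCapabilities(True, True, True),
    "Structured": TypeCapabilities(False, False, False),
}


def can_generate_dynamic_tables(source_type: str) -> bool:
    """True only for types that support materialization."""
    return CAPABILITIES[source_type].materialization


print(can_generate_dynamic_tables("Semi-Structured"))  # True
print(can_generate_dynamic_tables("Structured"))       # False
```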

For detailed information on data source types, see the Data Source Types page.

***

## Status

Operational state of the data source within DataPancake.

### Options

Active (Default)

* Data source is operational and available for scanning
* Appears in data source lists and dropdowns
* Can be used in scan configurations
* Scans can be executed against this data source
* Use for all production and active development data sources

Inactive

* Data source is preserved but not available for scanning
* Hidden from most data source selection lists
* Configuration is retained for future use
* Useful for temporarily disabling data sources
* Can be reactivated by changing status back to Active

Deleted

* Data source is marked for deletion
* Configuration may be retained for audit purposes
* Typically hidden from user interfaces
* Use when data source is no longer needed
* May be permanently removed in future cleanup operations

**Impact on Operations:**

* Only Active data sources appear in scan configuration dropdowns
* Only Active data sources can be scanned
* Inactive data sources retain their configuration and scan history; Deleted data sources may retain them until a cleanup operation removes them
* Status changes are logged for audit purposes
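
The effect of status on scanning can be illustrated with a small filter. This is a sketch under assumptions: the `Status` enum and the `scannable` function are hypothetical names chosen for the example, and the data source names are taken from the examples earlier on this page.

```python
from enum import Enum
from typing import Dict, List


class Status(Enum):
    ACTIVE = "Active"
    INACTIVE = "Inactive"
    DELETED = "Deleted"


def scannable(sources: Dict[str, Status]) -> List[str]:
    """Return the names that would be offered for scanning.

    Only Active data sources qualify; Inactive and Deleted ones are skipped.
    """
    return [name for name, status in sources.items() if status is Status.ACTIVE]


sources = {
    "ANALYTICS.CUSTOMER_EVENTS": Status.ACTIVE,
    "STAGING.API_RESPONSES_JSON": Status.INACTIVE,
    "WAREHOUSE.ORDERS_TABLE": Status.DELETED,
}
print(scannable(sources))  # ['ANALYTICS.CUSTOMER_EVENTS']
```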

***

## Data Source Tags

Optional metadata tags for organizing and categorizing data sources.

### Purpose

Tags provide flexible organization and filtering capabilities:

* Categorization — Group related data sources
* Environment Identification — Tag as PROD, DEV, TEST, etc.
* Department/Team — Identify data source ownership
* Data Classification — Mark sensitive or public data
* Custom Organization — Any organizational scheme

### Usage

* Optional Field — Not required to save data source
* Free Text — Any text value accepted
* No Validation — System does not enforce tag format
* Search/Filter — Can be used for filtering in views and reports

**Examples:**

* `PROD, CUSTOMER_DATA, PII`
* `DEV; TEST_DATA; API_RESPONSES`
* `ANALYTICS_TEAM, PUBLIC, EVENTS`
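
Because tags are free text, any structure has to come from your own convention. The sketch below assumes a team has agreed to separate tags with commas or semicolons, as in the examples above, and shows how such a value could be split and matched when filtering. The functions `split_tags` and `has_tag` are hypothetical; DataPancake itself does not enforce or parse a delimiter.

```python
import re
from typing import List


def split_tags(raw: str) -> List[str]:
    """Split a free-text tag value on commas or semicolons (assumed convention)."""
    return [part.strip() for part in re.split(r"[,;]", raw) if part.strip()]


def has_tag(raw: str, wanted: str) -> bool:
    """Case-insensitive membership test against the split tags."""
    return wanted.strip().casefold() in {tag.casefold() for tag in split_tags(raw)}


print(split_tags("PROD, CUSTOMER_DATA, PII"))       # ['PROD', 'CUSTOMER_DATA', 'PII']
print(split_tags("DEV; TEST_DATA; API_RESPONSES"))  # ['DEV', 'TEST_DATA', 'API_RESPONSES']
print(has_tag("PROD, CUSTOMER_DATA, PII", "pii"))   # True
```

Picking one delimiter and sticking to it across all data sources makes filtering more predictable than mixing styles.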

***

## Connection Status

Read-only indicator showing whether DataPancake can successfully connect to the source object.

### Status Values

Connected

* DataPancake has successfully validated connection to source object
* Required privileges are granted
* Data source is ready for scanning
* Connection is validated on save and during periodic checks

Connection Failed

* DataPancake cannot access the source object
* Required privileges may be missing
* Object may not exist or may be inaccessible
* Error message provides specific failure reason

### Connection Validation

Connection status is validated when:

* Data source is first created
* Data source settings are saved
* Periodic background validation runs (if enabled)

<details>

<summary>Troubleshooting Connection Failures</summary>

When a connection fails, DataPancake provides a specific error message and the SQL statements needed to resolve privilege issues.

For Standard Databases:

```sql
GRANT USAGE ON DATABASE <database_name> TO APPLICATION PANCAKE;
GRANT USAGE ON SCHEMA <database_name>.<schema_name> TO APPLICATION PANCAKE;
GRANT REFERENCES, SELECT ON <object_type> <database_name>.<schema_name>.<object_name> TO APPLICATION PANCAKE;
```

For Shared Databases:

```sql
GRANT USAGE ON DATABASE <database_name> TO APPLICATION PANCAKE;
GRANT USAGE ON SCHEMA <database_name>.<schema_name> TO APPLICATION PANCAKE;
GRANT IMPORTED PRIVILEGES ON DATABASE <database_name> TO APPLICATION PANCAKE;
```
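
If you manage many data sources, filling in the placeholders by hand gets tedious. The following Python sketch renders the two templates above for a given object. The function `grant_statements` and its parameters are hypothetical, the names in the usage example are placeholders, and the sketch does no identifier quoting, so it assumes simple unquoted identifiers.

```python
from typing import List


def grant_statements(
    database: str,
    schema: str,
    object_name: str,
    object_type: str = "TABLE",
    shared: bool = False,
    application: str = "PANCAKE",
) -> List[str]:
    """Render the GRANT statements from the templates in this section."""
    statements = [
        f"GRANT USAGE ON DATABASE {database} TO APPLICATION {application};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO APPLICATION {application};",
    ]
    if shared:
        # Shared databases use IMPORTED PRIVILEGES instead of object-level grants.
        statements.append(
            f"GRANT IMPORTED PRIVILEGES ON DATABASE {database} TO APPLICATION {application};"
        )
    else:
        statements.append(
            f"GRANT REFERENCES, SELECT ON {object_type} "
            f"{database}.{schema}.{object_name} TO APPLICATION {application};"
        )
    return statements


# Placeholder names for illustration only.
for statement in grant_statements("PRODUCTION", "MEDICAL_DEVICE", "UDI_JSON"):
    print(statement)
```

Review the generated statements before running them, since they are built by plain string formatting.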

Common Issues:

* Missing USAGE privilege on database or schema
* Missing SELECT privilege on object
* Object does not exist or was renamed
* Object is in a shared database requiring IMPORTED PRIVILEGES

For detailed information on privileges, see [Adding Data Sources](https://docs.datapancake.com/core-concepts/data-sources/adding-data-sources).

</details>

***

## Field Requirements Summary

### Required Fields

All data sources require:

* Data Source Name — Unique identifier
* Data Source Type — Semi-Structured or Structured
* Status — Active, Inactive, or Deleted
* Object Type — Table, View, External Table, etc.
* Database — Source database name
* Schema — Source schema name
* Object Name — Source object name

### Additional Required for Semi-Structured

* Format Type — JSON, Avro, Parquet, ORC, or XML
* Column Data Type — VARIANT or String
* Column Name — Name of VARIANT/String column

### Optional Fields

* Data Source Tags — Organizational metadata
* Schema Sample — Sample JSON/XML for faster discovery (semi-structured)
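
The summary above can be expressed as a small completeness check. This is a sketch, not DataPancake's validation logic: the dictionary keys (`name`, `source_type`, and so on) and the function `missing_fields` are hypothetical names chosen to mirror the field labels on this page.

```python
from typing import Dict, List

# Hypothetical keys mirroring the field labels in the summary above.
BASE_REQUIRED = (
    "name", "source_type", "status",
    "object_type", "database", "schema", "object_name",
)
SEMI_STRUCTURED_REQUIRED = ("format_type", "column_data_type", "column_name")


def missing_fields(config: Dict[str, str]) -> List[str]:
    """Return the required fields that are absent or blank in the config."""
    required = list(BASE_REQUIRED)
    if config.get("source_type") == "Semi-Structured":
        required.extend(SEMI_STRUCTURED_REQUIRED)
    # Treat missing keys, None, and whitespace-only values as not provided.
    return [field for field in required if not str(config.get(field) or "").strip()]


config = {
    "name": "STAGING.API_RESPONSES_JSON",
    "source_type": "Semi-Structured",
    "status": "Active",
    "object_type": "Table",
    "database": "STAGING",
    "schema": "API",
    "object_name": "RESPONSES",
    "format_type": "JSON",
}
print(missing_fields(config))  # ['column_data_type', 'column_name']
```

Optional fields such as tags and the schema sample are deliberately left out of both tuples, so omitting them never produces an error.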

***

For information on adding data sources, see [Adding Data Sources](https://docs.datapancake.com/core-concepts/data-sources/adding-data-sources). For product tier configuration, see [Product Tiers & Features](https://docs.datapancake.com/core-concepts/data-sources/product-tiers-and-features).
