Introduction#
Consistent naming across environments (dev, test, prod), layers (bronze/silver/gold), and domains is critical in Databricks. It prevents confusion, keeps governance enforceable, and supports automation with Unity Catalog and Delta Lake.
General Best Practices#
- Separate dev / test / prod workspaces.
- Apply RBAC + Unity Catalog.
- Use modular notebooks; reuse with `%run`.
- Version control all code.
- Prefer job clusters; auto-terminate.
- Run `VACUUM` on Delta tables regularly; use `OPTIMIZE` with Z-ordering on frequently filtered columns.
- Allow schema evolution only when intentional.
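
The Delta maintenance item above can be scripted so every table gets the same treatment. The following is a minimal sketch that only builds the SQL strings. The function name `maintenance_sql`, the example table, and the `customer_id` column are hypothetical, and the 168-hour floor is an assumption based on Delta Lake's default 7-day retention.

```python
def maintenance_sql(table: str, zorder_cols: list[str], retain_hours: int = 168) -> list[str]:
    """Build OPTIMIZE and VACUUM statements for one Delta table (strings only)."""
    # Assumption: refuse retention below 7 days (168 h), Delta Lake's default window.
    if retain_hours < 168:
        raise ValueError("retain_hours below 168 risks removing files still in use")
    if not zorder_cols:
        raise ValueError("provide at least one column to Z-order by")
    cols = ", ".join(zorder_cols)
    return [
        f"OPTIMIZE {table} ZORDER BY ({cols})",
        f"VACUUM {table} RETAIN {retain_hours} HOURS",
    ]


# Hypothetical table and column, following the naming used in this guide.
for stmt in maintenance_sql("prod_sales.silver.customer_validated", ["customer_id"]):
    print(stmt)
```

In a scheduled job, each statement could be passed to `spark.sql(...)`, assuming a `spark` session is available.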
Environment‑Aware Medallion Naming#
Unity Catalog is the governance backbone, so inconsistent names break access policies and automation. Use environment prefixes, clear domain names, and snake_case. See the Unity Catalog docs.
Pattern: `<env>_<domain>`

Examples: `prod_sales`, `dev_marketing`, `test_finance`
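
One way to keep catalog names consistent is to build them in code rather than typing them by hand. This is a small sketch, not an official API: the helper `catalog_name` and the allowed environment list are assumptions drawn from the examples in this guide.

```python
import re

ENVS = ("dev", "test", "prod")  # environments named in this guide
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def catalog_name(env: str, domain: str) -> str:
    """Return '<env>_<domain>' after checking both parts."""
    if env not in ENVS:
        raise ValueError(f"unknown environment: {env!r}")
    if not SNAKE_CASE.fullmatch(domain):
        raise ValueError(f"domain must be snake_case: {domain!r}")
    return f"{env}_{domain}"


print(catalog_name("prod", "sales"))
print(catalog_name("dev", "marketing"))
```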
Layer‑Specific Schemas#
Pattern: `<env>_<domain>.<layer>`

Examples: `prod_sales.bronze`, `prod_sales.silver`, `prod_sales.gold`
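
Because every catalog gets the same three layer schemas, the setup statements can be generated. The sketch below only produces SQL strings. The function name `layer_schema_ddl` is hypothetical, and it assumes the caller has the privileges needed to create catalogs and schemas.

```python
LAYERS = ("bronze", "silver", "gold")  # medallion layers used in this guide


def layer_schema_ddl(catalog: str) -> list[str]:
    """Return DDL that creates a catalog and one schema per medallion layer."""
    statements = [f"CREATE CATALOG IF NOT EXISTS {catalog}"]
    statements += [f"CREATE SCHEMA IF NOT EXISTS {catalog}.{layer}" for layer in LAYERS]
    return statements


for stmt in layer_schema_ddl("prod_sales"):
    print(stmt)
```

In a notebook, each statement could be run with `spark.sql(stmt)`, assuming a `spark` session is available.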
Table Naming Within Layers#
Use snake_case, descriptive names.
- `bronze.transactions_raw`
- `silver.customer_validated`
- `gold.sales_monthly`

Full name example: `prod_sales.bronze.transactions_raw`
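
The three-part name can be assembled and checked in one place. This is an illustrative sketch: `full_table_name` is a hypothetical helper, and the snake_case rule it enforces (lowercase letters, digits, single underscores) is one reasonable reading of the convention. It cannot judge whether a name is descriptive.

```python
import re

LAYERS = ("bronze", "silver", "gold")
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def full_table_name(catalog: str, layer: str, table: str) -> str:
    """Return '<catalog>.<layer>.<table>' after validating each part."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    for part in (catalog, table):
        if not SNAKE_CASE.fullmatch(part):
            raise ValueError(f"not snake_case: {part!r}")
    return f"{catalog}.{layer}.{table}"


print(full_table_name("prod_sales", "bronze", "transactions_raw"))
```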
File Storage Structure#
Mirror env/layer/domain/table in paths.
Pattern: `/mnt/data/<env>/<layer>/<domain>/<table>/`

Example: `/mnt/data/prod/bronze/sales/transactions/`
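
The path layout can be derived from the same four parts used in table names. A minimal sketch follows, with `storage_path` as a hypothetical helper and `/mnt/data` taken from the pattern above as the default root.

```python
import posixpath


def storage_path(env: str, layer: str, domain: str, table: str, root: str = "/mnt/data") -> str:
    """Return '<root>/<env>/<layer>/<domain>/<table>/' with a trailing slash."""
    # posixpath keeps forward slashes regardless of the local operating system.
    return posixpath.join(root, env, layer, domain, table) + "/"


print(storage_path("prod", "bronze", "sales", "transactions"))
```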
Summary Table#
| Level | Pattern | Example |
|---|---|---|
| Catalog | <env>_<domain> | dev_hr, prod_sales |
| Schema / Layer | <catalog>.<layer> | test_finance.bronze |
| Table Name | snake_case | silver.employee_cleaned |
| Full Name | <catalog>.<layer>.<table> | prod_sales.gold.monthly_rev |
| Storage Path | /mnt/data/<env>/<layer>/<domain>/<table>/ | /mnt/data/dev/bronze/marketing/ads/ |
Good vs Bad Examples#
Workspaces#
- Good: `dev`, `test`, `prod` separated (pre-prod when maturity allows).
- Bad: mixed single workspace.
Clusters#
- Good: job clusters with auto-termination.
- Bad: idle interactive cluster.
Schemas / Layers#
- Good: `prod_sales.bronze`.
- Bad: `bronze1`, `myschema`.
Tables#
- Good: `silver.customer_validated`.
- Bad: `CustomerData`, `table1`.
Storage Paths#
- Good: `/mnt/data/prod/gold/finance/revenue_summary/`.
- Bad: `/mnt/data/finaltables/finance2024/`.
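
The good and bad examples above can be turned into an automated check, for instance in a CI step or a pre-deployment script. This is a rough sketch under stated assumptions: the two regular expressions encode this guide's patterns for full table names and storage paths, and `follows_convention` is a hypothetical name.

```python
import re

# <env>_<domain>.<layer>.<table>
FULL_NAME = re.compile(
    r"(dev|test|prod)_[a-z][a-z0-9_]*\.(bronze|silver|gold)\.[a-z][a-z0-9_]*"
)
# /mnt/data/<env>/<layer>/<domain>/<table>/
STORAGE_PATH = re.compile(
    r"/mnt/data/(dev|test|prod)/(bronze|silver|gold)/[a-z][a-z0-9_]*/[a-z][a-z0-9_]*/"
)


def follows_convention(value: str) -> bool:
    """Check a full table name or a storage path against the patterns in this guide."""
    pattern = STORAGE_PATH if value.startswith("/") else FULL_NAME
    return pattern.fullmatch(value) is not None


candidates = [
    "prod_sales.gold.monthly_rev",
    "CustomerData",
    "/mnt/data/prod/gold/finance/revenue_summary/",
    "/mnt/data/finaltables/finance2024/",
]
for name in candidates:
    print(name, "->", "ok" if follows_convention(name) else "violates convention")
```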
Community Discussions#
| Link | Post | Date | Latest Reply |
|---|---|---|---|
| Reddit: r/databricks – Naming standards | Confusion scaling naming | Dec 2023 | Jan 2024 |
| Reddit: r/dataengineering – Unity Catalog naming | Environment “grown wild” | Nov 2023 | Dec 2023 |
| Reddit: r/dataengineering – Organizing Unity Catalog | Catalog-first vs domain-first | Aug 2024 | Aug 2024 |
| Stack Overflow – Catalog vs Database | Terminology confusion | Feb 2024 | Feb 2024 |
| Medium – Unity Catalog Principles | Governance risks from inconsistency | Sep 2023 | Sep 2023 |
| Medium – Best Practices | Domain separation pitfalls | Oct 2023 | Oct 2023 |
| Medium – Ultimate Guide | Confusion when diverging from snake_case | Jul 2024 | Jul 2024 |
References#
Official Docs
- Databricks Unity Catalog
- Azure Databricks naming best practices
- Databricks Medallion Architecture
- Azure Q&A – Organizing multiple source systems
Community & Practitioner Insights