This three-part blog series provides a comprehensive guide to setting up Azure Databricks Unity Catalog within a VNet environment. The series covers everything from an introduction to Unity Catalog and the infrastructure and networking configuration to security, metastore setup and workspace integration.
- In Part 1, we provided an overview of the Unity Catalog and how to set it up.
- In Part 2, we covered the network setup process.
- This blog—Part 3—concludes the series by covering the process for configuring the metastore.
Let's configure that metastore.
Configure metastore
Step 1: Create metastore
A metastore is a metadata repository for the Unity Catalog that manages catalogs, schemas, tables and permissions.
- Go to Databricks UI → click on Manage accounts (or go here: https://accounts.azuredatabricks.net)
- Click on Admin Settings → Create Metastore
- Enter details:
- Metastore name: (e.g., my_unity_metastore)
- Region: Must match the Databricks workspace and Access Connector (one metastore per region)
- Storage root path: e.g., abfss://<container_name>@<storageaccount>.dfs.core.windows.net/
- As a best practice, do not store any application data in the root blob (DBFS) storage
- Store application data in an external ADLS Gen2 storage account instead
- Access connector: Select the one created (databricks-access-connector)
- Click Create
- Assign permissions to the metastore
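For teams that prefer automation over the UI, Step 1 can also be sketched with the Databricks SDK for Python (`pip install databricks-sdk`). The metastore name, region and storage names below are placeholders taken from this guide, and the exact `create()` keyword arguments may vary between SDK versions, so treat this as a sketch rather than a definitive implementation.

```python
def storage_root(container: str, storage_account: str) -> str:
    # Build the ADLS Gen2 root path in the format the metastore expects.
    return f"abfss://{container}@{storage_account}.dfs.core.windows.net/"


def create_metastore():
    # Lazy import so the path helper above works without the SDK installed.
    from databricks.sdk import AccountClient

    # AccountClient reads DATABRICKS_ACCOUNT_ID and credentials from the
    # environment; see the SDK docs for the supported auth methods.
    client = AccountClient()
    return client.metastores.create(
        name="my_unity_metastore",
        region="eastus",  # must match the workspace and Access Connector region
        storage_root=storage_root("unity-metastore", "mystorageaccount"),
    )
```

As in the UI flow, the region passed here must match the workspace and Access Connector, since only one metastore is allowed per region.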
Step 2: Attach workspace to metastore
- Go to Databricks Admin Settings → Workspaces
- Select the Databricks Workspace
- Click Assign Metastore and select the metastore created earlier (my_unity_metastore)
- Click Confirm
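Step 2 can likewise be scripted. The helper below extracts the numeric workspace ID from an Azure Databricks workspace hostname (which has the form `adb-<workspace-id>.<n>.azuredatabricks.net`); the `metastore_assignments` call is an assumption about the SDK's account-level API and its signature may differ slightly by version.

```python
import re


def workspace_id_from_url(url: str) -> int:
    # Azure Databricks workspace hostnames look like
    # adb-<workspace-id>.<n>.azuredatabricks.net
    match = re.search(r"adb-(\d+)\.", url)
    if not match:
        raise ValueError(f"not an Azure Databricks workspace URL: {url}")
    return int(match.group(1))


def attach_workspace(workspace_url: str, metastore_id: str) -> None:
    # Lazy import so the URL helper stays usable without the SDK installed.
    from databricks.sdk import AccountClient

    client = AccountClient()
    client.metastore_assignments.create(
        workspace_id=workspace_id_from_url(workspace_url),
        metastore_id=metastore_id,
    )
```

The `metastore_id` is the UUID shown on the metastore's detail page in the account console.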
Step 3: Enable catalog features
Enable serverless compute
Serverless compute must be enabled for models to access custom SQL/Python functions.
- In the account console, click Settings
- Click the Feature enablement tab
- Enable the Serverless compute for workflows, notebooks and DLT setting
Step 4: Validate setup
- Deploy a virtual machine within the VNet if front-end access is disabled
- Go to the Azure Databricks workspace, launch it and create a cluster
- Create a notebook, attach it to the cluster and run a test
- Go to the SQL editor, launch a serverless warehouse and run a DDL/DML statement to verify storage access
- Create a SQL/Python function, then go to the Playground and access it from a model to verify serverless compute
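The validation steps above can be sketched as a small smoke test. The catalog, schema, table and function names below are illustrative placeholders; from a notebook the statements could be run with `spark.sql(stmt)` on a UC-enabled cluster, or pasted one by one into the SQL editor against a serverless warehouse.

```python
# Example DDL/DML for Step 4. Any networking, storage or permission
# misconfiguration should surface as an error when these run.
VALIDATION_STATEMENTS = [
    "CREATE CATALOG IF NOT EXISTS demo_catalog",
    "CREATE SCHEMA IF NOT EXISTS demo_catalog.demo_schema",
    "CREATE TABLE IF NOT EXISTS demo_catalog.demo_schema.smoke_test (id INT, note STRING)",
    "INSERT INTO demo_catalog.demo_schema.smoke_test VALUES (1, 'unity catalog ok')",
    "SELECT * FROM demo_catalog.demo_schema.smoke_test",
    # A simple SQL function to try from the Playground via a model:
    "CREATE OR REPLACE FUNCTION demo_catalog.demo_schema.to_fahrenheit(c DOUBLE) "
    "RETURNS DOUBLE RETURN c * 9.0 / 5.0 + 32.0",
]


def run_validation(spark) -> None:
    # `spark` is the SparkSession available in a Databricks notebook.
    for stmt in VALIDATION_STATEMENTS:
        spark.sql(stmt)
```

Note the three-level `catalog.schema.object` naming: if these statements succeed, both the metastore attachment and the storage credential behind it are working.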
Conclusion: Wrapping up Part 3
Setting up Azure Databricks Unity Catalog within a VNet environment ensures secure, scalable and governed access to structured and unstructured data. Configuring networking, security, the metastore and workspace integration establishes a centralized data governance model while maintaining fine-grained access control through Azure AD, Unity Catalog and Databricks Access Connectors.
With proper identity management, role-based access control and storage security policies, enterprises can collaborate seamlessly across multiple workspaces while ensuring compliance and security in the data architecture. By following this guide, you can successfully implement Unity Catalog in a VNet setup, empowering organizations with efficient, secure and well-governed data management.
If you missed the first two blogs in the series, you can find them here:
References
What is Unity Catalog? - Azure Databricks | Microsoft Learn
Networking - Azure Databricks | Microsoft Learn
Classic compute plane networking - Azure Databricks | Microsoft Learn
Enable Azure Private Link as a standard deployment - Azure Databricks | Microsoft Learn