Using Mount Points in Databricks: A Practical Guide for Data Engineers (2024)


1. What are Mount Points in Databricks?

2. How Do Mount Points Work?

3. How Can I Mount a Cloud Object Storage on DBFS?

4. How Do I Access My Data Stored In a Cloud Object Storage Using Mount Points?

5. Why and When Do You Need Mount Points?

6. When Should You Use Unity Catalog Instead of Mount Points?

7. Best Practices for Using Mount Points

Mount points in Databricks serve as a bridge, linking your Databricks File System (DBFS) to cloud object storage, such as Azure Data Lake Storage Gen2 (ADLS Gen2), Amazon S3, or Google Cloud Storage. This setup allows you to interact with your cloud storage using local file paths, as if the data were stored directly on DBFS.

Mounting creates a link between your Databricks workspace and your cloud object storage.

A mount point encapsulates:

  • The location of the cloud object storage.
  • Driver specifications for connecting to the storage account or container.
  • Security credentials for data access.

You can list your existing mount points with the following dbutils command:

# Also lists the Databricks built-in mount points
# (e.g., /Volumes, /databricks-datasets); you can ignore those
dbutils.fs.mounts()
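If you only care about your own mounts, you can filter out the built-in entries. A minimal sketch, assuming the MountInfo objects returned by dbutils.fs.mounts() (which expose mountPoint and source attributes); the helper name user_mounts is hypothetical:

```python
# Hypothetical helper: keep only user-created mounts under /mnt/,
# skipping built-in entries such as /databricks-datasets or /Volumes.
def user_mounts(mounts):
    """mounts: the list returned by dbutils.fs.mounts();
    each entry has .mountPoint and .source attributes."""
    return [m for m in mounts if m.mountPoint.startswith("/mnt/")]

# On a cluster you would call it as:
# for m in user_mounts(dbutils.fs.mounts()):
#     print(m.mountPoint, "->", m.source)
```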

Alternatively, in the Databricks workspace UI, open Catalog Explorer and click Browse DBFS:


In the tab that opens, simply click “mnt”. You will be asked to choose a cluster; choose or start one, and you will then see all your mount points (if there are any).

For Azure environments, mounting ADLS Gen2 using Azure Active Directory (AAD), now renamed Microsoft Entra ID, with OAuth is a common practice. Here’s how you can do this:

configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope-name>", key="<service-credential-key-name>"),
    "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<directory-id>/oauth2/token"
}

dbutils.fs.mount(
    source = "abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/",
    mount_point = "/mnt/<mount-name>",
    extra_configs = configs
)

When configuring your mount, it’s important to understand the configs dictionary and your Azure AD setup. Specifically, fs.azure.account.oauth2.client.id should be set to your Service Principal’s application (client) ID, which uniquely identifies your application in Azure AD. Similarly, fs.azure.account.oauth2.client.secret requires the client secret associated with your SP, retrieved here from a secret scope. These credentials enable secure authentication and authorization, ensuring that only authorized entities can access your cloud object storage. Additionally, make sure you have assigned the appropriate roles and permissions to your Service Principal on the storage account (e.g., Storage Blob Data Contributor). You can learn more about this process at https://learn.microsoft.com/en-us/azure/databricks/connect/storage/aad-storage-service-principal.

Remember, the configuration above is specific to an Azure ADLS Gen2 storage account; adjustments are necessary for other cloud providers.

To unmount, simply:

dbutils.fs.unmount("/mnt/<mount-name>")
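Unmounting a path that is not actually mounted raises an error, so it can help to guard the call. A small sketch using only the documented dbutils.fs.mounts() and dbutils.fs.unmount() calls; the helper name unmount_if_exists is hypothetical, and it takes dbutils as a parameter so it can be exercised outside a cluster:

```python
# Hypothetical guard: unmount only if the mount point actually exists.
def unmount_if_exists(dbutils, mount_point):
    if any(m.mountPoint == mount_point for m in dbutils.fs.mounts()):
        dbutils.fs.unmount(mount_point)
        return True
    return False

# On a cluster: unmount_if_exists(dbutils, "/mnt/<mount-name>")
```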

Once mounted, accessing your data (e.g., Delta Table) is as straightforward as referencing the mount point in your data operations:

# Using Spark, read a Delta table from the mounted path
df = spark.read.format("delta").load("/mnt/my_mount_point/my_data")

# Using Spark, write back to a path under the mount point
df.write.format("delta").mode("overwrite").save("/mnt/my_mount_point/delta_table")

Using mount points was the general practice for accessing cloud object storage before Unity Catalog was introduced. You typically need mount points when:

  • You want to access your cloud object storage as if it were on DBFS.
  • Unity Catalog is not activated in your workspace.
  • Your cluster runs a Databricks Runtime (DBR) version older than 11.3 LTS.
  • You have no access to a premium workspace plan (i.e., you are on the Standard plan).
  • If you want to avoid mount points but still cannot use Unity Catalog (UC), you can instead set your Service Principal (SP) credentials in the Spark configuration and access ADLS Gen2 containers directly.
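The Spark-configuration alternative sets the same OAuth properties per storage account instead of mounting. A hedged sketch of building those properties; the helper name sp_spark_confs is hypothetical, while the property keys follow the per-account ABFS OAuth settings documented for ADLS Gen2:

```python
# Hypothetical helper: build per-storage-account Spark properties for
# Service Principal (OAuth) access to ADLS Gen2 without mounting.
def sp_spark_confs(storage_account, client_id, client_secret, tenant_id):
    suffix = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{suffix}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{suffix}": client_id,
        f"fs.azure.account.oauth2.client.secret.{suffix}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{suffix}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }

# On a cluster you would then apply them, fetching the secret from a scope:
# for key, value in sp_spark_confs("<storage-account-name>", "<application-id>",
#                                  dbutils.secrets.get("<scope-name>", "<key-name>"),
#                                  "<directory-id>").items():
#     spark.conf.set(key, value)
# df = spark.read.format("delta").load(
#     "abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<path>")
```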
Conversely, prefer Unity Catalog (UC) over mount points when:

  • The above conditions don’t apply to you.
  • You can use a cluster with a later DBR version (>= 11.3 LTS) and have access to a premium plan.
  • Note that mounted data doesn’t work with Unity Catalog.
    - However, you can still see your tables and their referenced mount point paths in the old hive_metastore catalog if you have migrated to UC.
Finally, some best practices for using mount points:

  • When performing mount operations, manage your secrets with Databricks secret scopes and never expose raw secrets.
  • Keep your mount points up to date.
    - If a source no longer exists in the storage account, remove its mount point from Databricks as well.
  • Using the same mount point name as your container name makes things easier when you have many mount points; especially if you come back to your workspace after some time, you can easily match them in Azure Storage Explorer.
  • Don’t put non-mount-point folders or other files in the /mnt/ directory; they will only cause confusion.
  • If your SP credentials are updated, you may have to remount all your mount points:
    - You can loop through the mount points if they all still point to existing sources.
    - Otherwise, you will get AAD exceptions and will have to unmount and remount each one manually.
  • If you can, use Unity Catalog (UC) instead of mount points for better data governance, centralized metadata management, fine-grained security controls, and a unified data catalog across Databricks workspaces.
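The remount-after-credential-rotation step can be sketched as a loop. This is a hypothetical helper (remount_all is not a Databricks API) that assumes every targeted mount still points to an existing source; it takes dbutils as a parameter so it can be exercised outside a cluster:

```python
# Hypothetical helper: re-create selected mounts with fresh configs,
# e.g. after a Service Principal secret rotation.
# new_configs maps mount_point -> (source, extra_configs).
def remount_all(dbutils, new_configs):
    for m in list(dbutils.fs.mounts()):
        if m.mountPoint in new_configs:
            source, extra_configs = new_configs[m.mountPoint]
            dbutils.fs.unmount(m.mountPoint)
            dbutils.fs.mount(source=source,
                             mount_point=m.mountPoint,
                             extra_configs=extra_configs)
```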
