Azure Databricks and Data Lake Storage Playground
Description
Azure Databricks is an analytics platform powered by Apache Spark. Spark is a unified analytics engine capable of working with virtually every major database, data caching service, and data warehouse provider.
In Databricks, you can work with Spark using languages such as Scala and Python to manage, analyze, and visualize data. Notebooks attached to Databricks clusters let you interact programmatically with data from virtually any major data source.
The playground is a safe and secure sandbox environment where you can explore your own ideas, follow along with Cloud Academy courses, or answer your own questions, all without installing any software on your local machine. The goal is to let you experiment and learn with little startup time or overhead. Feel free to experiment with loading data into Azure Data Lake Storage (ADLS), managing data and folders in ADLS using Databricks, working with Databricks clusters and notebooks, and more. Have fun in the playground!
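As a starting point for managing folders in ADLS from a notebook, here is a minimal Python sketch. It assumes the cluster is already configured with access to the storage account. The container name, storage account name, and folder path are hypothetical placeholders, and the helper functions abfss_uri and make_and_list are illustrative names rather than part of any Databricks API. The dbutils object is available as a global inside Databricks notebooks; it is passed in explicitly here so the snippet also loads outside a notebook.

```python
def abfss_uri(container: str, account: str, path: str = "") -> str:
    """Build an ADLS Gen2 URI: abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def make_and_list(dbutils, folder_uri: str) -> list:
    """Create a folder if it is missing, then return the names of its entries.

    Assumes `dbutils` is the Databricks utilities object from a notebook
    and that the cluster can authenticate to the storage account.
    """
    dbutils.fs.mkdirs(folder_uri)
    return [entry.name for entry in dbutils.fs.ls(folder_uri)]


# Hypothetical names: replace with the container and storage account from your lab.
raw_folder = abfss_uri("playground", "examplestorageacct", "/raw/2024")
print(raw_folder)
```

In a notebook cell, calling make_and_list(dbutils, raw_folder) would create the folder and list its contents, provided the placeholder names are replaced with real ones.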
Intended Audience
This lab is intended for:
- Azure administrators
- Cloud engineers and solutions architects
- Data engineers
- Anyone with a need to visualize and analyze data in Azure
Prerequisites
The following are helpful, but not required:
- Basic familiarity with the Azure Portal
- Basic familiarity with Databricks. The lessons on using Azure Databricks to interact with ADLS data can help here.
Updates
March 1st, 2024 - Migrated to Azure Data Lake Storage Gen2
January 19th, 2023 - Updated screenshots & instructions to match UI changes
November 3rd, 2021 - Updated instructions to resolve the login issue with Azure Databricks
October 23rd, 2021 - Provided a workaround for an Azure Active Directory issue that initially prevents logging in to Databricks