hands-on lab
Performing Data Quality Checks Using an Amazon Athena Notebook
Difficulty: Beginner
Duration: Up to 1 hour
Students: 12
Get guided in a real environment: Practice with a step-by-step scenario in a real, provisioned environment.
Learn and validate: Use validations to check your solutions every step of the way.
See results: Track your knowledge and monitor your progress.
Description
Amazon Athena is a service from AWS that allows you to query data stored in Amazon S3 using SQL. When your analysis requires more complex calculations, you can use an Amazon Athena Notebook to process your data using Apache Spark.
Learning how to use an Amazon Athena Notebook will benefit anyone looking to improve their data analysis skills.
In this hands-on lab, you will connect to an Amazon Athena Notebook session and use the notebook to perform data quality checks.
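To give a sense of what a data quality check involves, here is a minimal sketch in plain Python. It is not taken from the lab itself: the function name `run_quality_checks`, the column names, and the sample rows are all hypothetical, and the lab's notebook would more likely express the same ideas with Apache Spark DataFrames. The sketch counts missing values, finds duplicate keys, and counts values outside an allowed range.

```python
from collections import Counter


def run_quality_checks(rows, key, required, ranges):
    """Return simple data quality metrics for a list of dict rows.

    All names here are illustrative assumptions, not part of the lab.
    """
    # Count missing values (None or empty string) for each required column.
    missing = {
        col: sum(1 for r in rows if r.get(col) in (None, ""))
        for col in required
    }

    # Find key values that appear on more than one row.
    key_counts = Counter(r[key] for r in rows if r.get(key) is not None)
    duplicates = sorted(k for k, n in key_counts.items() if n > 1)

    # Count values outside each inclusive [low, high] range, skipping missing values.
    out_of_range = {}
    for col, (low, high) in ranges.items():
        out_of_range[col] = sum(
            1 for r in rows
            if r.get(col) is not None and not (low <= r[col] <= high)
        )

    return {
        "row_count": len(rows),
        "missing": missing,
        "duplicate_keys": duplicates,
        "out_of_range": out_of_range,
    }


# Hypothetical sample data with one problem of each kind.
rows = [
    {"order_id": 1, "customer": "ana", "amount": 25.0},
    {"order_id": 2, "customer": None, "amount": 40.0},
    {"order_id": 2, "customer": "raj", "amount": -5.0},
    {"order_id": 3, "customer": "li", "amount": None},
]

report = run_quality_checks(
    rows,
    key="order_id",
    required=["customer", "amount"],
    ranges={"amount": (0, 10000)},
)
print(report)
```

Each metric is reported as a count rather than raising an error, which makes it easy to review every problem in a dataset at once before deciding how to handle it.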
Learning objectives
Upon completion of this beginner-level lab, you will be able to:
- Examine an Amazon Athena workgroup for use with an Amazon Athena notebook
- Connect to an Amazon Athena Notebook session
- Use a Jupyter Notebook interface to perform data quality checks
Intended audience
- Candidates for the AWS Certified Data Engineer - Associate certification
- Cloud Architects
- Data Engineers
- Machine Learning Engineers
Prerequisites
Familiarity with the following will be beneficial but is not required:
- Amazon Athena
- Apache Spark
- Jupyter notebooks
Lab steps
Logging In to the Amazon Web Services Console
Using an Athena Notebook