hands-on lab

Implementing a Searchable Amazon S3 Data Lake

Difficulty: Beginner
Duration: Up to 1 hour and 30 minutes
Students: 725
Rating: 3.6/5
Get guided in a real environment: Practice with a step-by-step scenario in a real, provisioned environment.
Learn and validate: Use validations to check your solutions every step of the way.
See results: Track your knowledge and monitor your progress.

Description

AWS Glue is a service that data analytics professionals can use to catalog, transform, and integrate data from different sources. By consolidating integration capabilities into a single centralized service, AWS Glue gives you the ability to discover, cleanse, catalog, and transform data in a single place.

Learning how to use AWS Glue to work with data will help you become more effective at creating and using data lakes in the public AWS cloud.

In this lab, you will implement an AWS Lambda function that processes order data as it is uploaded to Amazon S3, and you will see how to configure AWS Glue to make searching the data more efficient.
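
As a rough illustration of the kind of processing described above, the following is a minimal sketch of how nested order JSON might be normalized inside a Lambda function. The lab's actual function is not shown on this page, so the function names (`flatten`, `normalize_order`, `object_location`) and the order fields used in the usage note are hypothetical. The event shape assumed in `object_location` is the Amazon S3 "Object Created" event as delivered through Amazon EventBridge.

```python
import json
from typing import Any


def flatten(record: dict[str, Any], parent: str = "", sep: str = "_") -> dict[str, Any]:
    """Flatten nested dicts into a single level, joining key names with `sep`."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested objects so every leaf becomes a top-level column.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def normalize_order(raw: str) -> str:
    """Parse one JSON order and return it as a single flattened JSON line."""
    return json.dumps(flatten(json.loads(raw)), sort_keys=True)


def object_location(event: dict[str, Any]) -> tuple[str, str]:
    """Return (bucket, key) from an EventBridge S3 "Object Created" event."""
    detail = event["detail"]
    return detail["bucket"]["name"], detail["object"]["key"]
```

For example, an order such as `{"order_id": 1, "customer": {"id": 7}}` would come out with the columns `order_id` and `customer_id`. Flat records of this kind are generally easier to describe as a table schema and query with SQL than deeply nested ones.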

Learning Objectives

Upon completion of this beginner-level lab, you will be able to:

  • Use an AWS Lambda function to normalize JSON data
  • Use Amazon EventBridge to invoke an AWS Lambda function in response to an event
  • Configure an AWS Glue table to use a partition index
  • Search data stored in Amazon S3 with Amazon Athena
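
To make the EventBridge objective more concrete, here is a hedged sketch of an event pattern that matches Amazon S3 "Object Created" events for one bucket. The helper name `s3_object_created_pattern`, the bucket name, and the optional key prefix are assumptions for illustration and are not taken from the lab. Note that a bucket only sends these events to EventBridge when EventBridge notifications are enabled on it.

```python
import json


def s3_object_created_pattern(bucket: str, prefix: str = "") -> str:
    """Build an EventBridge event pattern (as JSON) for S3 object uploads.

    `bucket` and `prefix` are illustrative parameters, not values from the lab.
    """
    pattern = {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": [bucket]}},
    }
    if prefix:
        # Optionally restrict the rule to keys under a given prefix.
        pattern["detail"]["object"] = {"key": [{"prefix": prefix}]}
    return json.dumps(pattern)
```

A string like the one returned by `s3_object_created_pattern("orders-bucket", "incoming/")` could be pasted into the rule's event pattern editor, with the Lambda function set as the rule's target.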

Intended Audience

  • Candidates for the AWS Certified Data Analytics Specialty certification
  • Cloud Architects
  • Data Engineers
  • DevOps Engineers
  • Machine Learning Engineers
  • Software Engineers

Prerequisites

Familiarity with the following will be beneficial but is not required:

  • AWS Glue
  • Data Lakes
  • AWS Lambda
  • Amazon EventBridge
  • Amazon Athena

Updates

June 5th, 2024 - Updated the instructions and screenshots to reflect the latest UI

February 15th, 2023 - Updated the Lambda implementation step with a test event

Lab steps

Logging In to the Amazon Web Services Console
Implementing Event Processing for Amazon S3 with an AWS Lambda Function
Creating an Amazon EventBridge Rule
Adding a Partition Index to an AWS Glue Table
Searching Within Your Indexed Amazon S3 Data
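
The last two steps can be sketched as follows, under stated assumptions: the database name, table name, partition keys (`year`, `month`), and index name are all hypothetical, and partition values are assumed to be stored as strings. The first helper only builds the arguments that the boto3 Glue client's `create_partition_index` call accepts; the second builds an Athena query that filters on the partition keys so the index can narrow down which partitions are read.

```python
def partition_index_request(database: str, table: str, keys: list[str], index_name: str) -> dict:
    """Build keyword arguments for the Glue `create_partition_index` API call.

    With boto3 installed and credentials configured, these could be passed as
    boto3.client("glue").create_partition_index(**request).
    """
    return {
        "DatabaseName": database,
        "TableName": table,
        "PartitionIndex": {"Keys": list(keys), "IndexName": index_name},
    }


def orders_query(table: str, year: int, month: int) -> str:
    """Build an Athena query that filters on the (assumed) partition keys."""
    # year and month are integers, so formatting them into the SQL text is safe here;
    # the table name is assumed to come from trusted configuration.
    return (
        f"SELECT * FROM {table} "
        f"WHERE year = '{year}' AND month = '{month:02d}' LIMIT 10"
    )
```

Filtering on the indexed keys is what lets the query benefit from the partition index; a query without a predicate on those keys would still have to consider every partition.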