Big Data Training Library
Learn to architect for scale, get hands-on with the leading big data tools, and reveal meaningful insights from data using services on Amazon Web Services, Microsoft Azure, and Google Cloud Platform. Content added and updated weekly.
Most popular
- LEARNING PATH: Learn SQL - From Newbie to Ninja; Author: Andrew Larkin; Difficulty: Beginner; Description: Go from Newbie to Ninja in the structured query language (SQL)!; Duration: Up to 14 hours and 47 minutes; Content Topics: SQL; This learning path has: 4 Courses, 1 Lab challenge, 3 Exams, 3 Hands-on labs
- LEARNING PATH: AWS Certified Data Engineer - Associate (DEA-C01) Certification Preparation for AWS; Author: Danny Jessee; Difficulty: Intermediate; Description: Train to prepare for the new AWS Certified Data Engineer - Associate Certification (DEA-C01).; Duration: Up to 54 hours and 21 minutes; Content Topics: Serverless, Big Data, App Streaming, SQL; This learning path has: 11 Courses, 1 Lab challenge, 1 Exam, 20 Hands-on labs
- LEARNING PATH: DP-203 Exam Preparation: Data Engineering on Microsoft Azure; Author: Guy Hummel; Difficulty: Intermediate; Description: This course is designed to help you prepare for Microsoft's DP-203 Data Engineering on Microsoft Azure exam.; Duration: Up to 21 hours and 41 minutes; Content Topics: Big Data, Machine Learning, SQL; This learning path has: 16 Courses, 1 Lab challenge, 1 Resource, 1 Exam, 8 Hands-on labs
- LEARNING PATH: The Beginners Guide to Machine Learning and Artificial Intelligence; Author: Calculated Systems; Difficulty: Beginner; Description: This course is a gentle introduction for those who want to gain entry-level experience in machine learning and artificial intelligence.; Duration: Up to 4 hours and 45 minutes; Content Topics: NoSQL, Machine Learning; This learning path has: 2 Courses, 2 Exams, 1 Hands-on lab
- LEARNING PATH: The Basics of Data Management, Data Manipulation and Data Modelling; Author: Calculated Systems; Difficulty: Beginner; Description: Learn the basics of working with data: data sources, formats, databases, and SQL.; Duration: Up to 12 hours and 2 minutes; Content Topics: SQL; This learning path has: 2 Courses, 1 Exam, 3 Hands-on labs
- LEARNING PATH: Google Professional Data Engineer Exam Preparation; Author: Guy Hummel; Difficulty: Intermediate; Description: Designed to help you prepare for the Google Certified Professional Data Engineer Exam, this training will help you gain a solid understanding of GCP components.; Duration: Up to 27 hours and 48 minutes; Content Topics: NoSQL, Big Data, SQL, Machine Learning; This learning path has: 24 Courses, 1 Resource, 5 Exams, 8 Hands-on labs
Explore the entire library
- HANDS-ON LAB: Getting Started with Amazon Redshift; Author: Andrew Burchill; Difficulty: Beginner; Description: Learn how to use the Amazon Redshift service. Create a cluster with a database, copy data from S3, query data using SQL, and resize the cluster.; Duration: Up to 1 hour and 45 minutes; Content Topics: Big Data; This hands-on lab has: 8 Lab steps
- HANDS-ON LAB: Using Azure Synapse Analytics to Query Data Lake; Author: Parveen Singh; Difficulty: Intermediate; Description: Learn how to deploy and use Azure Synapse Analytics to query data stored in a data lake through T-SQL statements using a serverless SQL pool in this hands-on lab.; Duration: Up to 1 hour and 45 minutes; Content Topics: SQL, Big Data; This hands-on lab has: 7 Lab steps
- HANDS-ON LAB: Handling S3 Objects Events With Lifecycle Policies and Server Access Logging; Author: Stefano Cascavilla; Difficulty: Intermediate; Description: In this lab, you will create an S3 bucket and implement lifecycle policies to handle actions performed on its objects. You will also log the operations performed.; Duration: Up to 1 hour; Content Topics: Storage; This hands-on lab has: 5 Lab steps
- HANDS-ON LAB: Using Azure Data Factory Pipelines to Copy Data; Author: Logan Rakai; Difficulty: Intermediate; Description: Learn how to build and trigger data pipelines in Azure Data Factory in this lab.; Duration: Up to 1 hour and 15 minutes; Content Topics: Big Data; This hands-on lab has: 9 Lab steps
- LAB CHALLENGE: Create Amazon RDS Database Instance Challenge; Author: Andrew Burchill; Difficulty: Beginner; Description: Put your Amazon RDS skills to the test in this hands-on challenge lab as you are tasked with creating a database.; Duration: Up to 1 hour; Content Topics: Amazon Web Services; This lab challenge has: 2 Lab steps
- HANDS-ON LAB: Working with Special Characters and Anchors in Regular Expressions; Author: Andrew Burchill; Difficulty: Beginner; Description: In this hands-on lab, you will work with regular expressions, learning how to use quantifiers, anchors, and capture groups to match patterns in text.; Duration: Up to 30 minutes; Content Topics: Development; This hands-on lab has: 2 Lab steps
- HANDS-ON LAB: Efficiently Storing Data in S3 for Data Analytics Solutions; Author: Andrew Burchill; Difficulty: Beginner; Description: Learn how to store data in Amazon S3 for use with Data Analytics services including Redshift and Athena. Learn how to structure analytic data to maximize performance and minimize costs.; Duration: Up to 1 hour and 15 minutes; Content Topics: Big Data; This hands-on lab has: 7 Lab steps
- HANDS-ON LAB: Sessionizing Clickstream Data with Amazon Kinesis and Managed Apache Flink; Author: Andrew Burchill; Difficulty: Beginner; Description: In this lab, you will see how to use Kinesis Data Analytics to sessionize clickstream data and how to send the output data from Kinesis Data Analytics to DynamoDB using a Lambda function.; Duration: Up to 1 hour and 30 minutes; Content Topics: App Streaming; This hands-on lab has: 7 Lab steps
- HANDS-ON LAB: Structure and Analyze Data with Google BigQuery; Author: Stefano Cascavilla; Difficulty: Beginner; Description: This lab shows you the basic concepts of BigQuery and lets you handle and query data in a real GCP environment.; Duration: Up to 45 minutes; Content Topics: NoSQL, Big Data; This hands-on lab has: 6 Lab steps
- HANDS-ON LAB: Analyzing IoT Data Using Azure Stream Analytics; Author: Logan Rakai; Difficulty: Beginner; Description: Analyze Internet of Things (IoT) sensor data using Azure Stream Analytics to identify if there have been any IoT device failures in this lab.; Duration: Up to 30 minutes; Content Topics: Internet of Things; This hands-on lab has: 5 Lab steps
- HANDS-ON LAB: Query Encrypted Amazon S3 Data with Amazon Athena; Author: Greg DeRenne; Difficulty: Beginner; Description: Use Amazon Athena to query encrypted data on S3 and encrypt the query results in this hands-on real-environment lab.; Duration: Up to 1 hour and 20 minutes; Content Topics: Object Storage; This hands-on lab has: 7 Lab steps
- HANDS-ON LAB: Getting Started with Amazon Elastic MapReduce; Author: Greg DeRenne; Difficulty: Intermediate; Description: Learn how to create an Amazon EMR (Elastic MapReduce) cluster and submit work to a cluster in this hands-on lab.; Duration: Up to 1 hour and 45 minutes; Content Topics: Analytics, Storage; This hands-on lab has: 7 Lab steps
- HANDS-ON LAB: Starting a Highly Available Graph Database With Amazon Neptune; Author: Stefano Cascavilla; Difficulty: Intermediate; Description: In this lab, you will create a DB subnet group and a Neptune database, and you will run some SPARQL commands.; Duration: Up to 1 hour and 15 minutes; Content Topics: Databases; This hands-on lab has: 7 Lab steps
- HANDS-ON LAB: Aggregating Data with Amazon Managed Streaming for Apache Kafka (MSK); Author: Andrew Burchill; Difficulty: Beginner; Description: Learn how to aggregate data using Amazon Managed Streaming for Apache Kafka in this hands-on lab.; Duration: Up to 2 hours; Content Topics: Analytics; This hands-on lab has: 7 Lab steps
- HANDS-ON LAB: Using Regular Expressions Effectively in the Real World; Author: Andrew Burchill; Difficulty: Beginner; Description: Regular expressions are a powerful tool for searching and manipulating text. In this hands-on lab you will learn how to use them effectively in real-world scenarios.; Duration: Up to 1 hour; Content Topics: Development; This hands-on lab has: 2 Lab steps
- HANDS-ON LAB: Constructing Regular Expression Character Classes; Author: Andrew Burchill; Difficulty: Beginner; Description: In this hands-on lab, you will learn about character classes and quantifiers in regular expressions, and use them to match patterns in text.; Duration: Up to 30 minutes; Content Topics: Development; This hands-on lab has: 2 Lab steps
- HANDS-ON LAB: Collecting Log Data with Kinesis Agent and Querying with Amazon Athena; Author: Andrew Burchill; Difficulty: Beginner; Description: Learn how to use the Amazon Kinesis Agent application to collect log files and learn how to use AWS Glue and Amazon Athena to query the log data.; Duration: Up to 1 hour and 30 minutes; Content Topics: App Streaming; This hands-on lab has: 6 Lab steps
- HANDS-ON LAB: Run SQL Queries and Analyze Data with Google Cloud SQL; Author: Stefano Cascavilla; Difficulty: Intermediate; Description: In this lab, you will create two tables in a PostgreSQL database, perform operations on them, monitor resource usage, and test that the database respects the atomicity property.; Duration: Up to 1 hour and 15 minutes; Content Topics: SQL; This hands-on lab has: 7 Lab steps
- HANDS-ON LAB: Power BI Desktop Playground; Author: Adil Islam; Difficulty: Beginner; Description: In this hands-on lab playground, you'll have the opportunity to play around with the Power BI Desktop application, build queries and data models, and visualize data using reports.; Duration: Up to 4 hours; Content Topics: Big Data; This hands-on lab has: 2 Lab steps
- LAB CHALLENGE: AWS Database Migration Service (DMS) Challenge; Author: Andrew Burchill; Difficulty: Beginner; Description: In this lab challenge, your database skills are tested as you are tasked with migrating data between two real RDS instances using AWS Database Migration Service.; Duration: Up to 1 hour and 10 minutes; Content Topics: Migration, Databases; This lab challenge has: 2 Lab steps
- LAB CHALLENGE: Intro to SQL Challenge; Author: Matt Martinez; Difficulty: Beginner; Description: This lab challenge provides you with a SQL environment. You must perform several SQL tasks to complete the challenge.; Duration: Up to 1 hour; Content Topics: SQL; This lab challenge has: 3 Lab steps
- HANDS-ON LAB: Configuring Distribution Styles and Table Access in Amazon Redshift; Author: Andrew Burchill; Difficulty: Beginner; Description: Learn how to create tables, set distribution styles, and configure fine-grained access on an Amazon Redshift cluster in this hands-on lab.; Duration: Up to 1 hour; Content Topics: Amazon Web Services; This hands-on lab has: 5 Lab steps
- HANDS-ON LAB: Processing Streaming Metadata using Amazon Kinesis Data Streams; Author: Andrew Burchill; Difficulty: Beginner; Description: Learn how to use Amazon Kinesis Data Streams with Amazon API Gateway and AWS Lambda in this hands-on lab.; Duration: Up to 1 hour and 30 minutes; Content Topics: Amazon Web Services; This hands-on lab has: 5 Lab steps
- HANDS-ON LAB: Comparing Google Cloud Big Data Services; Author: Logan Rakai; Difficulty: Intermediate; Description: In this lab, you will upload data directly to BigQuery and use other GCP big data services such as Dataproc and Dataflow.; Duration: Up to 45 minutes; Content Topics: NoSQL, Big Data; This hands-on lab has: 5 Lab steps