hands-on lab

Machine Learning with scikit-learn

Difficulty: Beginner
Duration: Up to 1 hour
Students: 356
Rating: 5/5
Get guided in a real environment: Practice with a step-by-step scenario in a real, provisioned environment.
Learn and validate: Use validations to check your solutions every step of the way.
See results: Track your knowledge and monitor your progress.

Description

The aim of this lab is to challenge you to build a supervised machine learning pipeline that predicts the median value of owner-occupied homes, expressed in thousands of US dollars and denoted MEDV. We are going to use the famous Boston housing dataset, which contains a set of features used to predict the MEDV target variable. You will be guided through a hands-on exercise on data preprocessing, and on fitting and evaluating a regression model.
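The three stages named above (preprocessing, fitting, evaluation) can be sketched as follows. This is a minimal illustration and not the lab's own notebook: it assumes synthetic data from `make_regression` as a stand-in for the housing data (506 rows and 13 features are chosen here only to resemble its shape, and the Boston loader is no longer shipped with recent scikit-learn releases), and the `alpha=1.0` value and the 80/20 split are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic regression data standing in for the housing features and MEDV.
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Preprocessing: learn the scaling statistics on the training split only,
# then apply the same transformation to the test split.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Fitting: a ridge regression with an arbitrary regularization strength.
model = Ridge(alpha=1.0)
model.fit(X_train_scaled, y_train)

# Evaluation: root mean squared error and R^2 on the held-out rows.
y_pred = model.predict(X_test_scaled)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)
print(f"RMSE: {rmse:.2f}, R^2: {r2:.3f}")
```

Fitting the scaler on the training split alone keeps information from the test rows out of the preprocessing step, which is the main reason to separate `fit_transform` from `transform` here.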

To get the most from this lab, you should be familiar with the following libraries: `pandas`, `matplotlib` and `scikit-learn`.

Before starting this lab, I strongly encourage you to watch the related courses available in our content library.

Learning Objectives

Upon completion of this lab you will be able to:

  • Build a standard machine learning pipeline with scikit-learn;
  • Scale a dataset using the StandardScaler transformer;
  • Train a Ridge Regression Model;
  • Fit a scikit-learn Pipeline object with GridSearchCV.
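As a rough preview of how these objectives fit together, the sketch below chains `StandardScaler` and `Ridge` in a `Pipeline` and tunes it with `GridSearchCV`. It is an illustration under stated assumptions rather than the lab's solution: the data is synthetic (from `make_regression`), and the step names `scaler` and `ridge`, the `alpha` grid, and the 5-fold setting are hypothetical choices made for this example.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic data standing in for the housing features and MEDV.
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The pipeline scales the features and then fits a ridge regression.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("ridge", Ridge()),
])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
param_grid = {"ridge__alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}

# 5-fold cross-validated search; the scaler is refit inside every fold.
search = GridSearchCV(
    pipeline, param_grid, cv=5, scoring="neg_mean_squared_error"
)
search.fit(X_train, y_train)

# best_estimator_ is the pipeline refit on the full training split.
best_alpha = search.best_params_["ridge__alpha"]
test_r2 = search.best_estimator_.score(X_test, y_test)
print("Best alpha:", best_alpha)
print(f"Test R^2: {test_r2:.3f}")
```

Wrapping the scaler inside the pipeline means each cross-validation fold computes its own scaling statistics, so the validation portion of a fold never influences the preprocessing.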

Intended Audience

This lab is intended for:

  • Those interested in performing machine learning with Python.
  • Anyone involved in data science pipelines.

Prerequisites

You should possess:

  • An intermediate understanding of Python.
  • Basic knowledge of the following libraries: pandas, scikit-learn, matplotlib, seaborn.

Updates

August 8th, 2024 - Resolved Jupyter Notebook issue

Covered topics

Lab steps

Machine Learning with Python - Data Transformation