Workshop 1 - Data Wrangling and Feature Engineering
Machine learning projects start with cleaning and transforming the data and selecting the most relevant features (attributes). This is what you will practice in this lab. You will look at selecting a subset of the available data, filtering columns and rows, and dealing with missing values and outliers. You will then move on to the types of transformations that may be needed for numerical and categorical data. You will also explore ways of combining data sets through grouping and aggregation.
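To give a flavor of these steps, here is a minimal sketch using pandas. The toy data, the column names, and the specific choices (median imputation, the 1.5 × IQR outlier rule, z-score scaling, one-hot encoding) are illustrative assumptions, not the lab's actual data sets or prescribed methods.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the lab's own CSV files are not reproduced here.
df = pd.DataFrame({
    "city": ["Leeds", "York", "Leeds", "York", "Leeds", "York"],
    "age": [25, 31, np.nan, 40, 29, 35],
    "income": [36000, 42000, 38000, 41000, 500000, 39000],
    "notes": ["a", "b", "c", "d", "e", "f"],
})

# Select a subset of columns and filter rows.
subset = df[["city", "age", "income"]]
subset = subset[subset["income"] > 0].copy()

# Missing values: fill the missing age with the column median (one option of many).
subset["age"] = subset["age"].fillna(subset["age"].median())

# Outliers: keep rows whose income lies within 1.5 * IQR of the quartiles.
q1, q3 = subset["income"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = subset["income"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
clean = subset[in_range].copy()

# Numerical transformation: standardize income (z-score).
clean["income_z"] = (clean["income"] - clean["income"].mean()) / clean["income"].std()

# Categorical transformation: one-hot encode the city column.
encoded = pd.get_dummies(clean, columns=["city"])

# Grouping and aggregation: mean age and income per city.
per_city = clean.groupby("city")[["age", "income"]].mean()
print(per_city)
```

Each step here is one option among several; for example, rows with missing values could be dropped instead of imputed, and outliers could be capped rather than removed.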
After data preparation, you will explore different methods for choosing the most relevant (informative) features to include in a model.
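As one simple illustration of feature selection, the sketch below ranks features by the absolute value of their Pearson correlation with the target, a basic filter-style approach. The synthetic data and the feature names (`signal`, `weak`, `noise`) are hypothetical, and the lab may cover other methods instead of, or in addition to, this one.

```python
import numpy as np
import pandas as pd

# Synthetic data (an assumption for illustration): the target depends
# strongly on "signal", weakly on "weak", and not at all on "noise".
rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({
    "signal": rng.normal(size=n),
    "weak": rng.normal(size=n),
    "noise": rng.normal(size=n),
})
y = 3.0 * X["signal"] + 1.0 * X["weak"] + rng.normal(scale=0.5, size=n)

# Score each feature by its absolute Pearson correlation with the target.
scores = X.corrwith(y).abs().sort_values(ascending=False)

# Keep the two highest-scoring features.
top_features = scores.head(2).index.tolist()
print(scores)
print(top_features)
```

Correlation only captures linear, one-feature-at-a-time relationships, so it is best treated as a quick first pass rather than a complete selection strategy.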
The lab for the workshop:
This lab is a sandbox in which learners can examine and run the provided Jupyter notebooks and create their own based on the tasks they are given. The lab contains data sets (CSV files), Jupyter notebooks, and instructions. Learners can download their work to their own computer.
Upon completion of this intermediate level lab, you will be able to:
Familiarity with the following will be beneficial but is not required: