Optimizing and Deploying GPT-2 Models on AWS Neuron for Feature Extraction
Difficulty: Beginner
Duration: 9 minutes and 56 seconds
Students: 39
In this lesson, expert instructor Deniz Yilmaz shows you how to use the AWS Deep Learning AMI with the Neuron SDK and PyTorch on AWS Inferentia to compile and run the HuggingFace GPT-2 model.
Learning Objectives
- Request a service quota increase to launch Inferentia instances
- Launch an AWS Deep Learning AMI on an Inf2 instance
- Establish a secure SSH connection to the Inf2 instance
- Activate the Neuron environment, verify installation of key packages, and run Neuron tool commands
- Establish SSH tunneling for secure Jupyter Notebook access
- Launch and configure a Jupyter Notebook environment
- Optimize and deploy the HuggingFace GPT-2 model on AWS Inf2 instances
- Conduct performance tests to compare inference times between CPU and Neuron-powered Inferentia instances
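As a rough illustration of the last objective, the sketch below shows one way a latency comparison could be structured in Python. It is not taken from the lesson's materials: the helper names `benchmark` and `speedup`, the warm-up and iteration counts, and the two sleep-based stand-in functions are all assumptions made for illustration. In practice the stand-ins would be replaced by calls to the CPU model and the Neuron-compiled model.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 3, iters: int = 20) -> Dict[str, float]:
    """Time repeated calls to fn and return latency statistics in milliseconds."""
    if iters < 1:
        raise ValueError("iters must be at least 1")
    # Warm-up calls are not timed, so one-off costs (lazy init, caches) do not skew results.
    for _ in range(warmup):
        fn()
    latencies_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(latencies_ms),
        "p50_ms": statistics.median(latencies_ms),
        "max_ms": max(latencies_ms),
    }


def speedup(baseline: Dict[str, float], candidate: Dict[str, float]) -> float:
    """Ratio of median latencies; a value above 1 means the candidate is faster."""
    return baseline["p50_ms"] / candidate["p50_ms"]


if __name__ == "__main__":
    # Hypothetical stand-ins (assumption): swap in real CPU and Neuron inference calls.
    def cpu_inference() -> None:
        time.sleep(0.004)

    def neuron_inference() -> None:
        time.sleep(0.001)

    cpu_stats = benchmark(cpu_inference)
    neuron_stats = benchmark(neuron_inference)
    print("CPU:", cpu_stats)
    print("Neuron:", neuron_stats)
    print("Median speedup:", round(speedup(cpu_stats, neuron_stats), 2))
```

Reporting the median alongside the mean keeps a single slow call from dominating the comparison, and the untimed warm-up calls matter for compiled models, where the first invocations often include one-time loading costs.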
Intended Audience
- Data scientists, machine learning engineers, and developers with basic knowledge of machine learning
Prerequisites
- You should be familiar with basic machine learning terms
- For more information on these services, please see our existing content here:
GitHub
To access the Python scripts and commands for compiling and executing the HuggingFace GPT-2 model, along with additional materials for this demonstration, please visit our GitHub repository here:
Covered Topics