Training and Deploying Custom Scikit-Learn Models on AWS SageMaker

End-to-end example using the Scikit-Learn estimator with SageMaker Script Mode

Ram Vegiraju
6 min read · Aug 20, 2021

Training and deploying machine learning models at scale can be computationally expensive, which is one reason teams increasingly turn to cloud providers such as AWS and Microsoft Azure. SageMaker is Amazon’s primary machine learning service; it enables developers to build, train, and deploy models at scale. It offers a Jupyter Notebook-like environment in which developers can build custom models with popular frameworks such as Scikit-Learn, TensorFlow, and PyTorch. In this article we’ll train and deploy a sample Scikit-Learn Random Forest regression model on the Petrol Consumption dataset.

NOTE: If you’re new to AWS, make sure you create an AWS account if you want to follow along. I’d suggest having some familiarity with AWS and its core services, such as IAM and S3, before reading this article. I’ll also provide a list of the services we’ll be using, along with more in-depth definitions. If you’re already familiar with these services, feel free to skip ahead to the code demonstration.

Table of Contents (ToC)

  1. AWS Services
  2. Prerequisites/Setup
