Deploy Data Science Infrastructure on AWS
This course introduces students to common deployment scenarios for data scientists on AWS. It covers open-source tools for deploying infrastructure, such as AWS ParallelCluster and Terraform.

Skillset Outcomes

Intended Audience
This course is intended for system administrators and other IT professionals with at least some experience managing Slurm or Kubernetes clusters. Some programming knowledge is useful but not required.

Course Length
4 days

Course Delivery
This course can be delivered in person, or remotely in real time over Zoom and Slack.

Requirements

Computer
All students need a computer with an internet connection and at least 8 GB of RAM.
IDE
I recommend PyCharm Professional, but Visual Studio Code can also be used.
Other requirements
All students need Slack for communicating and sharing code blocks, and Zoom if the course is delivered remotely.
Development Environment
Students can use Docker Desktop on their own laptops, or you can request a development environment for your students, which I build on AWS. Please note that I need one week's notice to deploy the development environment, and there is a charge for it.
Students must also have an SSH client, such as PuTTY on Windows or the OpenSSH client on macOS/Linux.