Deploy RShiny with Kubernetes using AWS EKS and Terraform

Today we will be deploying the rocker/shiny image on Kubernetes with AWS's Elastic Kubernetes Service, or EKS.

If you're following along with the Deploy RShiny on AWS series, you'll know that I covered deploying RShiny with a helm chart. Today, I want to go deeper into deploying RShiny on EKS, along with some tips and tricks that I use for my everyday deployments.

An Extremely Brief Introduction to Kubernetes

Kubernetes is kind of a beast, and people constantly complain that it's extremely complicated to get started with. They would be correct, but I'm here to give the two-minute rundown of what you need to know to deploy your RShiny (or Dash, Flask, Django, Ruby on Rails, etc.) application on Kubernetes. This is because Kubernetes is not magical, and it's not even that new. It's a very nice abstraction layer on top of infrastructure concepts that already exist.

Disclaimer! I know I am oversimplifying things here, and I am totally OK with that. ;-) There is quite a bit more that goes into Kubernetes than what I discuss here, and I am not an expert.

With that bit of ranting out of the way, let's talk about the layers of Kubernetes!

Data Persistence Layer

Have you ever deployed an application that has some data that persists, even when the machine gets shut off? That might just be data persisted to a file system, or maybe a networked file system if you're fancy, a database, S3 or GCP storage; you get the idea. You need the data to stick around. Even a database isn't magical. Somewhere, buried in its configuration, is a data directory where it stores its data.

Kubernetes takes care of this using PVCs, or Persistent Volume Claims. Under the hood, these are just mappings to the filesystem, or even to networked filesystems such as AWS EFS.
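To make that concrete, here's a minimal sketch of a PVC asking the cluster for 5 GB of storage. The name and size are illustrative; on EKS, a claim like this is typically satisfied by an EBS volume through the cluster's default storage class.

```yaml
# pvc.yaml -- a minimal Persistent Volume Claim sketch.
# "rshiny-data" and "5Gi" are illustrative values.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rshiny-data
spec:
  accessModes:
    - ReadWriteOnce        # mountable read-write by a single node
  resources:
    requests:
      storage: 5Gi         # how much disk to ask the cluster for
```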

Compute Layer

This is a physical entity. If you've ever SSHed on over to a remote server you have used a compute layer. The compute layer is one or more servers, nodes, or whatever you'd like to call them.

Kubernetes calls these Nodes, and it schedules your workloads onto them in units called Pods. Those Pods are the backbone of our applications.

Application Layer

If you've come here to read this article this will be the most interesting part for you. If you're a software engineer you are most likely deploying applications, maybe the occasional database or cache.

Kubernetes handles applications through a specification within your Pod called containers. These are your Docker containers, which generally map to the application you want to run. In our case, we'll have an RShiny container. A single Pod can run one or more containers, meaning it can have one or more applications. There are all kinds of fanciness in Kubernetes where you can tag particular Pods to correspond with particular applications. We are not going to talk about any of that, but it's nice to know that if you need that level of control it's there for you.
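As a sketch, a bare-bones Pod spec running the rocker/shiny image could look like the following. The names and labels here are illustrative, and in practice the helm chart generates a Deployment that manages Pods like this for you.

```yaml
# pod.yaml -- a bare-bones Pod sketch with a single RShiny container.
# Names and labels are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: rshiny
  labels:
    app: rshiny            # Services find Pods by labels like this one
spec:
  containers:
    - name: rshiny
      image: rocker/shiny:latest
      ports:
        - containerPort: 3838   # shiny-server's default port
```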

Services Layer

Now, this is where we get our application to talk to the world! Or not. You may only want your application to be able to communicate with other applications inside your Kubernetes cluster, in which case you wouldn't want to expose it publicly.

This layer builds on the application layer. You tell Kubernetes to create a Service for you, and tell it which ports to expose from which Pods. Then you can talk to your applications from outside of the Kubernetes cluster.
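A minimal Service sketch for the Pod above might look like this; the selector and names are illustrative and must match the labels on your Pods.

```yaml
# service.yaml -- a minimal Service sketch exposing the rshiny Pod.
# Selector and names are illustrative.
apiVersion: v1
kind: Service
metadata:
  name: rshiny
spec:
  type: LoadBalancer       # ask AWS for an external load balancer
  selector:
    app: rshiny            # route traffic to Pods carrying this label
  ports:
    - port: 80             # port exposed to the outside world
      targetPort: 3838     # port the shiny container listens on
```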

Back to RShiny

Ok, now that we've talked a bit about Kubernetes, let's get back to our RShiny deployment!

What we're going to do here is create an EKS cluster with Terraform, an infrastructure-as-code tool with all kinds of recipes for all kinds of awesomeness, and then we're going to deploy our RShiny helm chart.

Setup our Terraform Configuration and Helm Chart

Remember how I said Terraform has all kinds of precooked infrastructure just ready and waiting for you? Well, we're going to steal that!

Let's take a look at the basic example from the terraform-aws-modules GitHub org. There's lots of great stuff in there, so check it out!

Here's the directory structure:

project/
    Dockerfile
    eks/
        main.tf
        variables.tf
        outputs.tf
        helm_charts/
            rshiny-eks/
    terraform-state/
        main.tf

Create it with:

export CWD=$(pwd)
mkdir -p project/eks/helm_charts
mkdir -p project/terraform-state
cd project/eks
wget https://raw.githubusercontent.com/terraform-aws-modules/terraform-aws-eks/master/examples/basic/variables.tf
wget https://raw.githubusercontent.com/terraform-aws-modules/terraform-aws-eks/master/examples/basic/outputs.tf
wget https://raw.githubusercontent.com/terraform-aws-modules/terraform-aws-eks/master/examples/basic/main.tf
cd helm_charts
wget https://dabble-of-devops-helm-charts.s3.amazonaws.com/rshiny-eks-0.1.0.tgz
tar -xvf rshiny-eks-0.1.0.tgz
cd ${CWD}

Go take a look at the project/eks/main.tf file. Search for locals, and change those variable names to something that makes sense for your project.

Install anything to my local system? I'm not doing that!

I don't install anything locally on my computer beyond my IDE and Docker, and I recommend that you don't either! Instead, for each project I create a Dockerfile that has what I need for that project. You can either install Terraform and the AWS CLI locally, or you can just use this Dockerfile.

FROM continuumio/miniconda3:latest

# This image is just to get the various cli tools I need for the aws eks service
# AWS CLI - Whatever the latest version is
# AWS IAM Authenticator - 1.12.7
# Kubectl - 1.12.7

RUN apt-get update -y; apt-get upgrade -y; \
    apt-get install -y curl vim-tiny vim-athena jq

WORKDIR /tmp

ENV PATH=/root/bin:$PATH
RUN echo 'export PATH=$HOME/bin:$PATH' >> ~/.bashrc
RUN echo 'alias l="ls -lah"' >> ~/.bashrc

RUN pip install --upgrade ipython awscli troposphere typing boto3 paramiko

# Install clis needed for kubernetes + eks

RUN curl -o aws-iam-authenticator \
    https://amazon-eks.s3-us-west-2.amazonaws.com/1.12.7/2019-03-27/bin/linux/amd64/aws-iam-authenticator
RUN chmod +x ./aws-iam-authenticator

RUN mkdir -p ~/bin && cp ./aws-iam-authenticator ~/bin/aws-iam-authenticator

RUN curl -o kubectl \
    https://amazon-eks.s3-us-west-2.amazonaws.com/1.12.7/2019-03-27/bin/linux/amd64/kubectl
RUN chmod +x ./kubectl
RUN mv ./kubectl ~/bin/kubectl

WORKDIR /tmp

RUN curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
RUN chmod 700 get_helm.sh
RUN ./get_helm.sh

# Get terraform

RUN wget https://releases.hashicorp.com/terraform/0.12.20/terraform_0.12.20_linux_amd64.zip \
 && unzip terraform_0.12.20_linux_amd64.zip \
 && mv terraform /usr/local/bin \
 && rm terraform_0.12.20_linux_amd64.zip

WORKDIR /root

# If you want to copy your aws credentials to the container for it
# I prefer to just bind it as a volume
#RUN mkdir -p /root/.aws
#COPY config /root/.aws/config
#COPY credentials /root/.aws/credentials

Put this in your project folder.

docker build -t eks-terraform .
# Then just drop into a shell
docker run -it \
    -v "$(pwd)":/project \
    -v "$(pwd)/.kube":/root/.kube \
    -v "$(pwd)/.aws":/root/.aws \
    eks-terraform bash

A quick note about your AWS credentials: I prefer to keep a copy of my AWS credentials in a folder per project instead of in my ${HOME}. If you keep your AWS credentials in ${HOME}, then use that as the volume instead.
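For example, if your credentials and kubeconfig live in the usual spots under ${HOME}, the docker run from above might look like this (a sketch; the eks-terraform image name comes from the build step above):

```shell
# Same shell as before, but mounting the default ~/.aws and ~/.kube
# directories instead of per-project copies.
docker run -it \
    -v "$(pwd)":/project \
    -v "${HOME}/.kube":/root/.kube \
    -v "${HOME}/.aws":/root/.aws \
    eks-terraform bash
```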

Set up our Terraform State

Terraform has an internal tracking method called state. It uses this to keep track of which resources have been created or destroyed, whether something errored, and so on. It can keep track of its state with a local file, but I'd rather keep the state in an S3 bucket. This is a layer of future-proofing for deploying from CI, or for having multiple people working on the same project.

We can get terraform to create the necessary resources.

# project/terraform-state/main.tf
# Change the PREFIX to something that makes sense
# And the AWS_REGION to the region you want to deploy in
variable "prefix" {
  type = string
  default = "PREFIX"
}

data "aws_region" "current" {}

provider "aws" {
  version = "~> 2.0"
  region  = "MY_AWS_REGION"
}

resource "aws_s3_bucket" "terraform-state" {
  bucket = "${var.prefix}-terraform-state"
  acl = "private"
  region = data.aws_region.current.name

  versioning {
    enabled = true
  }

  tags = {
    Name = "PREFIX"
  }
}

resource "aws_dynamodb_table" "terraform-state-lock" {
  name = "${var.prefix}-terraform-state-lock"
  read_capacity  = 1
  write_capacity = 1
  hash_key = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}

Use the Terraform State in our EKS project

Now we'll create an aws.tf in our project/eks and point that to the state we created.

# project/eks/aws.tf
provider "aws" {
  version = "~> 2.0"
  region = "us-east-1"
}

data "aws_region" "current" {}

data "aws_availability_zones" "available" {
}

variable "prefix" {
  type = string
  default = "PREFIX"
}

# Variables are not allowed in the terraform block,
# so the PREFIX here has to be hardcoded to match
# the bucket and table created above

terraform {
  backend "s3" {
    bucket = "PREFIX-terraform-state"
    key = "terraform/terraform-dev.tfstate"
    region = "us-east-1"
    encrypt = true
    dynamodb_table = "PREFIX-terraform-state-lock"
  }
}

Deploy!

Now that we have our configuration all set up, and our prefixes named something that makes sense (and is not stuff), we can start to deploy!

cd project/terraform-state
terraform init; terraform refresh; terraform apply
# If you want this on a CI use this instead
#terraform init; terraform plan; terraform apply -auto-approve
cd ../eks
terraform init; terraform refresh; terraform apply

The first time you run this it will take a while. Like, go-make-yourself-a-snack kind of a while.

Once it's done we'll want to update our kubectl config to let it know about our shiny new Kubernetes cluster!

When Terraform runs, it has a handy feature called outputs. This is especially useful for configuration commands that depend upon resource IDs. For more information, look at project/eks/outputs.tf, where you will see several examples.

You'll find an output called kubectl_config. Run it to point kubectl to your new cluster.
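If you want to poke at the outputs yourself, they're all available through the Terraform CLI, run from the directory that holds the state (the exact output names depend on the version of the terraform-aws-eks example you grabbed, so check what yours defines):

```shell
cd project/eks
# List every output the module defines
terraform output
# Print a single output by name
terraform output kubectl_config
```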

It will also benefit you to know how to do this manually, so what you do is:

aws eks --region $AWS_REGION update-kubeconfig --name $NAME
# Where NAME is the cluster name defined in project/eks/main.tf
# locals {
#  cluster_name = "test-eks-${random_string.suffix.result}"
# }

Install the Helm Chart

Alright! We've got our cluster! We've told kubectl all about our cluster. Now what do we want? RShiny!

helm dep up project/eks/helm_charts/rshiny-eks
helm upgrade --install rshiny project/eks/helm_charts/rshiny-eks \
    --set service.type=LoadBalancer --wait

That command will take a bit of time, but this is more of a tea break rather than a snack break.

Check out our new RShiny App on AWS

That was a mission, but here we are! To make sure everything is all set let's run some commands to see what's happening on our EKS cluster.

Pods

This will list the pods that are up with the rshiny app running. They should say something like 'Init' or 'Running'. Anything like 'Failed' or 'CrashLoopBackOff' is bad and means something went sideways.

kubectl get pods |grep rshiny
# Kubectl has a json output option
# Between kubectl -o json and the command line tool jq you can get whatever you want!
export RSHINY_POD=$(kubectl get pods -o json | jq -r '.items[] | select( .metadata.labels["app.kubernetes.io/name"]=="rshiny") ' | jq -r '.metadata.name' )
echo $RSHINY_POD
kubectl describe pod $RSHINY_POD
kubectl logs $RSHINY_POD

Troubleshooting your application

The easiest way to troubleshoot anything is to just drop into a shell.

kubectl exec -it ${RSHINY_POD} -- bash

If you can't even get your container to start, try going into project/eks/helm_charts/rshiny-eks/templates/deployment.yaml and adding a sleep command so the container stays alive long enough to inspect.

# project/eks/helm_charts/rshiny-eks/templates/deployment.yaml
      containers:
        - name: {{ .Chart.Name }}
          # Add in a sleep command here
          command: ["sleep"]
          args: ["1h"]

And then redeploy with helm.

helm upgrade --install rshiny project/eks/helm_charts/rshiny-eks \
    --set service.type=LoadBalancer --wait

Then see what you can see.

Services

Now, let's go and check out our web service!

kubectl get svc | grep rshiny

Look for the LoadBalancer entry. Either there will be a web address or it will say pending. If it's pending, go make some more tea and come back. Once it's up, you'll see the address. Grab that and throw it in a browser. If you don't see your application right away, don't start freaking out just yet. Wait a little and it will come up!
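If you'd rather grab that address from the command line, kubectl's JSON output plus jq works here too. The Service name "rshiny" is an assumption; match it to whatever `kubectl get svc` shows for your release.

```shell
# Dump the Service as JSON, then pull out the external hostname that
# AWS assigned to the load balancer.
kubectl get svc rshiny -o json > rshiny-svc.json
SVC_HOST=$(jq -r '.status.loadBalancer.ingress[0].hostname' rshiny-svc.json)
echo "http://${SVC_HOST}"
```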

Wrap Up

That's it. You should now have an example RShiny app on AWS that you can build on! Try it out and let me know how it goes!

If you have any questions or would like to request a tutorial topic please reach out to me at jillian@dabbleofdevops.com. Happy teching!
