My favorite way to deploy RShiny locally is to simply package it into a docker image and run it. Running any application in docker makes it easily transportable, and is a generally acceptable way of distributing applications in 2019.
This solution does not require rshiny-server or shinyapps.io, not because I have anything against either of those solutions. I just tend to stick to a few favorite deployment methods to keep my head from spinning straight off my body. ;-)
If you're not familiar with docker then I have a FREE course available. The first module is plenty to
get you up and running and can be completed in an hour or two. For the most part, if you can use a command line you can use docker.
Now that we've covered some housekeeping let's get started building your docker image. Like any project, you want to have all your relevant code in a directory. This way it is accessible to the docker build process.
I've been having a blast the last few months learning ApacheAirflow. It's become an indispensable tool in my getting stuff done toolbox.
Follow me on Twitter @jillianerowe, or send me a message with questions, topic suggestions, ice cream recommendations, or general shenanigans.
AWS Elastic Kubernetes Service (EKS) is a fully managed service AWS launched recently. Elastic services in AWS it means that the number of instances actually in use scales up or down based on the demand. This is first of all seriously cool, and second of all can cut down on costs. Fewer requests? Fewer nodes!
I'm just getting started with Kubernetes myself, and going through this walkthrough was a great learning exercise.
I love deploying applications with docker swarm because it's fairly simple and I already know it, however, Swarm for AWS has some downsides. Firstly, it is not elastic, secondly, in order to get sticky sessions you need to add an additional service such as Traefik. With session affinity you can deploy RShiny and Python Dash applications with no other functionality besides the built in, and that's amazing!
I also personally think the moving towards Kubernetes over Swarm. It even comes installed on my mac version of Docker. Now is a great time to get...
If you've read anything on this blog you will know I am all about docker and I am all about world domination with distributed computing, particularly with Python Applications.
This week marks something new for me. I've written plenty of documentation, oodles of Power Point presentations with accompanying in-person training. This is the first for me to start recording myself for video, and it was an exciting start! It feels a bit silly, because there are probably millions of people on YouTube or other video formats by now, but it was a first for me.
If you want to see a deeper dive on building python applications in Docker containers check out my Develop a Python Application in Docker and Deploy to AWS series.
Please watch, comment, share, and let me know what you think!
During the previous parts in this series, I introduced Apache Airflow in general, demonstrated my docker dev stack, and built out a simple linear DAG definition. I want to wrap up the series by showing a few other common DAG patterns I regularly use.
In order to follow along, get the source code!
unzip airflow-template.zip cd airflow-template docker-compose up -d docker-compose logs airflow_webserver
This will take a few minutes to get everything initialized, but once its up you will see something like this:
If you've read this far you should have a reasonable understanding of the Apache Airflow layout and be up and running with your own docker dev environment. Well done! This part in the series will cover building an actual simple pipeline in Airflow.
Start building by getting the source code!
The simplest DAG is simply having a list of tasks, where each task depends upon its previous task. If you've spun up the airflow instance and taken a look, it looks like this:
Now, if you're asking why I would choose making an ice cream sundae as my DAG, you may need to reevaluate your priorities.
Generally, if you order ice cream, the lovely deliverer of the ice cream will first as you what kind of cone (or cup, you heathen) you want, then your flavor (or flavors!), what toppings, and then will put them all together into sweet, creamy, cold, deliciousness.
You would accomplish this awesomeness with the following Airflow code:
Briefly, Apache Airflow is a workflow management system (WMS). It groups tasks into analyses, and defines a logical template for when these analyses should be run. Then it gives you all kinds of amazing logging, reporting, and a nice graphical view of your analyses. I'll let you hear it directly from the folks at Apache Airflow
Apache Airflow is a platform to programmatically author, schedule and monitor workflows.
Use airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The airflow scheduler executes your tasks on an array of workers while following the specified dependencies. Rich command line utilities make performing complex surgeries on DAGs a snap. The rich user interface makes it easy to visualize pipelines running in production, monitor progress, and troubleshoot issues when needed.
When workflows are defined as code, they become more maintainable, versionable, testable, and collaborative.
Source - ...
In this part of the series I will cover how to get a nice Apache Airflow instance up and running with docker. You won't need to have anything installed locally besides docker, which is fantastic, because configuring all these pieces individually would be kind of awful!
This is the exact same setup and configuration I use for my own Apache Airflow instances. When I run Apache Airflow in production I don't use Postgres in a docker container, as that is not recommended, but this setup is absolutely perfect for dev and will very closely match your production requirements!
Following along with a blog post is great, but the best way to learn is to just jump in and start building. Get the Apache Airflow Docker Dev Stack here.
Getting an instance Apache Airflow up and running looks very similar to a Celery instance. This is because Airflow uses Celery behind the scenes to execute tasks. Read more...
Every time I want to get started with new tech I figure out how to get a stack up and running that closely resembles a real-world production instance as much as possible.
This is a get up and running post. It does not get into the nitty gritty details of developing with Spark, since I am only just getting comfortable with Spark myself. Mostly I wanted to get up and running, and write a post about some of the issues that came up along the way.
Spark is a distributed computing library with support for Java, Scala, Python, and R. It's what I refer to as a world domination technology, where you want to do lots of computations, and you want to do it fast. You can run computations from the embarrassingly parallel, such as parallelizing a for loop to complex workflows, and support for distributed machine learning as well. You can transparently scale out your computations to not only multiple cores, but even multiple machines by creating a spark cluster....
In Part 1 of this series we went over the Celery Architecture, how to separate out the components in a docker-compose file, and laid the ground for deployment.
This portion of the blog post assumes you have a ssh key setup. If you don't go to the AWS docs here.
AWS CloudFormation is an infrastructure design tool that allows users to design their infrastructure by defining file systems, compute requirements, networking, etc. If you have no interest in designing infrastructure, y0u probably don't need to worry. Cloudformation configurations are shareable through templates.
Docker has come to our rescue here, with a Docker for AWS CloudFormation template. This will, with the click of a few buttons, deploy a docker swarm on AWS for us!!
Click on the page, and scroll down to quick start. Under 'Stable Channel' select '...