Docker

Mark Edmondson

2019-05-04

Dockerfiles

googleComputeEngineR has a lot of integration with Docker, using it to launch custom pre-made images via the gce_vm_container and gce_vm_template commands.

Get an overview on how to install containers on Google Compute Engine here.

Use Dockerfiles to define the environment you want the VM to run, including the R packages you want installed. As an example, this Dockerfile installs the R packages needed for a Shiny app:

FROM rocker/shiny
LABEL maintainer="Mark Edmondson (r@sunholo.com)"

# install R package dependencies
RUN apt-get update && apt-get install -y \
    libssl-dev \
    ## clean up
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/ \
    && rm -rf /tmp/downloaded_packages/ /tmp/*.rds
    
## Install packages from CRAN
RUN install2.r --error \
    -r 'http://cran.rstudio.com' \
    googleAuthR \
    && Rscript -e "devtools::install_github(c('MarkEdmondson1234/googleID'))" \
    ## clean up
    && rm -rf /tmp/downloaded_packages/ /tmp/*.rds

## assume shiny app is in build folder /shiny
COPY ./shiny/ /srv/shiny-server/myapp/

The COPY command copies from a folder in the same location as the Dockerfile and places it within the /srv/shiny-server/ folder, which is the default location for Shiny apps. This location means that the Shiny app will be available at xxx.xxx.xxx.xxx/myapp/

The example Dockerfile above installs googleAuthR from CRAN, googleID from GitHub, and libssl-dev, a Debian dependency that googleAuthR needs, via apt-get. Modify this for your own needs.

Google Container Registry

Google Cloud comes with a private container registry where you can store public or private Docker images. It is distinct from the more commonly used Docker Hub, where most public Docker images sit.

Container names usually come in the format gcr.io/your-project-id/your-image-id

You can use this name directly, or create the correct name for a hosted image via gce_tag_container(). By default it uses the project you are in; change the project argument if necessary, for example for the public images available in gcer-public at this URL: https://console.cloud.google.com/gcr/images/gcer-public?project=gcer-public
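To make the naming scheme concrete, here is a minimal base R sketch. The helper registry_image_name() is hypothetical and not part of googleComputeEngineR; it only mirrors the gcr.io/your-project-id/your-image-id pattern described above, and in practice you would call gce_tag_container() instead.

```r
# Hypothetical helper, not part of googleComputeEngineR: it only illustrates
# the <registry>/<project>/<image> naming pattern.
registry_image_name <- function(image, project, registry = "gcr.io") {
  # Each part must be a single non-empty string
  stopifnot(
    is.character(image), length(image) == 1, nzchar(image),
    is.character(project), length(project) == 1, nzchar(project)
  )
  paste(registry, project, image, sep = "/")
}

registry_image_name("test-trigger", project = "gcer-public")
# [1] "gcr.io/gcer-public/test-trigger"
```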

Access and push to your registry via the gce_pull_registry() and gce_push_registry() functions.

You can use this to save a custom image with your specific dependencies, which can then also be used as a base for other custom images.
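As a rough sketch of that save step, the wrapper below pushes a running container to the registry and returns the name it can be referenced by later. The wrapper save_custom_image() is hypothetical, and the argument names save_name and container_name are taken from my reading of the package documentation, so check them against ?gce_push_registry before relying on this. It assumes vm is a running instance (for example the result of gce_vm()) and that a default project has been set.

```r
# Hypothetical wrapper around the package's registry functions.
# Assumes `vm` is a running instance with a container called `container_name`.
save_custom_image <- function(vm, save_name, container_name) {
  # Commit the running container and push it to the project's registry
  googleComputeEngineR::gce_push_registry(
    vm,
    save_name = save_name,
    container_name = container_name
  )
  # Return the registry name of the saved image (uses the default project)
  googleComputeEngineR::gce_tag_container(save_name)
}
```

The returned name could then be passed as the dynamic_image argument of gce_vm(), or used in the FROM line of another Dockerfile.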

Build Triggers

Build triggers are a feature of the Google Container Registry that let you build the Docker image when you push to a public or private Git repository, such as one hosted on GitHub or Bitbucket.

A typical workflow is to construct your Dockerfile locally, push it to, say, GitHub, which triggers a build, then use that build when launching a VM via the dynamic_image argument in gce_vm().

This lets you use version control on your R environments, and provides a useful way to always know exactly what dependencies are running. The VMs built using your dynamic image can be stopped with their Docker versions intact, and you can try out new development versions by launching another VM.

Build Trigger Setup example

  1. The example Dockerfile is a simple one that uses a Tidyverse Docker image (this has RStudio as well) and installs the secret package. Customise this to your needs, perhaps by creating your own Dockerfile with help from containerit.
FROM rocker/tidyverse

# Install secret
RUN install2.r --error \
        secret 
  2. The Dockerfile is pushed to GitHub (this example is in the googleComputeEngineR GitHub here)
  3. Log in to the Google Cloud console, visit Build triggers, and make a trigger with these settings:

Note the image name, folder to build from and that the tag is set to latest.

This should then be shown in the build trigger console like this:

This is in the public googleComputeEngineR image project, but should look similar for your own.

  4. We now launch the VM with the custom image. Since it is an RStudio based template, we define template to be “rstudio” and supply a username and password. If it were a Shiny based Dockerfile, the template would be changed to “shiny”, etc.
library(googleComputeEngineR)
## auto auth messages
## set default project ID to xxxxxx
## set default zone to xxxxx

vm <- gce_vm("im-rstudio", 
              predefined_type = "n1-standard-1", 
              template = "rstudio", 
              username = "test", 
              password = "test1234", 
              dynamic_image = gce_tag_container("test-trigger", project = "gcer-public"))

Note that I set the project to the public one hosting the image, as it's different from the paid project I put the VM into; in your case they may be the same.

The image name is given by the gce_tag_container() function.

gce_tag_container("test-trigger", project = "gcer-public")
# [1] "gcr.io/gcer-public/test-trigger"

You can use this function or pass in the image name directly as listed in the Build trigger console.

  5. After a few minutes, visit the IP address of the instance, which should have RStudio running with the secret library installed:

Public Docker images

The FROM field in the Dockerfile can be an image that you or someone else has already created, allowing you to layer on top. The first example above is available via a public Google Container Registry called gcer-public, made for this purpose, which you can see here: https://console.cloud.google.com/gcr/images/gcer-public?project=gcer-public