40

R, Docker and Checkpoint: A Route to Reproducibility

 4 years ago
source link: https://www.tuicool.com/articles/QzqE3ez
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

I need to deploy Shiny on a Windows machine. I also need to use {checkpoint} for package management. Using Docker seems to be the only reasonable approach to Shiny on Windows. But how easy would it be to also factor {checkpoint} into this setup?

Only one reasonable way to find out: give it a try.

Below is the simple Dockerfile I used. Here are the fundamental components of what it does:

  • Derived from the R-3.6.1 image from rocker .
  • Create an environment variable CHECKPOINT_DATE with the snapshot date for {checkpoint} .
  • Install the {checkpoint} package.
  • Make a snapshot folder for {checkpoint} .
  • Add commands to .Rprofile which will load {checkpoint} and select the required snapshot.
  • Install a sample package under {checkpoint} .
FROM rocker/r-ver:3.6.1

ENV CHECKPOINT_DATE 2018-12-01

RUN R -e "install.packages('checkpoint')" && \
    mkdir -p /root/.checkpoint/${CHECKPOINT_DATE} && \
    echo "library(checkpoint); checkpoint('${CHECKPOINT_DATE}', scanForPackages = FALSE);" >~/.Rprofile && \
    R -e "install.packages('colorspace')"

After building the Docker image you’re ready to give this a whirl. There are (at least) two ways that you could use this:

{checkpoint}

Packages on the Image

Let’s look at the option of installing packages on the image first. The Dockerfile above already installs the {colorspace} package. Let’s test that out.

In the screen shot below the top panel shows the launch of the image and successful loading of the {colorspace} package. The lower panel connects a shell to the running container and lists the contents of the snapshot folder to confirm that the {colorspace} package is there.

U36ZrmZ.png!web

Packages on the Host

If you share a snapshot folder from the host with the container then you get a lot more flexibility.

In the screen shot below the top panel shows the launch of the image, where the ~/.checkpoint folder on the host is shared with the container. Now it’s possible to select any of the snapshots present on the host. For example, rather than choosing the 2018-12-01 snapshot installed on the image, we can now select the 2019-06-01 snapshot from the host.

raYnaeA.png!web

Of these two options the latter seems like a more flexible solution. If, however, your aim is to provide a Docker image with a complete (and reproducible) computational environment, then the former is definitely the way to go: it’s less flexible but the package versions are all locked down on the image.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK