--- title: "Creating a Singularity Container to Run HuggingFace Transformers Models in R" output: github_document #rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Singularity_Container} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` [Singularity](https://apptainer.org/) is a container engine alternative to Docker. Singularity containers are well suited for the requirements of High Performance Computing (HPC) workloads. A container contains all code as well as *all* its dependencies so that the an application runs reliably on different computers (or different computing environments). It can be used to run on servers or as a way to ensure computational reproducibility (that the code run on other systems, and in the future). For an introduction to the concept of containers see [Computational Reproducibility via Containers in Psychology](https://open.lnu.se/index.php/metapsychology/article/view/892/). Below is code to build a Singularity container for setting up transformers language models from HuggingFace and running the ```text```-package. ## Code to build a singularity container with HuggingFace models in R ``` Bootstrap: docker From: ubuntu:20.04 %environment export LANG=C.UTF-8 LC_ALL=C.UTF-8 export XDG_RUNTIME_DIR=/tmp/.run_$(uuidgen) %post # Install apt-get -y update export R_VERSION=4.2.2 echo "export R_VERSION=${R_VERSION}" >> $SINGULARITY_ENVIRONMENT # Install R apt-get update apt-get install -y --no-install-recommends software-properties-common dirmngr wget uuid-runtime wget -qO- https://cloud.r-project.org/bin/linux/ubuntu/marutter_pubkey.asc | \ tee -a /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc add-apt-repository \ "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/" apt-get install -y --no-install-recommends \ r-base=${R_VERSION}* \ r-base-core=${R_VERSION}* \ r-base-dev=${R_VERSION}* \ r-recommended=${R_VERSION}* \ r-base-html=${R_VERSION}* \ r-doc-html=${R_VERSION}* \ libcurl4-openssl-dev \ libharfbuzz-dev \ libfribidi-dev \ libgit2-dev \ libxml2-dev \ libfontconfig1-dev \ libssl-dev \ libxml2-dev \ libfreetype6-dev \ libpng-dev \ libtiff5-dev \ libjpeg-dev # Add a default CRAN mirror echo "options(repos = c(CRAN = 'https://cran.rstudio.com/'), download.file.method = 'libcurl')" >> /usr/lib/R/etc/Rprofile.site # Fix R package libpaths (helps RStudio Server find the right directories) mkdir -p /usr/lib64/R/etc echo "R_LIBS_USER='/usr/lib64/R/library'" >> /usr/lib64/R/etc/Renviron echo "R_LIBS_SITE='${R_PACKAGE_DIR}'" >> /usr/lib64/R/etc/Renviron # Clean up rm -rf /var/lib/apt/lists/* # Install python3 apt-get -y install python3 wget apt-get -y clean # Install Miniconda cd / wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh bash Miniconda3-latest-Linux-x86_64.sh -b -p /miniconda /bin/bash <