Guix Introduction Part 5: Per-Project Dependency Management
This is the fifth part of a brief introduction to the Guix functional package manager, and how it can be used to manage the dependencies of projects, much like virtual environments for Python, but with a much larger scope.
This time we show a little demo of managing per-project dependencies with Guix, through the guix time-machine and guix environment commands.
Demo settings
Suppose we have two projects, ~/guix_demo/proj1 and ~/guix_demo/proj2, each with its own channel file channels.scm and manifest file pkgs.scm, so that the two projects use different versions of R (proj1 at R 3.6.3, proj2 at R 4.0.3) and different R packages (at different versions).
Below we spawn two shells that allow us to work on the two projects at the same time, and we show most of the output. If you try the commands, you should get the same outputs, except possibly the locale output from R which depends on your local setting.
Sample files
You may either manually create the following files, or clone the git repository https://github.com/peterloleungyau/guix_demo created for convenience (you should check that the files are the same as below).
If you are in Guix but somehow do not have git installed yet, you may install it and then clone the repository by:
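A minimal sketch of these two steps, assuming git is installed into your default Guix profile and the repository is cloned to ~/guix_demo so that the paths used below match. The function name fetch_guix_demo is hypothetical and only groups the two commands:

```shell
# Hypothetical helper: install git with Guix, then clone the demo repository.
# Assumption: the clone target is ~/guix_demo, matching the paths in this post.
fetch_guix_demo() {
    guix install git &&
        git clone https://github.com/peterloleungyau/guix_demo "$HOME/guix_demo"
}
```

Calling fetch_guix_demo runs both commands in order and stops if the installation fails. You can equally type the two commands by hand.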
- First project
  - ~/guix_demo/proj1/channels.scm:

        (list (channel
                (name 'guix)
                (url "https://git.savannah.gnu.org/git/guix.git")
                (commit "89909327d017198969436237acc7c93823ff8147")))

  - ~/guix_demo/proj1/pkgs.scm:

        (specifications->manifest
         '(
           ;; R
           "r"
           "r-yaml"
           "r-xgboost"
           "r-jsonlite"
           ))

  - ~/guix_demo/proj1/fit1.R:

        library(xgboost)

        data(iris)

        is_setosa <- as.numeric(iris$Species == "setosa")
        iris_x <- as.matrix(iris[c("Sepal.Length", "Sepal.Width",
                                   "Petal.Length", "Petal.Width")])
        iris_data <- xgb.DMatrix(data = iris_x, label = is_setosa)

        param <- list(max_depth = 3, eta = 1, nthread = 1,
                      objective = "binary:logistic", eval_metric = "auc")

        fit <- xgboost(params = param, data = iris_data, nrounds = 10)

        print(fit)

- Second project
  - ~/guix_demo/proj2/channels.scm:

        (list (channel
                (name 'guix)
                (url "https://git.savannah.gnu.org/git/guix.git")
                (commit "9904a15a4c838362673c1affdbaf1e83d92fe8ff")))

  - ~/guix_demo/proj2/pkgs.scm:

        (specifications->manifest
         '(
           ;; R
           "r"
           "r-glmnet"
           "r-jsonlite"
           ))

  - ~/guix_demo/proj2/fit2.R:

        library(glmnet)

        data(iris)

        is_setosa <- as.numeric(iris$Species == "setosa")
        iris_x <- as.matrix(iris[c("Sepal.Length", "Sepal.Width",
                                   "Petal.Length", "Petal.Width")])

        fit <- glmnet(x = iris_x, y = is_setosa, family = "binomial")

        summary(fit)
        print(fit)

Open a shell for the first project
First open a new shell with Guix (could be the Guix package manager on a Linux distribution, or the Guix system) to create an environment for proj1:
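A minimal sketch of this step, assuming the same guix time-machine and guix environment pattern that appears in the Jenkins section of this post. The helper name in_guix_project is hypothetical; it only saves retyping the long command:

```shell
# Hypothetical helper: run a command inside the environment pinned by a
# project's channels.scm (Guix revision) and pkgs.scm (package manifest).
in_guix_project() {
    project_dir=$1
    shift
    # Use a subshell so the caller's working directory is left unchanged.
    (
        cd "$project_dir" || exit 1
        guix time-machine -C channels.scm -- environment -m pkgs.scm -- "$@"
    )
}
```

With this defined, in_guix_project ~/guix_demo/proj1 R starts R for the first project. The direct form is to cd into ~/guix_demo/proj1 and run guix time-machine -C channels.scm -- environment -m pkgs.scm -- R.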
Then wait a while for the R prompt to appear. The first time executing the above may take quite some time to download and/or build the packages. Subsequent runs should be much faster after the needed packages are cached.
In the R REPL, we load the few packages and show their versions.
Open another shell for the second project
Then, while keeping the first shell open at the R prompt, we open a new shell and do the same for proj2.
In the R REPL, we also load the packages and show the versions.
You may try running some R commands in both shells to see that they are running at the same time.
Close both shells
Since we ran R directly when creating the guix environment, quitting R with q() also exits the spawned shell. So we can now exit both of the shells spawned above.
If you now re-run the above commands, it should be much faster, because the needed packages are cached, and Guix needs only to check that the packages needed are there.
Run R script for the first project
We also run an R script:
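A minimal sketch of running a script non-interactively, assuming Rscript is the runner (it should come with the "r" package in the manifest, but that is an assumption here, as is the helper name run_project_script):

```shell
# Hypothetical helper: run an R script inside a project's pinned environment.
# Assumption: Rscript is available in the environment built from pkgs.scm.
run_project_script() {
    project_dir=$1
    script=$2
    # Use a subshell so the caller's working directory is left unchanged.
    (
        cd "$project_dir" || exit 1
        guix time-machine -C channels.scm -- \
            environment -m pkgs.scm -- Rscript "$script"
    )
}
```

For the first project this would be called as run_project_script ~/guix_demo/proj1 fit1.R, and for the second project as run_project_script ~/guix_demo/proj2 fit2.R.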
Run R script for the second project
We also run the R script fit2.R for the second project.
A possible way to manage dependency using Guix on Jenkins
We briefly discuss using Guix to help manage dependencies for jobs running on Jenkins.
- Basic ideas:
  - the Guix package manager can be installed alongside Jenkins on a Linux distribution
  - Jenkins can be a single master node, or one master node with one or more worker nodes
  - each master or worker node that may need to run jobs needs to have the Guix package manager installed
    - if private channels are used, they need to be configured on each master or worker node
  - optionally, one or more nodes can also act as a substitute server by running guix publish
    - the Guix on each master or worker node would need to be configured to use this extra substitute server
    - if there is one node in the same network used for development (e.g. a VM in an AWS Virtual Private Cloud), it is an ideal candidate to serve as the substitute server: during development, the packages needed for a project have to be downloaded or built anyway, so they can be shared with the worker nodes when the script is later run in batch mode, which saves rebuilding the packages on the worker nodes
  - the dependencies of each project (or job) are recorded in the associated git repository, e.g. by having one manifest file pkgs.scm and one channels file channels.scm
  - the Jenkins job can run guix time-machine and guix environment as discussed above:

        guix time-machine -C channels.scm -- environment -m pkgs.scm -- THE_COMMAND_HERE

- Benefits:
  - due to the reproducibility of Guix, we can be confident that the packages used in development are the exact same versions as in the batch jobs
  - the dependencies of each project are also version controlled in the project git repository
  - this method is hassle-free:
    - no need to manually install the needed dependencies on the worker nodes
    - no risk of “forgetting” to update the dependencies on some worker nodes
    - there will be no dependency conflicts between different projects, because each job is run in its own guix environment
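The Jenkins idea above can be sketched as an "Execute shell" build step. This is only an illustration under stated assumptions: the job has already checked out the project repository into $WORKSPACE (a variable Jenkins sets for each build), SUBSTITUTE_URLS is a hypothetical job parameter naming the extra substitute server, and the function name run_in_pinned_env is made up for this example. The --substitute-urls option is one of Guix's common build options; substitutes from the extra server are only accepted if its signing key has been authorized on the node (for example with guix archive --authorize).

```shell
# Hypothetical Jenkins build step: run the job's command inside the
# environment pinned by the repository's channels.scm and pkgs.scm.
run_in_pinned_env() {
    # Assumption: Jenkins exports WORKSPACE; fall back to the current directory.
    cd "${WORKSPACE:-.}" || return 1
    if [ -n "${SUBSTITUTE_URLS:-}" ]; then
        # Also fetch pre-built packages from the extra substitute server(s).
        guix time-machine -C channels.scm -- \
            environment --substitute-urls="$SUBSTITUTE_URLS" -m pkgs.scm -- "$@"
    else
        guix time-machine -C channels.scm -- environment -m pkgs.scm -- "$@"
    fi
}
```

A job for the first demo project could then end with run_in_pinned_env Rscript fit1.R. Keeping the channels and manifest files in the repository means the build step itself never has to name package versions.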
What’s next?
In this part we showed a little demo of using Guix to manage per-project dependencies using the guix time-machine and guix environment commands, mainly for batch script execution. Next time we attempt to do the same per-project dependency management when you are developing locally (not necessarily on Linux) and connecting to a remote server (or a local VM) with Guix installed.
Author Peter Lo
LastMod 2021-05-13