Renku does not recognize conda/pip dependencies

Hi everyone,

After the latest Renku update a couple of days ago, I am no longer able to run my programs smoothly.

It looks like certain conda and pip packages (e.g. tensorflow-probability, tensorflow-addons and wandb) that are included in the environment.yml file are invisible to Renku (see screenshot below).

As a consequence, I am unable to use these packages, e.g. from within Python modules (“ModuleNotFoundError: No module named…”).

Looking at the logs of the Docker image build, it seems that all required packages got installed correctly.

Here are the Dockerfile and the environment.yml that I am currently using:

# For finding latest versions of the base image see
# https://github.com/SwissDataScienceCenter/renkulab-docker
# ARG RENKU_BASE_IMAGE=renku/renkulab-py:3.9-0.11.0
ARG RENKU_BASE_IMAGE=renku/renkulab-cuda-tf:11.2-tf-2.7-e19be6f
FROM ${RENKU_BASE_IMAGE}

# Add VSCode support for a more pleasant coding and debugging experience for .py files. More details about issues, comments, and automatically installing extensions on https://renku.discourse.group/t/using-visual-studio-code-in-renkulab-interactive-sessions/249.
RUN curl -s https://raw.githubusercontent.com/SwissDataScienceCenter/renkulab-docker/master/scripts/install-vscode.sh | bash

# Uncomment and adapt if code is to be included in the image
# COPY src /code/src

# Uncomment and adapt if your R or python packages require extra linux (ubuntu) software
# e.g. the following installs apt-utils and vim; each pkg on its own line, all lines
# except for the last end with backslash '\' to continue the RUN line
#
# USER root
# RUN apt-get update && \
#    apt-get install -y --no-install-recommends \
#    apt-utils \
#    vim
# USER ${NB_USER}

# install the python dependencies
COPY requirements.txt environment.yml /tmp/
RUN conda env update -vv -f /tmp/environment.yml && \
    /opt/conda/bin/pip install -r /tmp/requirements.txt && \
    conda clean -y --all && \
    conda env export -n "root"

# Install pyoat library for OA forward transform and backward projections
RUN /opt/conda/bin/pip install git+https://github.com/berkanlafci/pyoat.git

# Install packages that allow to use Tensorboard on Jupyterlab
RUN /opt/conda/bin/pip install git+https://github.com/cliffwoolley/jupyter_tensorboard.git@tb-2.2-compat git+https://github.com/twalcari/jupyterlab_tensorboard.git
RUN /opt/conda/bin/pip install -U tensorboard-plugin-profile

# RENKU_VERSION determines the version of the renku CLI
# that will be used in this image. To find the latest version,
# visit https://pypi.org/project/renku/#history.
ARG RENKU_VERSION=1.5.0

########################################################
# Do not edit this section and do not add anything below

# Install renku from pypi or from github if it's a dev version
RUN if [ -n "$RENKU_VERSION" ] ; then \
        source .renku/venv/bin/activate ; \
        currentversion=$(renku --version) ; \
        if [ "$RENKU_VERSION" != "$currentversion" ] ; then \
            pip uninstall renku -y ; \
            gitversion=$(echo "$RENKU_VERSION" | sed -n "s/^[[:digit:]]\+\.[[:digit:]]\+\.[[:digit:]]\+\(rc[[:digit:]]\+\)*\(\.dev[[:digit:]]\+\)*\(+g\([a-f0-9]\+\)\)*\(+dirty\)*$/\4/p") ; \
            if [ -n "$gitversion" ] ; then \
                pip install --force "git+https://github.com/SwissDataScienceCenter/renku-python.git@$gitversion" ;\
            else \
                pip install --force renku==${RENKU_VERSION} ;\
            fi \
        fi \
    fi

########################################################

name: "base"
channels:
  - defaults
dependencies:
  - deprecated
  - h5py==2.10.0
  - hdf5==1.10.6
  - matplotlib==3.5.1
  - numpy==1.22.2
  - scikit-image
  - scipy
  - tensorflow-probability
  - pip:
    - GitPython
    - scikit-learn
    - tensorflow-addons
    - tf_clahe
    - wandb

prefix: "/opt/conda"

Could you please help me out here?

Many thanks and best regards,
Oliver

Hi @oliverb, your Dockerfile and environment.yml files look good to me. I created a blank project with your exact config to try it, and could not reproduce your issue. From an interactive session, all packages are listed in conda and can be imported from python scripts:
[Screenshot from 2022-07-18 13-43-50]

Is it possible that there is an additional issue with your project?
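
For anyone who wants to run the same check from inside a session, a minimal sketch along these lines may help. It assumes the import names `tensorflow_probability`, `tensorflow_addons` and `wandb` for the packages mentioned above, and `find_missing` is only an illustrative helper name. It prints which interpreter is running and which modules that interpreter cannot locate, which helps rule out the case where a terminal or kernel is using a different Python than the one under /opt/conda.

```python
import importlib.util
import sys


def find_missing(module_names):
    """Return the names in module_names that the current interpreter cannot locate."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names; they differ from the conda/pip package names.
    wanted = ["tensorflow_probability", "tensorflow_addons", "wandb"]
    print("interpreter:", sys.executable)
    print("missing modules:", find_missing(wanted))
```

If the interpreter path is not the conda one, the packages may be installed correctly but simply not visible to the Python that is being used.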

Hi @cmdoret,

thank you for your reply.

To pin down possible issues with my project, I did the following. I created a new branch (let’s call it feature-fix) rooted at the commit that worked fine in the original branch (which we shall call feature), and then added the later commits by cherry-picking. After each set of new commits in feature-fix I checked whether the problem above would show up, but it did not. Eventually feature-fix was identical in content to feature, yet the former worked whereas the latter did not.
Based on this experiment, I rebuilt the Docker image of feature from scratch by adding the --no-cache flag to the docker build command (see the snippet of the .gitlab-ci.yml file below).

variables:
  GIT_STRATEGY: fetch
  GIT_SSL_NO_VERIFY: "true"
  GIT_LFS_SKIP_SMUDGE: 1

stages:
  - build

image_build:
  stage: build
  image: docker:stable
  before_script:
    - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN http://$CI_REGISTRY
  script: |
    CI_COMMIT_SHA_7=$(echo $CI_COMMIT_SHA | cut -c1-7)
    docker build --no-cache --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA_7 .
    docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA_7

After that, feature finally recognized the required set of conda and pip packages. However, as soon as I removed the flag and added new commits to feature, the problem appeared again.
It seems that Renku keeps loading stale Docker images unless they are rebuilt from scratch every time.
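
One way to check whether a cached build and a --no-cache build really differ is to dump the installed package versions in a session of each and compare them. The following is only a sketch using the standard library; `installed_versions` and `diff_versions` are hypothetical helper names, not part of Renku.

```python
import json
from importlib import metadata


def installed_versions():
    """Map each installed distribution name (lower-cased) to its version."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that carry no name
            versions[name.lower()] = dist.version
    return versions


def diff_versions(a, b):
    """Return {name: (version_in_a, version_in_b)} for every package that differs."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    # Run this in each session and save the output to a file for comparison.
    print(json.dumps(installed_versions(), indent=2, sort_keys=True))
```

Loading the two saved dumps with `json.load` and passing them to `diff_versions` lists the packages that are missing or have a different version in one of the images.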

What can be done in order to avoid the above undesired behavior?

Thank you and best regards.

Maybe one of the unpinned libraries (e.g. tensorflow-probability) was temporarily broken at the time the layer was built.

To avoid a faulty layer being recycled I guess you could either:

  1. Manually delete the stale images to force the use of a newer layer. This can be done by going to the container registry of your GitLab project and selecting the images to delete. If you consider it OK to delete all images, you could also delete the entire registry (called ‘Root image’), and it will be recreated the next time an image is pushed.
  2. Pin a working image in your project (e.g. the working image built using --no-cache). This tells Renku to always use a specific image for new sessions instead of using a new build at every commit. You will need to update the pinned image version if you modify the Dockerfile or environment.yml. See instructions in the corresponding docs section.

Best,
Cyril

The first solution worked fine. Thanks a lot for your help @cmdoret !
