I am using RenkuLab for a class and students asked me if they have to create a new environment every time, which I thought should not be the case as long as they don’t remove the environment they created. However, I found that even the environment I created for the demo is gone. Do environments get removed automatically now? It takes quite a while to create a new one, and it probably drains resources, so I wonder if there is a way to prevent environments from being removed. Thanks for your help!
hi @schymans,
thanks for your interest in Renku and for your question.
Environments on Renkulab are configured to stop automatically after 24 hours of inactivity.
Therefore, the way to keep an environment from being killed is to keep it active. In some browsers, having an open tab is enough; otherwise, performing an action in the notebook at least once every 24 hours should ensure the environment is not cleaned up.
Best,
Pamela
And regarding the part of your question not addressed by Pamela, the reason it takes a while to create an environment the first time is that the Docker image has to be built. But, once built, it is cached, and for most projects, future image builds are instantaneous*.
*There are cases where changes to a project can make it look like an image needs to be rebuilt when in fact that is not necessary. If you are in one of those situations, there is a feature that lets you “pin” the image used for a project. This can be especially useful in classes where everyone is going to be using the same image. With a pinned image, it only needs to be built once and is shared by everyone (as opposed to being built once for each user).
Thanks to both of you! I didn’t realise the docker images are cached even if the environment disappears from the list. How does the pinning work?
Oh, and another question: Each environment is built based on a specific commit, but then we can push additional commits from that environment. If we stop the environment and re-connect, will our new commits still be there, or do we need to pull them first? I guess that whenever we create a new environment, even if it is based on a cached Docker image, it would use the most recent commit as a base. Right?
Wow, I just tried it out. If I stop an environment with committed but not pushed changes, and then create a new one, I get a message about this, with an option to continue from there or reset.
For pinning a docker image to a project please check this post:
As for the additional commits, after they are pushed you have the option to start a new environment from the new commit (after the image has built successfully, or immediately if it is pinned).
You can always choose which commit/branch you want to start your environment from (see screenshot below, drop-down menu), but you can only start one environment per commit. It is also possible to pull the latest commits in an environment that started with an old commit.
Also note that the autosave feature is there as a backup, but we still encourage regular pushing back to the origin.
I have had the same questions/comments in my courses.
If you are using Renku with RStudio, you can add the following to the install.R file to get a “Save to RenkuLab” addin:
devtools::install_git("https://renkulab.io/gitlab/cchoirat/renkur.git")
library(renkur)
It makes it easy for the students to commit and push from the GUI, and helps make sure they save all their changes over the course of a term.
Thanks, but we use Jupyter. What does the addin do? I asked them to commit and click on the cloud icon to push periodically.
It’s a way to commit and push directly from the RStudio GUI.
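For Jupyter users, a similar "save everything" step can be sketched in Python. This is only a minimal sketch, not what the renkur addin actually runs: it assumes the addin amounts to staging, committing, and pushing, and the function names `save_commands` and `save_to_remote` are made up for illustration. It also assumes git is installed and the project has a remote configured, as in a Renku session.

```python
import subprocess
from pathlib import Path


def save_commands(message):
    """Return the git commands that stage, commit, and push all changes."""
    return [
        ["git", "add", "--all"],           # stage new, modified, and deleted files
        ["git", "commit", "-m", message],  # record them in a single commit
        ["git", "push"],                   # send the commit to the remote
    ]


def save_to_remote(repo_dir=".", message="Save work in progress"):
    """Run the commands in repo_dir, stopping at the first failure.

    Note: `git commit` exits with an error when there is nothing to commit,
    so this raises subprocess.CalledProcessError in that case.
    """
    for cmd in save_commands(message):
        subprocess.run(cmd, cwd=Path(repo_dir), check=True)
```

Calling `save_to_remote(message="End of lab 3")` from a notebook cell would then do in one step what the students currently do by committing and clicking the cloud icon.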
Hi @pameladelgado, thanks again for your tip!
My colleague @oscar just told me that he has another problem: as soon as connectivity is lost due to a network outage, computation in a Jupyter notebook is stopped, which is bad for longer computations. Do you know of a way around this? Thanks a lot!
hi @schymans,
To answer your question briefly: at the moment there is no way of recovering work that was not pushed when a network outage happens. However, we are working on a feature that would allow recovering temporary local changes saved on disk in an environment.
Having said that, if the longer computations are atomic (i.e. they can’t save to or recover from an intermediate state), I’m afraid the only tip I have is to shorten the computations where possible, since the behavior would be the same as when working locally or in a VM.
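When a computation is not atomic and can be split into steps, saving intermediate state lets it resume after an interruption instead of starting over. The following is a minimal sketch under stated assumptions: the sum of squares is only a stand-in for real work, and the names `run_with_checkpoints` and `save_checkpoint` and the JSON checkpoint layout are hypothetical, not a Renku feature.

```python
import json
import os
import tempfile


def save_checkpoint(state, path):
    """Write the state to a temporary file, then swap it into place.

    os.replace is atomic, so an interruption mid-write cannot leave a
    half-written checkpoint behind.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def run_with_checkpoints(n_steps, checkpoint_path, every=100):
    """Sum the squares of 0..n_steps-1, resuming from a checkpoint if present."""
    state = {"next_step": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)

    for step in range(state["next_step"], n_steps):
        state["total"] += step * step  # stand-in for an expensive step
        state["next_step"] = step + 1
        # Save every `every` steps and once more at the very end.
        if state["next_step"] % every == 0 or state["next_step"] == n_steps:
            save_checkpoint(state, checkpoint_path)
    return state["total"]
```

If the session dies partway through, calling `run_with_checkpoints` again with the same checkpoint path continues from the last saved step. The checkpoint file still lives on the session's disk, so it only helps if the disk survives or the file is pushed along with the rest of the work.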
I’m curious about the experience your colleague had. As far as we know, we haven’t had a network outage recently on renkulab.io. Can you or @oscar maybe provide more details on this?
Best
If I’m understanding correctly, the problem is simply that the terminal is closed and therefore the child command is stopped. You could install and use tmux in the interactive session to detach the process from the terminal. It would still terminate after 24 hours of inactivity, however. These are “interactive” sessions after all, i.e. not meant for long-running jobs. We are working on ways of supporting these kinds of use cases though, hopefully in the next few months, so stay tuned.
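For those working from a notebook rather than a terminal, a comparable detachment can be sketched with Python's standard `subprocess` module. This is an alternative to tmux, not a replacement for it, and it rests on assumptions: a POSIX system (which `start_new_session` requires), and a job that can run as a separate command. The helper name `start_detached` and the log file are illustrative only.

```python
import subprocess
import sys


def start_detached(cmd, log_path):
    """Start cmd in its own session, appending its output to log_path.

    start_new_session=True makes the child call setsid(), so it is no longer
    tied to the terminal or process group that launched it.
    """
    with open(log_path, "ab") as log:
        return subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,   # the job must not wait for keyboard input
            stdout=log,                 # the child keeps its own copy of the handle
            stderr=subprocess.STDOUT,
            start_new_session=True,
        )


if __name__ == "__main__":
    # Hypothetical usage: run a short Python command in the background.
    proc = start_detached([sys.executable, "-c", "print('done')"], "job.log")
    print("started process", proc.pid)
```

The 24-hour inactivity limit mentioned above applies to the whole session, so a process started this way would presumably be stopped along with it.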
Thanks, @pameladelgado. Unfortunately, I cannot provide details about the issues the students were having, but both @oscar and @rcnijzink might have similar examples, as they also run long computations.