Renku run no workflow recorded

Hello,
In this public project, I am trying to create a workflow using the renku run command.
In particular, I am running the following:

renku run --name demo -- bash download-oadat-mini.sh; python train_segmentation_demo.py --task_str="seg_swfd_lv128,sc" --datasets_parent_dir="data/OADAT-mini" --logdir="trained_models/seg_swfd_lv128,sc"; python eval_segmentation_demo.py --task_str="seg_swfd_lv128,sc" --datasets_parent_dir="data/OADAT-mini" --fname_out="trained_models/seg_swfd_lv128,sc/eval.p"

This properly executes the three commands that I want it to run (1. download files, 2. run some NN optimization, 3. compute predictions and save them in a pickle file). However, I do not see any output from renku itself. Similarly, renku workflow ls returns an empty list.

I am not sure what the expected behavior should be, but probably not this. Can you have a look? It takes a few minutes to run.
I mainly suspect a renku version mismatch coming from my Dockerfile, but otherwise I have no idea.

Hi Firat,

One issue that I see with your command is that renku run doesn’t support multiple commands (e.g. semicolons, pipes, …). So, when you execute this, only the first command is tracked by Renku.
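
To illustrate why only the first command is tracked, here is a minimal sketch. It assumes a hypothetical shell function called wrapper that stands in for renku run and merely records the arguments it receives; it is not Renku itself. The point is that the shell splits the line at each semicolon before the wrapper is ever called.

```shell
# Hypothetical stand-in for `renku run`: it only records its arguments.
tracked=""
wrapper() {
    tracked="$*"
}

# The shell splits this line at each ';' first, so wrapper receives
# only "-- echo one". The other two commands run as plain shell commands.
wrapper -- echo one; echo two; echo three

echo "wrapper saw: $tracked"
```

Here the wrapper only ever sees the first command, which matches the behavior described above for the original renku run line.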

To overcome this limitation, you need to create a script and put multiple commands in it.

I couldn’t run this command to completion due to some missing libraries, but the first part still created a demo plan in the project:

$ renku workflow ls
ID                                       NAME    COMMAND
---------------------------------------  ------  ---------------------------
/plans/d36db7098aee4c8893cf0f9373482db9  demo    bash download-oadat-mini.sh

The same should happen in your project, because Renku stores the plan once the first command ends at the semicolon; the rest is executed by the shell and is not tracked by Renku.

Let me know if this changed anything.

Cheers,
Mohammad

Thanks!
I put the commands into a separate bash file, demo.sh, as follows:

#!/bin/bash
bash download-oadat-mini.sh
python train_segmentation_demo.py --task_str="seg_swfd_lv128,sc" --datasets_parent_dir="data/OADAT-mini" --logdir="trained_models/seg_swfd_lv128,sc"
python eval_segmentation_demo.py --task_str="seg_swfd_lv128,sc" --datasets_parent_dir="data/OADAT-mini" --fname_out="trained_models/seg_swfd_lv128,sc/eval.p"

When I run
renku run --name demo demo.sh
or
renku run --name demo -- demo.sh
I get the error

Error: Invalid parameter value - Cannot execute command ‘’: This is likely because the executable doesn’t exist or has the wrong permissions set.

Can you comment on it?

Why not do

$ renku run bash download-oadat-mini.sh
$ renku run python train_segmentation_demo.py --task_str="seg_swfd_lv128,sc" --datasets_parent_dir="data/OADAT-mini" --logdir="trained_models/seg_swfd_lv128,sc"
$ renku run python eval_segmentation_demo.py --task_str="seg_swfd_lv128,sc" --datasets_parent_dir="data/OADAT-mini" --fname_out="trained_models/seg_swfd_lv128,sc/eval.p"

?

As for the demo script, it probably doesn’t have execute permission. You likely need to run chmod u+x demo.sh and then run it with renku run --name demo ./demo.sh
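
The permission issue can be checked outside of Renku with a small sketch like the one below. The temporary directory and the script contents are made up for illustration; only chmod u+x and invoking the script by path come from the suggestion above.

```shell
# Throwaway script; the path and contents are hypothetical.
workdir="$(mktemp -d)"
cat > "$workdir/demo.sh" <<'EOF'
#!/bin/bash
echo "demo ran"
EOF

# A freshly created file normally has no execute bit.
if [ -x "$workdir/demo.sh" ]; then before="yes"; else before="no"; fi

# Grant the owner execute permission.
chmod u+x "$workdir/demo.sh"
if [ -x "$workdir/demo.sh" ]; then after="yes"; else after="no"; fi

# Invoke the script by path; a bare "demo.sh" would be looked up on PATH instead.
output="$("$workdir/demo.sh")"

echo "executable before chmod: $before, after chmod: $after"
echo "$output"
rm -rf "$workdir"
```

If the check before chmod reports that the file is not executable, that is consistent with the "wrong permissions" part of the error message.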

But personally, I would split this up into three steps.

This worked, for the most part.
Under renku workflow ls, I can now see IDs for the 2nd and 3rd commands, but nothing for the first one (renku run bash download-oadat-mini.sh).

How big is this file? Would it not be easier to just add it to the project? :)

The downloaded files for this demo total 14.5 GB. However, the download script can be slightly changed to download the full dataset for a local renku session, in which case the data files would be ~800 GB. So I would rather not.
On the other hand, I know the downloaded files will always be the same, because the host will keep the files as is for at least 10 years, so a checksum on the downloaded files should remain the same.
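
As a rough sketch of the checksum idea, assuming GNU coreutils sha256sum is available: the file name sample.bin and the manifest name SHA256SUMS are placeholders, not files from the actual dataset.

```shell
# Placeholder file standing in for a downloaded data file.
workdir="$(mktemp -d)"
printf 'example payload\n' > "$workdir/sample.bin"

# Record the checksum once, right after a trusted download.
(cd "$workdir" && sha256sum sample.bin > SHA256SUMS)

# Later: verify that the file still matches the recorded checksum.
if (cd "$workdir" && sha256sum --check --quiet SHA256SUMS); then
    status="unchanged"
else
    status="changed"
fi

# Simulate a modified file to show that the check notices it.
printf 'extra bytes\n' >> "$workdir/sample.bin"
if (cd "$workdir" && sha256sum --check --quiet SHA256SUMS >/dev/null 2>&1); then
    after_edit="unchanged"
else
    after_edit="changed"
fi

echo "first check: $status, after modification: $after_edit"
rm -rf "$workdir"
```

A small manifest like this could be committed to the project instead of the data itself, so that a re-download can be verified against it.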

Can you let me know the exact script you used to get this output? I am not seeing bash download-oadat-mini.sh in renku workflow ls, only the other two commands.