I tried to import this dataset in another project:
https://renkulab.io/projects/remko.nijzink/laiobservations/datasets/MODIS_extracted
But somehow Renku thinks the dataset is empty, even though the files are there when I look in GitLab. Do you have an idea what went wrong here?
The source dataset is empty. If you run
    renku dataset ls-files MODIS_extracted
Renku shows that it has no files. From the commit history I can see that the dataset was created but no files were ever added to it. Please note that although there are some files in the dataset's data directory, they were not added to its metadata, so Renku won't see them.
To fix the issue, you can run
    renku dataset add MODIS_extracted data/MODIS_extracted/*
in the source project and then reimport the dataset in the other project.
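The fix above can be sketched as a small script. This is only an illustration under some assumptions: the run helper, the DRY_RUN switch, and the two function names are hypothetical, the git push step assumes the source project has to be pushed before the other project can see the change, and the URL is the one quoted in the question. By default the script only prints the commands, so nothing is modified until DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands by default. Set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
DATASET="MODIS_extracted"
# Dataset URL as quoted in the question.
DATASET_URL="https://renkulab.io/projects/remko.nijzink/laiobservations/datasets/MODIS_extracted"

# Hypothetical helper: print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run inside the source project (laiobservations).
fix_source_dataset() {
    # Show which files the dataset metadata currently records (none in this case).
    run renku dataset ls-files "$DATASET"
    # Register the files that already sit in the data directory with the dataset.
    run renku dataset add "$DATASET" "data/$DATASET"/*
    # Assumption: publish the updated metadata so other projects can import it.
    run git push
    # Check again; the files should now be listed.
    run renku dataset ls-files "$DATASET"
}

# Run inside the project that should receive the dataset.
reimport_dataset() {
    run renku dataset import "$DATASET_URL"
}
```

Calling fix_source_dataset in the source project and then reimport_dataset in the other project prints the commands in order. If the earlier, empty import is still present in the target project, it may have to be removed first; that step is not covered here.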
Okay, thank you, but this means that you always have to do that if you create a dataset from another one, correct? I created them with renku run, doesn’t this create a dataset then? And does this keep the lineage in place?
Right, adding or updating a dataset's files must be done explicitly by the user. It is also not currently possible to automatically add the outputs of runs to datasets. We have some plans to add more integration between datasets and workflows (e.g. issue #706, "renku run --input-dataset", in SwissDataScienceCenter/renku-python on GitHub). Lineage should be OK.