Importing a dataset from one repository for use in a session of another repository takes an extremely long time.
In repository A, I created a dataset with the following commands:
wget http://images.cocodataset.org/zips/val2017.zip
unzip val2017.zip
renku dataset add --create --move coco val2017
rm val2017.zip
renku save -m "added coco dataset"
This already takes quite a while: about 1 hour for 5000 files (77 MB).
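For reference, the file count and size quoted above can be checked with something like the sketch below. The `count_files` helper is a hypothetical name, and the `data/coco` default is only my assumption about where the moved files ended up, so set `DATASET_DIR` to the actual location.

```shell
# Hypothetical helper: print the number of regular files under a directory.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Assumption: the moved files live under data/coco; override DATASET_DIR if not.
DATASET_DIR="${DATASET_DIR:-data/coco}"

# Report file count and on-disk size only if the directory exists.
if [ -d "$DATASET_DIR" ]; then
    echo "files: $(count_files "$DATASET_DIR")"
    du -sh "$DATASET_DIR"
fi
```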
In a session for repository B, I then tried to import this dataset:
renku dataset import -y https://limited.renku.ch/datasets/c0700196bc954037994d1c201e5b34c3
This took about 10 hours. The session doesn’t seem to be resource-limited: it was started with 2 cores and 16 GB of RAM.
Repeating the process for a 500 MB repository has now been running for 60 hours.
These are repositories of standard image datasets that aren’t all that big. What am I doing wrong?