Dataset created successfully via web but files have 133 bytes

I successfully created a dataset with files that are >10 MB. However, in Renku (both in a RenkuLab JupyterLab session and after renku clone) the files are only 133 bytes.

Another dataset I created this way worked well.

Is there anything I am doing wrong or is this unexpected? Thank you for your help.

It looks like the file in the first one is stored in Git LFS. Until you pull the data, you only see the small pointer file rather than the actual content, which is why it shows up as 133 bytes.

In the second repo, the .h5 file is not stored in Git LFS, so the whole file lives in Git itself and you do not need to pull it explicitly.
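If you want to check from inside a session whether a given file is still just a pointer, here is a minimal sketch. It assumes the standard Git LFS pointer format, where the file is under 1024 bytes and its first line starts with `version https://git-lfs.github.com/spec/v1`. The function name `is_lfs_pointer` is made up for illustration and is not part of Renku or Git LFS.

```python
from pathlib import Path

# First line of every pointer file, per the Git LFS pointer format.
LFS_SPEC_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` looks like a Git LFS pointer.

    Hypothetical helper for illustration only.
    """
    p = Path(path)
    # Pointer files are tiny (under 1024 bytes), so anything
    # bigger is treated as real data without reading it.
    if p.stat().st_size >= 1024:
        return False
    with p.open("rb") as fh:
        return fh.read(len(LFS_SPEC_PREFIX)) == LFS_SPEC_PREFIX
```

A file that reports True here has not been pulled yet; after pulling, the same path should hold the real data and report False.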


To simplify pulling the data, you can use renku storage pull, or tick the box “Automatically fetch LFS data” when you start your session (you need to start it from the drop-down menu).

You can also make this the default for the project in the project settings.

If you use this option, make sure that you launch a session with enough disk space for the data, otherwise your session will crash on launch. It’s always safer to pull data selectively inside the session. If you want to remove LFS files from the local cache (because you need to recover disk space) you can use the renku storage clean command.
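To get a rough idea of whether the data will fit before pulling, you can read the `size` field that each pointer file records. This is only a sketch under the same assumption about the standard Git LFS pointer format; `lfs_pointer_size` and `fits_on_disk` are hypothetical helper names, not Renku commands.

```python
import shutil
from pathlib import Path

LFS_SPEC_LINE = "version https://git-lfs.github.com/spec/v1"


def lfs_pointer_size(path):
    """Return the object size in bytes recorded in an LFS pointer, else None."""
    p = Path(path)
    if p.stat().st_size >= 1024:  # too big to be a pointer file
        return None
    lines = p.read_text(errors="replace").splitlines()
    if not lines or not lines[0].startswith(LFS_SPEC_LINE):
        return None
    for line in lines[1:]:
        key, _, value = line.partition(" ")
        if key == "size" and value.isdigit():
            return int(value)
    return None


def fits_on_disk(paths, target_dir="."):
    """Compare the total size of unpulled LFS objects with free disk space."""
    # Files that are not pointers (already pulled, or plain Git) count as 0.
    needed = sum(lfs_pointer_size(p) or 0 for p in paths)
    free = shutil.disk_usage(target_dir).free
    return needed, free, needed <= free
```

Calling `fits_on_disk` with the paths you intend to pull returns the bytes needed, the bytes free, and whether the first fits in the second. It ignores any overhead from the local LFS cache, so treat the result as an estimate.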

Excellent, thank you!