Zenodo Code release / dataset tag automation

I was wondering about creating DOIs for Renku projects and datasets in a streamlined way, as GitHub does with its Zenodo integration. I have read a few of the threads here that discuss this question.

In the thread “Is it possible to publish code through Renku?” (post #3), @jen-thomas made some good points about separating data and code, except in cases where it is useful to bundle example data for illustrative and testing purposes, as @sebastian-landwehr observed.

In the post “Dataset versioning on Renku”, @cmdoret promoted the dataset tags feature.

I did a little searching on linking GitLab instances to Zenodo. I was not sure how, or if, something similar to the GitHub integration could be (or had been) done in GitLab, especially on instances other than the main gitlab.com one. I found this in the documentation of the eOSSR Python library (Open-source Scientific Software and Service Repository, OSSR):

https://escape2020.pages.in2p3.fr/wp3/eossr/gitlab_to_zenodo.html

It seems that they use GitLab CI/CD and Zenodo’s API to automate Zenodo DOI creation when a new release of a project is made in a GitLab instance.
I’m not sure you would need their library to emulate this, as the post “Template for exporting data to zenodo?” made me aware that the Renku CLI client can apparently generate Zenodo JSON to prepare a project for use with Zenodo (though the Renku docs link included in that post is dead).
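
To make “Zenodo JSON” a bit more concrete, here is a minimal sketch of the metadata payload that Zenodo’s deposit API accepts. The field names (`title`, `upload_type`, `description`, `creators`, `version`) come from Zenodo’s deposit metadata; the helper name `build_zenodo_metadata` and the example values are my own assumptions, and I have not checked this against what the Renku CLI actually emits.

```python
import json


def build_zenodo_metadata(title, description, creators, version, upload_type="software"):
    """Return a payload in the shape Zenodo's deposit API expects.

    Hypothetical helper for illustration; only a small subset of the
    available metadata fields is covered.
    """
    # Code releases would map to "software", tagged datasets to "dataset".
    if upload_type not in {"software", "dataset"}:
        raise ValueError(f"unsupported upload_type for this sketch: {upload_type}")
    return {
        "metadata": {
            "title": title,
            "upload_type": upload_type,
            "description": description,
            # Zenodo expects creator names as "Family name, Given names".
            "creators": [{"name": name} for name in creators],
            "version": version,
        }
    }


if __name__ == "__main__":
    # Example values are placeholders, not taken from a real project.
    payload = build_zenodo_metadata(
        title="Example Renku project",
        description="Code snapshot for an example analysis.",
        creators=["Doe, Jane"],
        version="v1.0.0",
    )
    print(json.dumps(payload, indent=2))
```

The same helper could be called with `upload_type="dataset"` when the trigger is a dataset tag rather than a code release.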

I thought that this approach of generating Zenodo DOIs from GitLab releases could be extended to dataset tags, with the automation shipped as part of one or more Renku project templates, so that new Zenodo DOIs are created for code on release and for datasets on tagging.

Is this something people think might be a good idea?

Hi Richard,

Thanks for the very thoughtful idea and for writing it up! It is definitely something that we are thinking of in order to work towards FAIR principles. It gives us more evidence that this would be a useful feature for a wider audience of researchers who may want to expose their projects on Renku in various forms, for example to Zenodo or to create DOIs.

Watch this space!

Cheers

Hi Gavin,

I’m very probably going to be deploying a Renku instance for the research consortium I work for, and using the renkulab.io instance to teach some of my colleagues the platform. So be forewarned: you’re likely to be seeing a lot more of me on the forum for the foreseeable future. There are a few features, citable data and code snapshots being one of them, that are very important for our use case. I don’t know Renku’s tech stack that well yet, but I’m keen to work on getting any features we’d like upstreamed or made available in community-contributed templates etc., obviously only if the Renku team also wants any of the stuff I work on.


I recently became aware of RO-Crate (Research Object Crate), a metadata model for packaging research datasets and detailing their analysis steps in a linked data format (JSON-LD), as well as RoHub (https://reliance.rohub.org/), where such crates are shared. This looks like it might be an interesting output for Renku to support at some point, as it has more explicit support for including computational workflows, whereas Zenodo is more focused on data and code.
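
For anyone who has not seen one, this is roughly what a minimal `ro-crate-metadata.json` looks like, sketched in Python. The context URL, the metadata descriptor and the root dataset entity follow my reading of the RO-Crate 1.1 specification; the helper name `minimal_ro_crate` and the example values are assumptions of mine, and nothing here reflects an existing Renku export.

```python
import json


def minimal_ro_crate(name, description, date_published, license_url, files):
    """Build a minimal RO-Crate 1.1 metadata document as a dict.

    Hypothetical helper for illustration; workflow-specific entities
    are left out.
    """
    # The metadata descriptor says which spec version this file conforms to
    # and points at the root dataset.
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    # The root dataset describes the crate as a whole and lists its parts.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "datePublished": date_published,
        "license": {"@id": license_url},
        "hasPart": [{"@id": path} for path in files],
    }
    # Each file referenced from hasPart gets its own entity in the graph.
    file_entities = [{"@id": path, "@type": "File"} for path in files]
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [descriptor, root, *file_entities],
    }


if __name__ == "__main__":
    # Placeholder values, not a real dataset.
    crate = minimal_ro_crate(
        name="Example dataset",
        description="Example data bundled with an analysis.",
        date_published="2022-01-01",
        license_url="https://creativecommons.org/licenses/by/4.0/",
        files=["data/example.csv"],
    )
    print(json.dumps(crate, indent=2))
```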

Re the eOSSR tool I mentioned previously: I took a closer look and concluded that, unless you are contributing directly to OSSR, it’s easier to just add Zenodo-compatible JSON and make some simple calls to the Zenodo API from GitLab CI/CD, triggered by tagging a commit, to generate a DOI.
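
In case it helps anyone, here is a rough sketch of what those “simple calls” could look like as a script run by a tag-triggered CI job. The four steps (create a deposition, upload a file to its bucket, set the metadata, publish) follow Zenodo’s REST API as I understand it. Assumptions on my part: the script targets the Zenodo sandbox, the token lives in a CI variable I have called `ZENODO_TOKEN`, the metadata sits in a `.zenodo.json` file in the repository, and an earlier CI step has produced an archive named `release-<tag>.zip`. `CI_COMMIT_TAG` is the variable GitLab sets in tag pipelines.

```python
import json
import os
import urllib.request

# Assumption: the sandbox is used while testing; real DOIs need https://zenodo.org/api.
ZENODO_API = "https://sandbox.zenodo.org/api"


def http_send(method, url, token, body=None, content_type="application/json"):
    """Send one authenticated request and return the decoded JSON response."""
    request = urllib.request.Request(url, data=body, method=method)
    request.add_header("Authorization", f"Bearer {token}")
    if body is not None:
        request.add_header("Content-Type", content_type)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def publish_snapshot(archive_path, metadata, token, send=http_send, api=ZENODO_API):
    """Create a deposition, upload one archive, set metadata, publish; return the DOI."""
    # 1. Create an empty deposition.
    deposition = send("POST", f"{api}/deposit/depositions", token, b"{}")
    # 2. Upload the archive into the deposition's file bucket.
    filename = os.path.basename(archive_path)
    with open(archive_path, "rb") as handle:
        send("PUT", f"{deposition['links']['bucket']}/{filename}", token,
             handle.read(), "application/octet-stream")
    # 3. Attach the metadata to the deposition.
    deposition_url = f"{api}/deposit/depositions/{deposition['id']}"
    send("PUT", deposition_url, token,
         json.dumps({"metadata": metadata}).encode("utf-8"))
    # 4. Publish, which is the step that registers the DOI.
    published = send("POST", f"{deposition_url}/actions/publish", token)
    return published["doi"]


# Only act inside a tag pipeline, where GitLab sets CI_COMMIT_TAG.
if __name__ == "__main__" and os.environ.get("CI_COMMIT_TAG"):
    tag = os.environ["CI_COMMIT_TAG"]
    with open(".zenodo.json", encoding="utf-8") as handle:
        metadata = json.load(handle)
    metadata["version"] = tag
    # Hypothetical archive name; a previous CI step would need to create it.
    doi = publish_snapshot(f"release-{tag}.zip", metadata, os.environ["ZENODO_TOKEN"])
    print(f"Published {tag} as {doi}")
```

Passing the request function in as `send` is only there so the flow can be exercised without touching Zenodo. Publishing cannot be undone on the real service, so the sandbox is the safer place to try this first.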

Thank you for bringing this up!

We do have two issues on our side, for exporting projects and for exporting lineage, along with a long discussion. It isn’t dead, but unfortunately it’s not at the top of our priorities right now.

But we’d be more than happy to get user feedback and suggestions on those issues, so that we can implement it right once it does get picked up.

Ah, thanks for linking those. I’ll take a look at the conversations around those issues.