Tracking .py file created in repository

Kriegelw · 4 January 2021 21:57

I am currently working in a repository where I create a .py file containing equations derived from a dataset. Up to that point, tracking works.
I then import the equations from this .py file in another notebook, where I combine them with another dataset to create a graphical output. When I look at the workflow, the dataset input is there but the .py file is not.
Is there a way to track imports of internal .py files like this?

mohammad-sdsc · 5 January 2021 08:58

Hi @Kriegelw,

You can define those files as explicit inputs to your command and renku will mark them as dependencies: renku run --input path/to/equation-file.py .... See https://renku-python.readthedocs.io/en/stable/commands.html#detecting-input-paths for more details.

It’s also possible to define those explicit inputs in your script using the renku API: https://renku-python.readthedocs.io/en/stable/api.html. If you are not sure which approach to choose, use the command-line argument for the moment, since it does not require modifying your scripts.
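
To illustrate the command-line approach, here is a minimal Python sketch (not part of renku itself) that finds the local .py files a script imports and builds a matching renku run call with one --input per file. The helper names local_imports and renku_run_command are hypothetical, it assumes --input may be repeated once per file, and it only handles plain absolute imports in a .py script, not notebooks. It builds the command as a list and does not execute it.

```python
import ast
from pathlib import Path
from typing import List


def local_imports(script: Path, root: Path) -> List[Path]:
    """Return the .py files under `root` that `script` imports by module name."""
    tree = ast.parse(script.read_text())
    modules = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module)
    found = []
    for name in sorted(modules):
        # Map a dotted module name to a file path relative to the repository root.
        candidate = root / (name.replace(".", "/") + ".py")
        if candidate.is_file():
            found.append(candidate)
    return found


def renku_run_command(script: Path, root: Path) -> List[str]:
    """Build (but do not run) a `renku run` call with explicit inputs."""
    cmd = ["renku", "run"]
    for path in local_imports(script, root):
        cmd += ["--input", path.relative_to(root).as_posix()]
    return cmd + ["python", script.relative_to(root).as_posix()]
```

The resulting list can be inspected before use, or passed to subprocess.run from the repository root once it looks right.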

Kind regards,
Mohammad

schymans · 6 January 2021 13:39

Hi @kriegelw,
You could use papermill, e.g.:

renku run papermill notebooks/notebook.ipynb \
notebooks/notebook.ran.ipynb -p importfile definition.py

where notebook.ipynb uses the following code to import definition.py, whose file name arrives as the papermill parameter importfile:

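import importlib
import warnings

# importfile is the papermill parameter passed with -p (here "definition.py");
# basepath is assumed to be a dotted package prefix set earlier in the notebook.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    # Strip the ".py" suffix to get the module name.
    mod = importlib.import_module(basepath + importfile[:-3])
# Copy the module's public names into the notebook namespace (like "from mod import *").
names = getattr(mod, '__all__', [n for n in dir(mod) if not n.startswith('_')])
g = globals()
for name in names:
    g[name] = getattr(mod, name)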
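
For anyone who wants to try the import pattern above outside a notebook, here is a self-contained sketch of the same idea. The function name import_public_names and the file demo_definition.py are made up for illustration, and the module is written to a temporary directory so the example runs on its own. It copies names into a plain dict rather than globals() to keep the effect easy to inspect.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_public_names(module_name, namespace):
    """Copy a module's public names into `namespace`, like `from module import *`."""
    mod = importlib.import_module(module_name)
    names = getattr(mod, "__all__", [n for n in dir(mod) if not n.startswith("_")])
    for name in names:
        namespace[name] = getattr(mod, name)
    return names


# Hypothetical stand-in for definition.py, written to a temporary directory.
tmpdir = tempfile.mkdtemp()
Path(tmpdir, "demo_definition.py").write_text(
    "g_accel = 9.81\n\ndef weight(mass):\n    return mass * g_accel\n"
)
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()  # make sure the freshly written file is found

importfile = "demo_definition.py"  # the value papermill would inject via -p
loaded = {}
names = import_public_names(importfile[:-3], loaded)
print(sorted(names))  # ['g_accel', 'weight']
```

In the notebook itself, passing globals() as the namespace reproduces the original behaviour.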
