How to track image files used during LaTeX compilation?

Dear all,
I am trying to track the inputs and outputs of LaTeX compilations, where the figures used are specified in the LaTeX file. According to Renku Command Line — Renku 0.12.3.dev12 documentation, I could track these figure files in the knowledge graph if renku run pdflatex paper.tex produced a text file called .renku/tmp/inputs.txt listing all figure file paths. Does anyone know how to do this, or any other way to track the figure files that end up in a PDF during LaTeX compilation? Thanks for your help!

One way to do this would be to put all the images you use into a directory and supply the directory using the --input flag. I think this is probably the simplest solution, if you can accept that structure.

The relevant section of the documentation is here: Renku Command Line — Renku 0.12.3.dev12 documentation

A more general solution would be to write a script, make_paper, that parses or scans the input tex file, identifies the image paths, writes those to .renku/tmp/inputs.txt and calls pdflatex. Then you would do renku run make_paper paper.tex.
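A minimal Python sketch of such a make_paper script is shown here. It is only an illustration under some assumptions: figures are included with \includegraphics, their paths are written with file extensions and relative to the directory the command runs in, and pdflatex is on the PATH. The function names find_figures and write_inputs are made up for this example.

```python
#!/usr/bin/env python3
"""make_paper: record the figures a LaTeX file uses, then run pdflatex (sketch)."""
import re
import subprocess
import sys
from pathlib import Path

# Matches \includegraphics[optional]{path}; other figure macros are not handled.
INCLUDEGRAPHICS = re.compile(r"\\includegraphics\s*(?:\[[^\]]*\])?\s*\{([^}]+)\}")


def find_figures(tex_source):
    """Return figure paths from uncommented \\includegraphics calls, in order."""
    figures = []
    for line in tex_source.splitlines():
        # Drop everything after an unescaped % (a LaTeX comment).
        code = re.split(r"(?<!\\)%", line, maxsplit=1)[0]
        for path in INCLUDEGRAPHICS.findall(code):
            path = path.strip()
            if path not in figures:
                figures.append(path)
    return figures


def write_inputs(paths, inputs_file=Path(".renku/tmp/inputs.txt")):
    """Write one path per line to the file named in the question above."""
    inputs_file = Path(inputs_file)
    inputs_file.parent.mkdir(parents=True, exist_ok=True)
    inputs_file.write_text("".join(f"{p}\n" for p in paths))


def main(argv):
    tex_file = Path(argv[1])
    write_inputs(find_figures(tex_file.read_text()))
    # Assumption: pdflatex is installed and available on the PATH.
    return subprocess.run(["pdflatex", str(tex_file)]).returncode


if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv))
```

Saved as make_paper.py, it could be invoked as renku run python make_paper.py paper.tex. Figures referenced without an extension, or through \graphicspath, would need extra handling before their paths are written out.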

I guess that this would work. One could have a subfolder figures and write any figures that are an output of code executions directly into that folder, so the lineage would be preserved and if any of the input files or code used in the generation of the plots change, renku would ask us to run renku update. Thanks!

I wrote a script that takes paper.tex as input, scans it for the figures and the .bib file it uses, generates the PDF and puts it in the specified output directory, while also writing .renku/tmp/inputs.txt. I run it like this: renku run bash make_paper.sh src/paper.tex paper_name data/paper/
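The make_paper.sh script itself is not shown in this thread. As a rough illustration of the .bib part of such a scan, here is a small Python sketch. It assumes the bibliography is declared with \bibliography (names without the .bib extension, comma-separated) or with biblatex's \addbibresource (name with extension); the function name find_bib_files is made up for this example.

```python
import re

# \bibliography{a,b} or \addbibresource[opts]{refs.bib}.
# \bibliographystyle{...} is not matched, because "style" follows the macro name.
BIB_MACRO = re.compile(
    r"\\(?:bibliography|addbibresource)\s*(?:\[[^\]]*\])?\s*\{([^}]+)\}"
)


def find_bib_files(tex_source):
    """Return the .bib files referenced by uncommented bibliography macros."""
    bib_files = []
    for line in tex_source.splitlines():
        # Drop everything after an unescaped % (a LaTeX comment).
        code = re.split(r"(?<!\\)%", line, maxsplit=1)[0]
        for group in BIB_MACRO.findall(code):
            for name in group.split(","):
                name = name.strip()
                if not name:
                    continue
                # \bibliography takes names without the extension.
                if not name.endswith(".bib"):
                    name += ".bib"
                if name not in bib_files:
                    bib_files.append(name)
    return bib_files
```

The returned names are as written in the .tex file, so they may still need to be resolved relative to the directory in which LaTeX is run before being appended to .renku/tmp/inputs.txt.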

The resulting knowledge graph looks good in ASCII format:

*    1d6f235c data/paper/vom.pdf
|\            (part of data/paper directory)
| \
| |\
| | \
| | |\
| | | \
| | | |\
| | | | \
| | | | |\
| | | | | \
| | | | | |\
| | | | | | \
| | | | | | |\
| | | | | | | \
| | | | | | | |\
| | | | | | | | \
| | | | | | | | |\
| | | | | | | | | \
| | | | | | | | | |\
| | | | | | | | | | \
| | | | | | | | | | |\
| | | | | | | | | | | \
| | | | | | | | | | | |\
| | | | | | | | | | | | \
| | | | | | | | | | | | |\
| | | | | | | | | | | | | \
| | | | | | | | | | | | | |\
*---+-+-+-+-+-+-+-+-+-+-+-+-+  1d6f235c data/paper
|/ / / / / / / / / / / / / /
+-+-+-+-+-+-+-+-+-+-+-+-+---*  c56a3f82 .renku/workflow/a9f64df92a6d4fb68108d080f6a3233f_bash.yaml
| | | | | | | | | | | | | |/
@ | | | | | | | | | | | | |  f7ac85ff make_paper.sh
 / / / / / / / / / / / / /
| | | | | | | | | | | * |  0df5b672 img/11_zr_dry.png
| | | | | | | | | | | | |           (part of img directory)
| | | | | | | | | | * | |  0df5b672 img/10_rootdepths.png
| | | | | | | | | | |/ /            (part of img directory)
| | | | | *---------+ /  0df5b672 img/5_gw_sm_watpot.png
| | | | |  / / / / / /            (part of img directory)
| | | | *---------+ /  0df5b672 img/4_fitness.png
| | | |  / / / / / /            (part of img directory)
| *-------------+ /  0df5b672 img/1_map.png
|  / / / / / / / /            (part of img directory)
| | | | | | | | | @  0df5b672 .gitattributes
| *-----------+ |  0df5b672 img/2_vom_scheme.png
|  / / / / / / /            (part of img directory)
| | | | | * / /  0df5b672 img/9_pc_daly.png
| | | | | |/ /            (part of img directory)
| | *-----+ /  0df5b672 img/6_hs_fluxes.png
| |  / / / /            (part of img directory)
| *-----+ /  0df5b672 img/3_model_comparison.png
|  / / / /            (part of img directory)
| *---+ /  0df5b672 img/7_cpcff.png
|  / / /            (part of img directory)
| * / /  0df5b672 img/8_fluxpartitioning.png
| |/ /            (part of img directory)
| @ /  0df5b672 img
|  /
| @  53a428fd VOM_paper.bib
*  6e36c23d src/vom.tex
|           (part of src directory)
@  6e36c23d src

However, exporting the graph in dot format gives an error:
renku log --format dot data/paper/vom.pdf | dot -Tpng > vompaper_kg.png

Traceback (most recent call last):
  File "/home/remko/.local/bin/renku", line 8, in <module>
    sys.exit(cli())
  File "/home/remko/.local/pipx/venvs/renku/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/remko/.local/pipx/venvs/renku/lib/python3.6/site-packages/renku/cli/exception_handler.py", line 121, in main
    result = super().main(*args, **kwargs)
  File "/home/remko/.local/pipx/venvs/renku/lib/python3.6/site-packages/renku/cli/exception_handler.py", line 87, in main
    return super().main(*args, **kwargs)
  File "/home/remko/.local/pipx/venvs/renku/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/remko/.local/pipx/venvs/renku/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/remko/.local/pipx/venvs/renku/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/remko/.local/pipx/venvs/renku/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/remko/.local/pipx/venvs/renku/lib/python3.6/site-packages/renku/cli/log.py", line 100, in log
    FORMATS[format](graph, strict=strict)
  File "/home/remko/.local/pipx/venvs/renku/lib/python3.6/site-packages/renku/core/commands/format/graph.py", line 85, in dot
    _rdf2dot_simple(g, sys.stdout, graph=graph)
  File "/home/remko/.local/pipx/venvs/renku/lib/python3.6/site-packages/renku/core/commands/format/graph.py", line 141, in _rdf2dot_simple
    src_path = path_re.match(source).groupdict()
AttributeError: 'NoneType' object has no attribute 'groupdict'
Error: <stdin>: syntax error in line 3

Also, online it says that the dataset is not in the knowledge graph. Do you have any idea what is going wrong?

It sounds like this is a bug in renku-python. Could you open a bug report in Issues · SwissDataScienceCenter/renku-python · GitHub, ideally with a minimal example to reproduce it, or a link to the repo and the commit at which this happened (if the repo is public)?

Glancing at the stack trace, it sounds like a node in the metadata has an id that differs from the id format that the dot command expects for nodes, which shouldn’t be the case.
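To illustrate that failure mode: in Python, re.match returns None when a string does not fit the pattern, and calling .groupdict() on None raises exactly the AttributeError seen in the traceback. The pattern below is hypothetical and is not the one renku uses; it only stands in for "the id format the dot command expects".

```python
import re

# Hypothetical id format "<commit>/<path>"; NOT renku's actual pattern.
path_re = re.compile(r"^(?P<commit>[0-9a-f]+)/(?P<path>.+)$")


def parse_node_id(node_id):
    """Split a node id into its parts, failing with a clear message if it does not fit."""
    match = path_re.match(node_id)
    if match is None:
        # Without this check, match.groupdict() would raise
        # AttributeError: 'NoneType' object has no attribute 'groupdict'
        raise ValueError(f"unexpected node id format: {node_id!r}")
    return match.groupdict()
```

A check like this does not fix the malformed id, but it makes it visible which node triggers the problem.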

In the meantime, you could try exporting using the dot-full or dot-debug formats and see if any of those work (they use different code for structuring the resulting graph, so they might not run into this issue). Though that’ll likely result in a graph that looks different from what you’re used to/what you want.

As for the dataset not being in the KG, I’ll ping our KG team to see if they can make sense of it. Here it would help if you could tell us the name or id of the dataset.

Hi. Regarding the knowledge graph not being available online: we are sorry, but because of the size of your project, generating the data for the knowledge graph takes quite some time. You can follow the progress on the Status page of your project.

Thanks! Seems I was too impatient:

With dot-debug I also get a graph, but it is really complex and too big to post here.

dot-landscape gives a similar error; I’ll create an issue for it.

This looks great, but I don’t understand how this is generated from the command

How do all the .png files get into the knowledge graph? And why are there several parallel instances of bash make_paper.sh in the KG? Could you explain a bit more, also for others that might find this useful?

The script scans the paper for the images and adds them to .renku/tmp/inputs.txt, as was suggested above. But yes, I think something is still a bit wrong with the knowledge graph; the multiple make_paper.sh instances shouldn’t be there.

I have the same issue when running a notebook more than once: I get a script block for each run I made, even if the inputs and outputs remain the same.

Thanks for raising the question. There is a bug in the version we currently have on renkulab.io that causes almost the whole workflow to be displayed to the user. Thankfully, we have already fixed it, so the next release should correct the behaviour.