Launching "A Python Renku Project" stops at "requesting token for provider internal_gitlab"

@diarmuidcire is the IP address found the same in both cases you tested above? I.e. 136.xxx.xxx.xxx?

Yes, the response is the same, if that's what you mean?

Yes. That is what I mean.

I am honestly not sure how to proceed here. The only other suggestion I can offer is to jump on a call with you together and just take a look at the deployment and the configuration.

It is not safe to run with 443 open in your network policy.

If you agree, we can find a time that works for both of us and go over this in a Zoom meeting.

Hi Tolevski. A Zoom would be great. When can we arrange?

Just debugging this a bit more. I'm pretty sure it's not a DNS issue: I tested DNS lookup and dig using PIP tools and got the expected IP address response. I ran these in containers created in Renku (with the netpol TCP 443 disabled), and also in the amalthea container of the renku-amalthea-6d8bc7fc55-p5c6d pod in the renku namespace (with the netpol TCP 443 disabled). No DNS issues.

However, I did notice this “INTERNAL_GITLAB”. But my GitLab is external to Renku. Maybe that has something to do with the problem?

2024-11-12T17:00:07.910072751Z INFO:root:Requesting token for provider INTERNAL_GITLAB

The git-clone is failing.
Warning BackOff 3m10s (x58 over 28m) kubelet Back-off restarting failed container git-clone in pod diarmuid-2ec-proj-2ddns-29e32d9c-0_renku(f90ac3f2-c69b-4205-983e-d94cd61d050d

    - name: GIT_CLONE_GIT_PROVIDERS_0_
      value: '{"id": "INTERNAL_GITLAB", "access_token_url": "https://renku.computing.xx.xx/api/auth/gitlab/exchange"}'
    image: renku/git-clone:1.27.1
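
For reference, a minimal sketch of how that provider value can be unpacked to see which host and port the git-clone container has to reach for the token exchange. The JSON string is copied from the pod spec above; reading it this way is only an illustration and is not taken from the git-clone image itself.

```python
import json
from urllib.parse import urlparse

# Value of GIT_CLONE_GIT_PROVIDERS_0_ as shown in the pod spec (domain redacted).
raw = (
    '{"id": "INTERNAL_GITLAB", '
    '"access_token_url": "https://renku.computing.xx.xx/api/auth/gitlab/exchange"}'
)

provider = json.loads(raw)
url = urlparse(provider["access_token_url"])

# No explicit port in the URL, so fall back to the scheme default.
port = url.port or (443 if url.scheme == "https" else 80)

print(provider["id"], url.hostname, port)
```

So the token request goes to the Renku hostname over HTTPS, which is worth keeping in mind when TCP 443 is restricted by the network policy.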

In my Renku values I have:

global:
  gateway: 
  ...
  gitlab:
    clientSecret:  xxx
    registry:
      host: registry.computing.xxx.ie
    url: gitlab.computing.xxx.ie

...
gitlab:
  enabled: false

@diarmuidcire I will message you in a private message to arrange a time and exchange info.

We call this the internal GitLab because Renku has tighter integration with it. It does not have to be deployed with Renku or be in the same namespace or cluster, but Renku will log you into this GitLab when you log into Renku itself.

Anyhow, let's look at this more over Zoom. From what you described above, I don't have any more ideas for what you can try before we meet.

@diarmuidcire I sent you a direct message on Discourse. Hopefully you can see it.

In preparation for our meeting, I have reinstalled the DNS service “coredns”.

I have the exact same problem again unfortunately.

However, I have made some progress and have somewhere to focus. Here are some of the CoreDNS debug logs:

[INFO] 192.168.9.166:37061 - 18054 "A IN renku-jena-master.renku.svc.cluster.local. udp 59 false 512" NOERROR qr,aa,rd 116 0.000376121s
[INFO] 192.168.3.135:58110 - 59107 "AAAA IN renku.xx.xx.xx.renku.svc.cluster.local. udp 64 false 512" NXDOMAIN qr,aa,rd 157 0.000204855s
[INFO] 192.168.3.135:58110 - 51352 "A IN renku.xx.xx.xx.renku.svc.cluster.local. udp 64 false 512" NXDOMAIN qr,aa,rd 157 0.000257137s
[INFO] 192.168.3.135:37131 - 19251 "A IN renku.xx.xx.xx.svc.cluster.local. udp 58 false 512" NXDOMAIN qr,aa,rd 151 0.000128699s
[INFO] 192.168.3.135:37131 - 63286 "AAAA IN renku.xx.xx.xx.svc.cluster.local. udp 58 false 512" NXDOMAIN qr,aa,rd 151 0.000195312s
[INFO] 192.168.3.135:56653 - 16388 "A IN renku.xx.xx.xx.cluster.local. udp 54 false 512" NXDOMAIN qr,aa,rd 147 0.000098371s
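
To make an excerpt like this easier to scan, here is a small sketch that groups CoreDNS query log lines by response code. The regular expression is an assumption based only on the line format shown above, group_by_rcode is a hypothetical helper name, and the two sample lines are copied from the excerpt.

```python
import re
from collections import defaultdict

# Matches the quoted query part ("A IN name. udp ...") and the rcode after it.
QUERY = re.compile(r'"(?P<qtype>\S+) IN (?P<name>\S+) [^"]*" (?P<rcode>[A-Z]+)')


def group_by_rcode(lines):
    """Group (query type, query name) pairs by DNS response code."""
    grouped = defaultdict(list)
    for line in lines:
        match = QUERY.search(line)
        if match:  # skip lines that are not query log entries
            grouped[match["rcode"]].append((match["qtype"], match["name"]))
    return dict(grouped)


sample = [
    '[INFO] 192.168.9.166:37061 - 18054 "A IN renku-jena-master.renku.svc.cluster.local. udp 59 false 512" NOERROR qr,aa,rd 116 0.000376121s',
    '[INFO] 192.168.3.135:58110 - 51352 "A IN renku.xx.xx.xx.renku.svc.cluster.local. udp 64 false 512" NXDOMAIN qr,aa,rd 157 0.000257137s',
]

for rcode, queries in group_by_rcode(sample).items():
    print(rcode, queries)
```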

I don't get any response from the DNS for renku.xx.xx.xx.renku.svc.cluster.local or renku.svc.cluster.local.

This one might be wrong: renku.xx.xx.xx.renku.svc.cluster.local, which is basically renku.my-domain.renku.svc.cluster.local.
Maybe it should be just renku.svc.cluster.local? But renku.svc.cluster.local does not get an IP back from the DNS, as you will see below.

kubectl run -i --tty --rm dns-test --image=busybox --namespace=renku --restart=Never -- nslookup renku.xx.xx.xx
Server:		10.96.0.10
Address:	10.96.0.10:53


Name:	renku.xx.xx.xx
Address: 136.xx.xx.xx

pod "dns-test" deleted

nslookup renku-redis-node-2.renku-redis-headless.renku.svc.cluster.local.

gpulab@soc-gpulab:~/soc-gpulab/renku/bin$ kubectl run -i --tty --rm dns-test --image=busybox --namespace=renku --restart=Never -- nslookup renku-redis-node-2.renku-redis-headless.renku.svc.cluster.local.
Server:		10.96.0.10
Address:	10.96.0.10:53


Name:	renku-redis-node-2.renku-redis-headless.renku.svc.cluster.local
Address: 192.168.35.230

pod "dns-test" deleted

nslookup renku.svc.cluster.local

gpulab@soc-gpulab:~/soc-gpulab/renku/bin$ kubectl run -i --tty --rm dns-test --image=busybox --namespace=renku --restart=Never -- nslookup renku.svc.cluster.local
Server:		10.96.0.10
Address:	10.96.0.10:53



pod "dns-test" deleted
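
One possible reading of the NXDOMAIN entries in the CoreDNS log: with the default Kubernetes pod DNS settings (ndots:5 and a search list of <namespace>.svc.cluster.local, svc.cluster.local, cluster.local), the resolver tries a name with fewer than five dots against each search domain before trying it as written. The sketch below mimics that ordering. The function name and the search list for the renku namespace are assumptions for illustration; the actual /etc/resolv.conf in the pod should be checked.

```python
def candidate_names(name, search_domains, ndots=5):
    """Return the query names a resolver would try, in order."""
    if name.endswith("."):
        return [name]  # already fully qualified, search list is not applied
    expanded = [f"{name}.{domain}." for domain in search_domains]
    absolute = f"{name}."
    if name.count(".") >= ndots:
        return [absolute] + expanded  # enough dots: try the name as written first
    return expanded + [absolute]  # too few dots: search domains first


# Assumed search list for a pod in the "renku" namespace.
SEARCH = ["renku.svc.cluster.local", "svc.cluster.local", "cluster.local"]

for query in candidate_names("renku.xx.xx.xx", SEARCH):
    print(query)
```

If that is what is happening, the NXDOMAIN answers for renku.xx.xx.xx.renku.svc.cluster.local and the other expanded names would be expected noise rather than the failure itself, and the final query for renku.xx.xx.xx is the one that matters. It would also fit that renku.svc.cluster.local returns nothing, since service records have the form <service>.<namespace>.svc.cluster.local.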

Basically it seems to be an internal routing issue to me.
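
Since the lookups return the expected address, one way to narrow down a routing or network policy problem is to check whether a TCP connection on port 443 can actually be opened from inside the pod. A minimal sketch, assuming Python is available in the container; can_connect is a hypothetical helper name.

```python
import socket


def can_connect(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

Running can_connect("renku.xx.xx.xx") from a session container and from the amalthea pod, with the network policy on and off, should show whether the name resolves but the connection is blocked.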