Adding GPUs to Renku

Hi everyone,

Hope you are well.

I have successfully deployed Renku on my Kubernetes cluster with two nodes. However, I am unable to add my GPUs to Renku, as shown in the attached screenshot.

Could you kindly help me resolve this issue? I’d greatly appreciate your support.

@ALQATF have you added the GPUs to the nodes in your k8s cluster, and are they accessible to k8s workloads running on those nodes? If so, you can taint the GPU nodes in k8s and then set the matching tolerations and node affinities in the resource classes of a resource pool, so that certain resource classes schedule only on certain nodes in your cluster.

The documentation around how to create resource pools or classes that use specific nodes is still missing.

But the hardest part is usually making GPUs usable by Kubernetes pods running in the cluster. We cannot provide documentation for this because it depends on too many things (e.g. the GPU hardware, the hardware that runs the cluster, the Kubernetes version, cloud or bare metal, permissions, networking setup, etc.). But if you have added the GPUs to nodes in your cluster and you have confirmed they are indeed usable by Kubernetes pods, then you can simply taint those nodes in k8s, add the same toleration to a specific resource class in Renku, and also add an affinity for those nodes to the same class.
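To make the taint, toleration and affinity relationship concrete, here is a rough Python sketch of the matching rule Kubernetes applies between a node taint and a pod toleration. The key `gpu`, the value `true` and the function name `tolerates` are made up for illustration, and the exact form in which Renku's admin panel expects tolerations and affinities is not shown here, so treat this only as a way to check that the values you plan to enter line up.

```python
# Sketch of how Kubernetes decides whether a toleration matches a taint.
# The key/value pair ("gpu", "true") is a made-up example, not a Renku default.


def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if the toleration matches the taint."""
    # An empty effect on the toleration matches every taint effect.
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect"):
        return False
    key = toleration.get("key", "")
    operator = toleration.get("operator", "Equal")
    # An empty key combined with operator Exists tolerates every taint.
    if not key:
        return operator == "Exists"
    if key != taint.get("key"):
        return False
    if operator == "Exists":
        return True
    # Operator Equal: the values must be identical.
    return toleration.get("value", "") == taint.get("value", "")


# Taint placed on the GPU node, for example with:
#   kubectl taint nodes <node-name> gpu=true:NoSchedule
node_taint = {"key": "gpu", "value": "true", "effect": "NoSchedule"}

# Toleration that the GPU resource class would need to carry.
gpu_toleration = {
    "key": "gpu",
    "operator": "Equal",
    "value": "true",
    "effect": "NoSchedule",
}

# Node affinity term selecting nodes labelled gpu=true, for example after:
#   kubectl label nodes <node-name> gpu=true
node_affinity_term = {
    "matchExpressions": [{"key": "gpu", "operator": "In", "values": ["true"]}]
}

print(tolerates(gpu_toleration, node_taint))
print(tolerates({"key": "other", "operator": "Exists"}, node_taint))
```

The taint keeps ordinary sessions off the GPU nodes, the toleration lets the GPU class onto them, and the affinity makes sure the GPU class does not land anywhere else.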

The GPUs are detected in the k8s cluster, as you can see in the attached screenshot @tolevski

@ALQATF I think you should also try running a simple Pod or Deployment that uses those GPUs on one of the nodes, to confirm that the drivers and everything else needed are present on the nodes. If that does not work, then running things on Renku will not work either.
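As a rough idea of what such a test could look like, the Python sketch below prints a Pod manifest as JSON that requests one GPU and runs `nvidia-smi`. It assumes NVIDIA GPUs exposed through the NVIDIA device plugin under the resource name `nvidia.com/gpu`; the Pod name, the function name `gpu_test_pod` and the `<cuda-image>` placeholder are hypothetical, so substitute an image that actually ships `nvidia-smi`.

```python
import json


def gpu_test_pod(image: str, node_name: str = "",
                 gpu_resource: str = "nvidia.com/gpu") -> dict:
    """Build a one-shot Pod manifest that requests a single GPU."""
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "gpu-smoke-test"},  # hypothetical name
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "gpu-check",
                    "image": image,
                    # Assumes the image contains the nvidia-smi binary.
                    "command": ["nvidia-smi"],
                    "resources": {"limits": {gpu_resource: "1"}},
                }
            ],
        },
    }
    if node_name:
        # Optionally pin the Pod to one specific node.
        pod["spec"]["nodeName"] = node_name
    return pod


# "<cuda-image>" is a placeholder, not a real image reference.
manifest = gpu_test_pod("<cuda-image>")
print(json.dumps(manifest, indent=2))
```

Saved as, say, `gpu_pod.py`, the output can be piped into `kubectl apply -f -`, and `kubectl logs gpu-smoke-test` should then list the GPU if everything is wired up. If the GPU nodes are already tainted, the Pod also needs a matching toleration.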

Once you are certain the GPUs are usable from within a K8s pod or deployment, you need to do the following:

  1. Make your Renku account an administrator
  2. Log out and back in
  3. Go to the admin panel and create the resource pools and classes
  4. Assign users that can access those resource pools you just created

To make yourself an admin do the following:

  1. Navigate to https://<where-renku-is-installed>/auth
  2. Log in as the Keycloak admin: the username is admin and the password can be found in the Kubernetes secret named keycloak-password-secret
  3. Change the Keycloak realm to Renku
  4. Go to the users page and find your own user
  5. Assign the renku-admin role to the user
  6. Log out of the keycloak admin panel
  7. Log out of Renku
  8. Log back in
  9. If you click on the profile outline in the upper right corner after logging in then you can see an Admin panel option where you will be able to create and manage resource pools.
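
For the Keycloak password step above, note that values in a Kubernetes secret are base64-encoded. The small Python sketch below decodes every field of a secret dumped as JSON; since the field name inside keycloak-password-secret is not stated here, it simply decodes all of them. The function name `decode_secret_data` and the example key and value are made up for illustration.

```python
import base64
import json


def decode_secret_data(secret_json: str) -> dict:
    """Decode all base64-encoded fields under `data` of a Secret in JSON form."""
    secret = json.loads(secret_json)
    data = secret.get("data") or {}
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in data.items()
    }


# Made-up stand-in for the output of:
#   kubectl get secret keycloak-password-secret -n <renku-namespace> -o json
example_secret = json.dumps(
    {
        "kind": "Secret",
        "data": {"example-key": base64.b64encode(b"example-value").decode("ascii")},
    }
)

print(decode_secret_data(example_secret))
```

Replace `<renku-namespace>` with the namespace Renku was installed into.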

We have definitely had cases where kubectl describe nodes showed that the nodes had GPUs, but the GPUs could not be used or even seen from Pods running on those nodes.
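
For reference, the count that the nodes advertise can also be read programmatically. The Python sketch below parses the output of `kubectl get nodes -o json` and reports the allocatable GPUs per node, assuming the NVIDIA resource name `nvidia.com/gpu`; the function name `advertised_gpus` and the sample node names are made up. A non-zero number here only means the node advertises GPUs, not that Pods can actually use them.

```python
import json


def advertised_gpus(nodes_json: str, gpu_resource: str = "nvidia.com/gpu") -> dict:
    """Map each node name to the number of GPUs it reports as allocatable."""
    nodes = json.loads(nodes_json)
    result = {}
    for node in nodes.get("items", []):
        name = node["metadata"]["name"]
        allocatable = node.get("status", {}).get("allocatable", {})
        # Extended resources such as GPUs are reported as integer strings.
        result[name] = int(allocatable.get(gpu_resource, "0"))
    return result


# Made-up sample in the shape of `kubectl get nodes -o json`.
sample = json.dumps(
    {
        "items": [
            {
                "metadata": {"name": "node-a"},
                "status": {"allocatable": {"cpu": "8", "nvidia.com/gpu": "2"}},
            },
            {
                "metadata": {"name": "node-b"},
                "status": {"allocatable": {"cpu": "8"}},
            },
        ]
    }
)

print(advertised_gpus(sample))
```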

Ok, I will do this. Thank you so much.

@tolevski It works for me, thank you so much. I honestly don’t know how to thank you enough!

That is awesome @ALQATF!