Unable to run TabbyML with GPU on NixOS or Docker (solved on docker!)

TabbyML is a self-hosted code assistant. I have been unsuccessful at running it using my Nvidia GPU. There’s two ways I’ve tried to deploy this.

As a docker container

Following the docs, it states I run the following docker run command. Below is what I run, modified to use the correct port:

docker run -it --gpus all \
  -p 11029:8080 -v $HOME/.tabby:/data \
  tabbyml/tabby serve --model StarCoder-1B --device cuda

Then I get the following error:

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

So this would appear that I don’t have the “nvidia-container-toolkit” installed on my machine. So I go ahead and enable this in nixos:

hardware.nvidia-container-toolkit.enable = true;

To validate that this works, I should be able to run nvidia-smi from within a container. I can run this from the host without issue:

$ nvidia-smi
Wed Jun  5 08:14:50 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.78                 Driver Version: 550.78         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
...and so on

But if test this from a container, as the nvidia docs suggest as follows, I unable to access it from within the container.

$ sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
docker: Error response from daemon: unknown or invalid runtime name: nvidia.

Okay, so I go and read the instructions further. Install instructions state that after installation, I need to configure the runtime like so:

$ sudo nvidia-ctk runtime configure --runtime=docker
sudo: nvidia-ctk: command not found

Ah nuts. That’s a bug in nixos. I made a PR for this here: https://github.com/NixOS/nixpkgs/pull/317199 Still awaiting results from this. I don’t know if this is a bug that will be backported to 24.05. Regardless, I wouldn’t expect this ad-hoc configuration when I enable the nvidia-container-toolkit option in NixOS. Anyway, this option could still work but with some more time. If you have advice doing this let me know.

FOUND Docker method solution

So looking closer at people with the error message “no such runtime nvidia” I found this thread. It specifies that what nvidia-ctk is supposed to do is add a “runtime” that points to the nvidia-container-runtime executable. So I tried manually adding that my nixos configuration by using the virtualisation.docker.daemon.settings options. I was having trouble getting that working, because I needed to find the exact path to the nvidia-container-runtime executable. If you know Nix, you know that it isn’t just in /usr/bin/.

But that’s still not a satisfying solution anyway…I shouldn’t have to this. I went in deeper and looked at module for nvidia-container-toolkit. This module calls a script called cdi-generate.nix. It outputs the results of nvidia-ctk to a file called nvidia-container-toolkit.json.

Let’s go look for that file…can’t find it. I do more searching…anyway, I found the solution.

The nvidia-container-toolkit is a new option in NixOS 24.05. It explicitly states in the release notes that it is supposed to replace the now deprecated virtualisation.{docker, podman}.enableNvidia options. Well, when you go look at the module that defines docker.enableNvidia you see it there at the bottom! This file actually defines the nvidia runtime!

And yes, it works. Using the now “deprecated” option is the one that actually works. I guess this is another bug to file to NixOS.

This seems to work so far, but I don’t know why the solution using a NixOS module doesn’t work either.

As a NixOS module

Let’s just do it the full NixOS module way (which is what I tried first). That should be easy. Let’s enable the feature and set some options:

services.tabby = {
    enable = true;
    port = 11029;
    acceleration = "cuda";
  };
  networking.firewall.allowedTCPPorts = [ 11029 ];

It appears to be working! VSCodium extension sees the server and prompts for a authentication token. I add the token. I type some code and set for a manual trigger…then tabby dies. Let’'s look at the systemd logs.

tabby[76786]: 📄 Version 0.11.1
tabby[76786]: 🚀 Listening at 0.0.0.0:11029
tabby[76786]:   JWT secret is not set
tabby[76786]:   Tabby server will generate a one-time (non-persisted) JWT secret for the current process.
tabby[76786]:   Please set the TABBY_WEBSERVER_JWT_TOKEN_SECRET environment variable for production usage.
systemd[1]: tabby.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: tabby.service: Failed with result 'exit-code'.
systemd[1]: tabby.service: Consumed 2.285s CPU time, received 121.0K IP traffic, sent 1.6M IP traffic

That’s it. It’s not very descriptive about what happened. I’ve had success running it this way using the “cpu” option for acceleration (no GPU) but that’s too slow to be useful.

GPU specs

I am running a Nvidia RTX 2060 and using the proprietary drivers version 550.

Thanks for the read, if you have any input on what to do next let me know what I can try. Ideally, I’d like to have both options work, since I think the docker implementation may have the same problem as the NixOS module option.