“import torch” fails on the NVIDIA Jetson Nano

NVIDIA provides the Linux4Tegra (L4T) distribution as an image for the NVIDIA Jetson Nano. However, once you upgrade the whole system, strange problems can appear. I have described one of them here: NVIDIA Docker “permission denied: unknown.” on Jetson Nano.

A popular solution to that problem, described here, is to add a new repository to your L4T installation. Doing so leads to error messages such as the following when you try to run L4T-ML containers:

docker run --rm --runtime nvidia -it nvcr.io/nvidia/l4t-ml:r32.7.1-py3 python3 -c "import torch"

[..]
libcurand.so.10: cannot open shared object file: No such file or directory

It turns out that the nvidia.github.com repository should NOT be used for the Jetson Nano: installing the package versions from this repository results in the host libraries not being mounted correctly into the container.
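To see which libraries are affected, you can try to load them directly from inside the container. The following is only a minimal sketch: the library names are the ones from the error messages in this post, and the helper names try_load and check_libraries are made up for illustration. It uses nothing but the standard ctypes module.

```python
import ctypes

# Library names taken from the error messages in this post; adjust for your image.
LIBRARIES = ["libcurand.so.10", "libcudnn.so.8"]


def try_load(name):
    """Return None if the shared library loads, otherwise the loader's error text."""
    try:
        ctypes.CDLL(name)
        return None
    except OSError as exc:
        return str(exc)


def check_libraries(names):
    """Map each library name to None (loaded) or to an error message."""
    return {name: try_load(name) for name in names}


if __name__ == "__main__":
    for name, error in check_libraries(LIBRARIES).items():
        print(f"{name}: {'ok' if error is None else error}")
```

Run it with python3 inside the container. A library that was not mounted should show up with the same kind of loader error that import torch reports.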

First, make sure you are actually using the nvidia Docker runtime and that docker info | grep nvidia lists it. On L4T, the nvidia runtime is configured to automatically mount certain host files and folders into the container (these are defined in the CSV files in /etc/nvidia-container-runtime/host-files-for-container.d/). With an unsupported package version, these files are not mounted correctly, which in turn leads to error messages such as the following when you try to use PyTorch:

/usr/lib/aarch64-linux-gnu/libcudnn.so.8: file too short

libcurand.so.10: cannot open shared object file: No such file or directory
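To get an overview of what the runtime is supposed to mount, you can read the CSV files on the host and check whether the listed paths exist. This is a rough sketch, not an official tool: I am assuming each line has the form "type, path" (for example "lib, /usr/lib/..."), and the function names parse_mount_list and missing_paths are made up for illustration.

```python
import glob
import os

# Directory named in this post; the "type, path" line format is an assumption.
CSV_DIR = "/etc/nvidia-container-runtime/host-files-for-container.d"


def parse_mount_list(text):
    """Parse 'type, path' lines into (type, path) tuples.

    Blank lines and lines without a comma are skipped.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or "," not in line:
            continue
        kind, path = line.split(",", 1)
        entries.append((kind.strip(), path.strip()))
    return entries


def missing_paths(entries):
    """Return the paths from the entries that do not exist on this host."""
    return [path for _, path in entries if not os.path.lexists(path)]


if __name__ == "__main__":
    for csv_file in sorted(glob.glob(os.path.join(CSV_DIR, "*.csv"))):
        with open(csv_file) as handle:
            entries = parse_mount_list(handle.read())
        missing = missing_paths(entries)
        print(f"{csv_file}: {len(entries)} entries, {len(missing)} missing on host")
        for path in missing:
            print(f"  missing: {path}")
```

This only tells you whether the files exist on the host. It does not prove that the runtime actually mounted them, so treat it as a first sanity check.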

To resolve the issue, remove the nvidia.github.com repositories from your L4T installation and downgrade the docker.io and containerd packages as described here.
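Before removing anything, it helps to know which APT source files reference the repository. The sketch below assumes the usual Debian/Ubuntu locations /etc/apt/sources.list and /etc/apt/sources.list.d. The function name find_repo_entries is made up for illustration, and the script only reports matches without changing any files.

```python
import os

# Standard APT source locations on Debian/Ubuntu based systems (assumed for L4T).
APT_PATHS = ["/etc/apt/sources.list", "/etc/apt/sources.list.d"]
NEEDLE = "nvidia.github.com"


def find_repo_entries(paths, needle=NEEDLE):
    """Return (file, line_number, line) for each non-comment line containing needle."""
    files = []
    for path in paths:
        if os.path.isdir(path):
            files.extend(os.path.join(path, name) for name in sorted(os.listdir(path)))
        elif os.path.isfile(path):
            files.append(path)

    hits = []
    for file_path in files:
        if not os.path.isfile(file_path):
            continue
        with open(file_path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                stripped = line.strip()
                if stripped.startswith("#"):
                    continue  # already commented out
                if needle in stripped:
                    hits.append((file_path, number, stripped))
    return hits


if __name__ == "__main__":
    for file_path, number, line in find_repo_entries(APT_PATHS):
        print(f"{file_path}:{number}: {line}")
```

If this prints nothing, no active source line mentions nvidia.github.com and the problem probably lies elsewhere.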

Hello world

My name is Simon Krenger, I am a Technical Account Manager (TAM) at Red Hat. I advise our customers in using Kubernetes, Containers, Linux and Open Source.
