AWS G2 GPU vs unattended-upgrade

Recently we noticed that our AWS G2 GPU instances were no longer working correctly after a reboot. We were greeted with this joyful message:

[ 6710.061115] NVRM: The NVIDIA GRID K520 GPU installed in this system is
NVRM:  supported through the NVIDIA 367.xx Legacy drivers. Please
NVRM:  visit http://www.nvidia.com/object/unix.html for more
NVRM:  information.  The 375.39 NVIDIA driver will ignore
NVRM:  this GPU.  Continuing probe...

It was evident that at some point the NVIDIA driver had been upgraded to a version that no longer supports the GRID K520 GPU in the machine. Of course, the first thought is to blame whoever had root access on the system. Let’s have a look at /var/log/apt/history.log then…

Start-Date: 2017-03-21  08:51:00
Commandline: /usr/bin/unattended-upgrade
Install: libcuda1-375:amd64 (375.39-0ubuntu0.16.04.1, automatic), nvidia-opencl-icd-375:amd64 (375.39-0ubuntu0.16.04.1, automatic), nvidia-375:amd64 (375.39-0ubuntu0.16.04.1, automatic)
Upgrade: libc6-dev:amd64 (2.23-0ubuntu5, 2.23-0ubuntu6), libcuda1-367:amd64 (367.57-0ubuntu0.16.04.1, 375.39-0ubuntu0.16.04.1), libc6:amd64 (2.23-0ubuntu5, 2.23-0ubuntu6), locales:amd64 (2.23-0ubuntu5, 2.23-0ubuntu6), libc-bin:amd64 (2.23-0ubuntu5, 2.23-0ubuntu6), libc6-i386:amd64 (2.23-0ubuntu5, 2.23-0ubuntu6), libc-dev-bin:amd64 (2.23-0ubuntu5, 2.23-0ubuntu6), multiarch-support:amd64 (2.23-0ubuntu5, 2.23-0ubuntu6), libfreetype6:amd64 (2.6.1-0.1ubuntu2, 2.6.1-0.1ubuntu2.1), nvidia-opencl-icd-367:amd64 (367.57-0ubuntu0.16.04.1, 375.39-0ubuntu0.16.04.1), nvidia-367:amd64 (367.57-0ubuntu0.16.04.1, 375.39-0ubuntu0.16.04.1)
End-Date: 2017-03-21  08:52:28

There we go: Ubuntu’s unattended-upgrade feature upgraded the NVIDIA drivers to a version that does not support the GPU in AWS G2 machines.
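On a busy machine the history log can be long, so it may help to filter it down to the transactions that touched NVIDIA packages. The sketch below assumes the usual layout of apt's history log, where transactions are separated by blank lines; the function name nvidia_transactions is ours, not part of apt.

```shell
# Print every apt transaction that mentions an NVIDIA package.
# Assumption: transactions in history.log are separated by blank lines,
# so awk's paragraph mode (RS="") treats each one as a single record.
nvidia_transactions() {
    # $1: path to an apt history log (defaults to the live log)
    awk 'BEGIN { RS = ""; ORS = "\n\n" } /nvidia/' "${1:-/var/log/apt/history.log}"
}
```

Running nvidia_transactions with no argument reads /var/log/apt/history.log; the Commandline: field of each printed transaction shows what triggered it. Rotated logs are typically gzip-compressed and would need to be decompressed first.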

To fix this, since version 367 of the NVIDIA driver is no longer available in the Ubuntu archives, it has to be obtained as a build artifact. It’s not the cleanest way, but the quickest fix seems to be to apt-get remove nvidia-375 and its dependencies, and then install the build artifacts from https://launchpad.net/~ubuntu-security/+archive/ubuntu/ppa/+build/11078476

Namely,

apt-get remove libcuda1-375 nvidia-opencl-icd-375 nvidia-375 nvidia-cuda-toolkit

wget https://launchpad.net/~ubuntu-security/+archive/ubuntu/ppa/+build/11078476/+files/nvidia-367_367.57-0ubuntu0.16.04.1_amd64.deb

wget https://launchpad.net/~ubuntu-security/+archive/ubuntu/ppa/+build/11078476/+files/nvidia-opencl-icd-367_367.57-0ubuntu0.16.04.1_amd64.deb

wget https://launchpad.net/~ubuntu-security/+archive/ubuntu/ppa/+build/11078476/+files/libcuda1-367_367.57-0ubuntu0.16.04.1_amd64.deb

dpkg -i --auto-deconfigure libcuda1-367_367.57-0ubuntu0.16.04.1_amd64.deb nvidia-opencl-icd-367_367.57-0ubuntu0.16.04.1_amd64.deb nvidia-367_367.57-0ubuntu0.16.04.1_amd64.deb

If any lingering package depends on nvidia-375, uninstall it, repeat the steps above, and then re-install it. It most likely does not depend on -375 explicitly, but on a metapackage provided by -375, which we’re now providing from -367 instead.
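To see whether any -375 packages are still installed before repeating the removal, the output of dpkg -l can be filtered. This is only a sketch: leftover_375 is a name made up for illustration, and it lists the -375 packages themselves rather than whatever depends on them.

```shell
# List packages from `dpkg -l` output (on stdin) that are currently
# installed and whose name ends in -375.
# Column 1 is the two-letter status; its second letter is the current
# state, where "i" means installed. Column 2 is the package name,
# optionally followed by an architecture such as ":amd64".
leftover_375() {
    awk '$1 ~ /^.i/ && $2 ~ /-375(:|$)/ { print $2 }'
}
```

Usage: dpkg -l | leftover_375. An empty result suggests the -375 packages are gone.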

Once the core -367 packages are installed and happy, check dmesg to make sure the GPU has been discovered, and then reinstall nvidia-cuda-toolkit and any other packages. Assuming all goes well, you can now test your software against the installed package suite.
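The dmesg check can be scripted by looking for the rejection message quoted at the top of this post. This is a rough sketch (gpu_accepted is a hypothetical helper name): the absence of the message does not prove the GPU is usable, only that the driver did not explicitly refuse it.

```shell
# Succeeds (exit 0) if the kernel log on stdin does NOT contain the
# "will ignore this GPU" complaint, and fails otherwise.
# The message wraps across lines in dmesg, so match only the part
# that sits on a single line.
gpu_accepted() {
    ! grep -q 'NVIDIA driver will ignore'
}
```

Usage: dmesg | gpu_accepted && echo "driver did not reject the GPU"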

If things are working as expected, mark your now-critical packages as held to prevent them from being upgraded again:

apt-mark hold libcuda1-367 nvidia-367 nvidia-opencl-icd-367
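It is worth confirming that the holds took effect. apt-mark showhold prints the held packages, one per line; the sketch below compares that list against the three driver packages. The helper name missing_holds is ours, and it assumes the names appear without an architecture suffix.

```shell
# Read a list of held packages on stdin (as printed by
# `apt-mark showhold`) and print each driver package that is
# NOT on hold. The body runs in a subshell so variables do not leak.
missing_holds() (
    held=$(cat)
    for pkg in libcuda1-367 nvidia-367 nvidia-opencl-icd-367; do
        printf '%s\n' "$held" | grep -qxF -- "$pkg" || echo "$pkg"
    done
)
```

Usage: apt-mark showhold | missing_holds. No output means all three packages are held.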

This has been reported to Canonical at https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-375/+bug/1674666