17/06/2022 Troubles installing TensorFlow GPU
Background
So when I was doing my master's project, using Compressive Sensing algorithms to speed up image acquisition, I came across this interesting concept of using Machine Learning to reconstruct images from compressed signals (instead of the algorithm we used, TVAL3, which I should probably write about). My master's supervisor Simon liked the idea and I didn't want to just sit at home applying for jobs all summer, so here it goes.
The framework used for Machine Learning reconstruction is called NetFlics (I know, very funny). It is a Convolutional Neural Network model trained on simulated images. The framework is built on the TensorFlow Keras library, so the first step is to get TensorFlow running in my local environment.
Had I considered doing everything in Google Colab? Yes, and it is actually quite a bit easier to start with. There's nothing easier than deploying a TensorFlow model on Google Colab, until you have a model that takes a few hours to run and Google Colab shuts you down.
But that's not the primary reason. When I was still developing in the lab, Simon wouldn't stop talking about his monstrous lab computer with a very fast Quadro GPU that can do massively parallel computations. He seemed really excited to finally be able to utilise his monster with my project (parallelised computation to speed up reconstruction, hoping to utilise CUDA), only for the project to end with CPU parallelisation. But hell, why not use this monster machine that can run the model 24/7?
Luckily, my gaming computer has a GTX 1060, which supports CUDA. I can learn to set up the environment on my home computer before I mess with the lab one. So it begins.
Implementation
Attempt on pip
The first instinct is of course to install it with pip, because pip rules. But before I start, there are a few dependencies for CUDA that need to be sorted.
According to the official documentation:
https://www.tensorflow.org/install/pip#windows
I need
A CUDA GPU ✔
Windows 7 or higher ✔
VC++ with VS 2015, 2017 and 2019 (Note: VS 2022 wouldn't work, and it came back to bite me in the ass)
GPU driver 450.80.02 or higher ✔
CUDA toolkit (Spoiler alert: versions are not forward compatible)
cuDNN SDK (Spoiler alert: versions are not forward compatible)
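The driver requirement in the list above is the easiest one to check from a script. The following is only a rough sketch: it assumes nvidia-smi is on the PATH (it normally ships with the NVIDIA driver), and the helper names parse_version, driver_ok and installed_driver are made up for illustration.

```python
import shutil
import subprocess

MIN_DRIVER = "450.80.02"  # minimum driver version from the requirement list


def parse_version(text):
    """Turn a dotted version string such as '450.80.02' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_ok(installed, minimum=MIN_DRIVER):
    """True when the installed driver version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_driver():
    """Ask nvidia-smi for the driver version; None if it cannot be determined."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None
    return lines[0].strip()  # first GPU only


if __name__ == "__main__":
    version = installed_driver()
    if version is None:
        print("nvidia-smi not found or returned nothing")
    else:
        print(version, "OK" if driver_ok(version) else "too old")
```

Comparing tuples of integers rather than raw strings avoids surprises such as "99.0" sorting after "450.80.02".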
These dependencies can be installed quite easily. Out of my natural tendency (and thanks to the misleading download pages that default to the newest release), I installed the latest versions of VS, CUDA toolkit and cuDNN SDK. VS and CUDA toolkit are easy to install: you download an installer .exe and press next until it's done. cuDNN is essentially a library: you put the .dll files into the respective folders in the CUDA toolkit (why not just make it an optional component in the installer...).
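Since the cuDNN step is a manual copy, it is worth checking that the files actually landed where the toolkit expects them. Here is a minimal sketch, assuming the CUDA installer set the CUDA_PATH environment variable and that the cuDNN DLLs are named cudnn*.dll and live in the toolkit's bin folder. It only looks at bin, not at the include or lib folders, and find_cudnn_dlls is a made-up helper name.

```python
import os
from pathlib import Path


def find_cudnn_dlls(cuda_root):
    """Return the sorted names of cudnn*.dll files found in <cuda_root>/bin."""
    bin_dir = Path(cuda_root) / "bin"
    if not bin_dir.is_dir():
        return []
    return sorted(p.name for p in bin_dir.glob("cudnn*.dll"))


if __name__ == "__main__":
    # Assumption: CUDA_PATH points at the toolkit root, as set by the installer.
    cuda_root = os.environ.get("CUDA_PATH")
    if cuda_root is None:
        print("CUDA_PATH is not set")
    else:
        print(find_cudnn_dlls(cuda_root) or "no cuDNN DLLs found in bin")
```

An empty result means the copy step was skipped or went to the wrong toolkit folder, which is easy to do when several CUDA versions are installed side by side.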
Now that I've got the dependencies ready to go, I can install TensorFlow.
Before I did that, I had to clean up my dev environment. For various reasons, there were quite a few Python versions on my computer and everything was everywhere. I deleted Python 2.7 (hello, high school me) and 3.5, and installed the newest, 3.10. This seems like boring information, but again, it will come back to haunt me.
But for now, all that's left to do is my favorite pip install:
pip3 install tensorflow-gpu
After downloading some 3GB of things, it is on my computer!
Let me try to run my Trumpyter locally and see how it goes.
Something went wrong, though I can't remember exactly what.
But I do remember the cause: the version of my CUDA toolkit was too new for TensorFlow. The tested version combinations are listed here:
https://www.tensorflow.org/install/source_windows#gpu
Uninstalling the newer versions and installing the older ones fixed things up. I now have a working TensorFlow environment and I have mastered installing it ✔
While I was at it, I got PyTorch for GPU as well; that was rather smooth.
But now my Python dev environment has gotten quite crowded with things I don't always need. I started looking for package managers that let me load only what I need, and conda does exactly that.
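A quick way to confirm that an install like this really works is to ask TensorFlow what it was built against and which GPUs it can see. This is a minimal sketch rather than the exact check I ran: tensorflow_gpu_report is a made-up helper, and the build-info keys are read with .get because they may be missing on CPU-only builds.

```python
def tensorflow_gpu_report():
    """Collect what TensorFlow says about its build and the visible GPUs."""
    try:
        import tensorflow as tf
    except Exception as exc:  # package missing, or a CUDA DLL failed to load
        return {"error": f"{type(exc).__name__}: {exc}"}

    # get_build_info is not present in older releases, so fall back to {}.
    if hasattr(tf.sysconfig, "get_build_info"):
        build = tf.sysconfig.get_build_info()
    else:
        build = {}

    return {
        "tensorflow": tf.__version__,
        "cuda_build": build.get("cuda_version"),
        "cudnn_build": build.get("cudnn_version"),
        "gpus": [gpu.name for gpu in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    for key, value in tensorflow_gpu_report().items():
        print(f"{key}: {value}")
```

An empty gpus list with no error usually means TensorFlow imported fine but could not load the CUDA or cuDNN libraries, which is the kind of version mismatch described above.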
Attempt on conda
To be continued......