My Deep Learning Setup
After enrolling in Coursera’s Deep Learning Specialization and Fast.AI’s Deep Learning for Coders (Part I), I decided to build my own deep learning setup. A lot of my setup came from various websites and was motivated by other tech setups, but it was not collected neatly in one place. So here is how I set up my deep learning server.
Overview
I’m running Ubuntu 16.04 with Anaconda environments for `R`, `TensorFlow`, `PyTorch`, and `Fast.AI`. These are run in Jupyter notebook, whose port is forwarded to my laptop using `autossh` and dynamic DNS. Using the `nb_conda_kernels` package, I can open notebooks in different environments using the same Jupyter session. Jupyter is open in a `tmux` session so it stays alive when the ssh connection is dropped.
For hardware you can check out the PCPartPicker setup I used. Many parts were reused or bought used, bringing the total cost to about $1000. I think I over-bought on RAM (32GB is not necessary), and I will be looking to add a several-terabyte hard drive soon. On top of that, the motherboard does not have built-in Wi-Fi or an HDMI port. Other than that I think it’s a decent setup – the GTX 1080 is certainly good for my needs, and I’ve had no problems.
Specifics
OS Setup
I essentially followed Kitty Schum’s Towards Data Science guide. Here are my steps:
- Installed Ubuntu 16.04, because it’s still a long-term-support version, and it was better supported by CUDA in July when I built the server¹. I didn’t switch out the graphics card at all – just switched which port my monitor was plugged into, and which graphics card the system BIOS was using.
- Install drivers for your GPU. First, run `lspci | grep -i nvidia` to check that the card is detected, and find the recommended driver version for it (for me it was `390`). Now install this driver using `sudo add-apt-repository ppa:graphics-drivers`, then `sudo apt-get update`, and finally `sudo apt-get install nvidia-390`.
- Install the Nvidia toolkit with `sudo apt-get install nvidia-cuda-toolkit` and run `nvcc --version` to make sure everything is working well.
- Open the Fast.AI install script for Paperspace in a text file. Run it line by line, as needed. The main changes I made were:
  - Don’t automatically activate the `fastai` environment.
  - Install `sudo apt-get install cuda=9.0.176-1` instead of `sudo apt-get install cuda`.

  You’ll have Anaconda installed at the end of this step. I also ran `pip install fastai` in the `fastai` environment, to access these files outside of their GitHub directory.
- Make a new Anaconda environment called `tensorflow` and install the GPU version of TensorFlow (commands for this and the next two environments are sketched after this list).
- Make a new environment for `R`, and install `R`. Install `tidyverse`. Make sure the `IRkernel` package is installed, as this allows you to open `R` notebooks in Jupyter.
- Make a new environment for `PyTorch` and install it.
- Install SSH so you can finish this guide on another machine: `sudo apt-get install openssh-server`
- Install `nb_conda_kernels` to switch kernels within the same Jupyter session: `conda install nb_conda_kernels`
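For concreteness, here is a minimal sketch of the environment-creation commands for the TensorFlow, R, and PyTorch steps above. The exact package names and channels (`tensorflow-gpu`, `r-irkernel`, the `pytorch` channel) are what worked for me at the time and may have changed, so check the current install pages before copying these verbatim.
# TensorFlow environment (GPU build)
conda create -n tensorflow python=3.6 pip
source activate tensorflow   # newer conda: conda activate tensorflow
pip install tensorflow-gpu
source deactivate
# R environment with tidyverse and the Jupyter R kernel
conda create -n r_env r-base r-tidyverse r-irkernel
# PyTorch environment (CUDA 9.0 build)
conda create -n pytorch python=3.6
source activate pytorch
conda install pytorch torchvision cuda90 -c pytorch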
Great! Now you’ve got a base system. Let’s install some additional tools.
Remote access
I prefer to work in coffee shops and libraries, instead of at home or at my office. As such, it’s very important to have a system that works remotely with minimal effort.
Remote Logins
You should now be able to log on to this computer using one of the commands
ssh my_username@my_hostname
ssh my_username@my_local_ip_address
Two problems with this are:
- You need to be able to log on from outside your network.
- IP addresses typically change, and even when they are static they are hard to remember.
First, you’ll need to open your router’s settings and open the firewall to allow SSH connections². Once I could ssh from a different network³, I registered for a domain on No-IP, and installed the No-IP client. Now you should be able to access your machine using
ssh my_username@my.no-ip.domain
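As a quick sanity check on the dynamic DNS piece, you can confirm the domain resolves to your router’s current public IP (`dig` is part of the `dnsutils` package on Ubuntu; `my.no-ip.domain` is whatever domain you registered):
dig +short my.no-ip.domain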
Great! Let’s make it even easier by editing the file `~/.ssh/config` to be
Host GPU_server
    HostName my.no-ip.domain
    User my_username
And now you should be able to log in via
ssh GPU_server
It still asks for your password, but you can set it up to log in without a password.
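The usual way to do that is key-based authentication. A quick sketch, run on the client (`ssh-copy-id` will prompt for your password one last time):
ssh-keygen -t rsa -b 4096   # generate a key pair if you don't already have one
ssh-copy-id GPU_server      # install the public key on the server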
At this point, you can run `jupyter notebook --port 8889` on the server, and `ssh -N -f -L localhost:8888:localhost:8889 GPU_server` on the client. Now going to localhost:8888 in your web browser should lead you to a Jupyter notebook!
Never explicitly log in to Jupyter
I like to use `tmux` and `autossh` to access my Jupyter notebooks.
- `tmux` is a terminal multiplexer. I like it because usually when an ssh connection is lost, all running processes die. With `tmux` the session stays active, and you can reconnect later (or just let it run in the background). Install this on the server using `sudo apt-get install tmux`.
- `autossh` checks for a lost ssh connection, and reconnects when necessary. Install this on your client computer (see the command below).
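On an Ubuntu or Debian client, autossh is in the standard repositories:
sudo apt-get install autossh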
Start a new `tmux` session on the server:
tmux new -s my_sess
jupyter notebook --port 8889
Add the following to your client’s `~/.bashrc` file:
autossh -M 0 -N -f -L localhost:8888:localhost:8889 GPU_server
Add the following to your server’s `~/.bashrc` file:
jupyter notebook --port 8889
Restart both computers. Now, open localhost:8888 in your browser and you should see your Jupyter root directory.
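If the notebook doesn’t come up, a few quick checks I find useful (the session and port names match the examples above):
tmux ls                  # on the server: is the session still running?
tmux attach -t my_sess   # reattach to it and check the Jupyter logs
ps aux | grep autossh    # on the client: is the tunnel process alive?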
Closing thoughts
Maybe there’s nothing exceptionally new here, but these are some tricks that took me a while to figure out.