II. Building OpenMPI and nVidia GPU Driver
Current state-of-the-art CPUs have multi-core architectures. Furthermore, most personal computers built for multimedia applications include GPU cards, which are also suitable for parallel computing. So why not use both effectively to establish a hybrid CPU-GPU parallel computing facility?
OpenMPI is one of the leading MPI-2 implementations. The OpenMPI project has the stated aim of building the best Message Passing Interface (MPI) library available.
(A) Prepare dependencies and configure GNU compilers
1. Install and upgrade the necessary files
$ sudo add-apt-repository ppa:ubuntu-toolchain-r/test
$ sudo apt-get update && sudo apt-get dist-upgrade
$ sudo apt-get install --upgrade build-essential autotools-dev autoconf automake
2. Install gcc and gfortran compilers
$ sudo apt-get install g++-5 gfortran-5
$ sudo apt-get install g++-6 gfortran-6
$ dpkg -l | grep gcc | awk '{print $2}'
--> gcc, gcc-5, gcc-5-base:amd64; gcc-6, gcc-6-base:amd64; gir1.2-packagekitglib-1.0, libcaca0:amd64, libgcc-5-dev:amd64, libgcc-6-dev:amd64, libgcc1:amd64, libpackagekit-glib2-16:amd64, libunity-action-qt1:amd64, libwebrtc-audio-processing-0:amd64, qtchooser
qtdeclarative5-unity-action-plugin:amd64, qtdeclarative5-unity-action-plugin:amd64
$ dpkg -l | grep g++ | awk '{print $2}'
--> g++, g++-5, g++-6
$ dpkg -l | grep gfortran | awk '{print $2}'
--> gfortran, gfortran-5, gfortran-6, libgfortran-5-dev:amd64, libgfortran-6-dev:amd64, libgfortran3:amd64
3. Configure gcc and gfortran compilers
$ sudo update-alternatives --remove-all gcc
$ sudo update-alternatives --remove-all g++
$ sudo update-alternatives --remove-all cc
$ sudo update-alternatives --remove-all c++
$ sudo update-alternatives --remove-all gfortran
$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 10
$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-6 20
$ sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 30
$ sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-6 40
$ sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc 50
$ sudo update-alternatives --set cc /usr/bin/gcc
$ sudo update-alternatives --install /usr/bin/c++ c++ /usr/bin/g++ 60
$ sudo update-alternatives --set c++ /usr/bin/g++
$ sudo update-alternatives --config gcc
$ sudo update-alternatives --config g++
$ sudo update-alternatives --install /usr/bin/gfortran gfortran /usr/bin/gfortran-5 70
$ sudo update-alternatives --install /usr/bin/gfortran gfortran /usr/bin/gfortran-6 80
$ sudo update-alternatives --config gfortran
$ sudo apt-get install csh menu x11proto-print-dev motif-clients
$ sudo apt-get install freeglut3-dev libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev libgl1-mesa-dri libcurl-ocaml-dev libcurl4-gnutls-dev gnuplot grace
$ sudo apt-get autoremove
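As an alternative to registering gcc, g++ and gfortran separately in step 3 (not in addition to it), update-alternatives can tie g++ and gfortran to gcc as slave links, so that one "--config gcc" switches all three together. The sketch below only prints the commands; the helper name alt_cmd is hypothetical and the priorities 10 and 20 are taken from the gcc lines above.

```shell
#!/bin/sh
# Print (do not run) one update-alternatives command per GCC version,
# registering g++ and gfortran as slaves of the gcc link.
# alt_cmd is a hypothetical helper; arguments are version and priority.
alt_cmd() {
    ver="$1"; prio="$2"
    printf 'sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-%s %s' "$ver" "$prio"
    printf ' --slave /usr/bin/g++ g++ /usr/bin/g++-%s' "$ver"
    printf ' --slave /usr/bin/gfortran gfortran /usr/bin/gfortran-%s\n' "$ver"
}

alt_cmd 5 10
alt_cmd 6 20
```

Review the printed commands before running them; in automatic mode the entry with the highest priority wins.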
(B) Install OpenMPI
1. Grab the necessary file from the web
$ wget http://www.open-mpi.org/software/ompi/v1.10/downloads/openmpi-1.10.2.tar.gz
$ tar xzvf openmpi-1.10.2.tar.gz
$ cd openmpi-1.10.2
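2. Build OpenMPI-1.10.2 with CUDA-aware support:
Note: --with-cuda looks for the CUDA headers (by default under /usr/local/cuda), so the toolkit from section (C) needs to be in place before this step.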
$ ./configure --prefix="/usr/local/openmpi" --enable-orterun-prefix-by-default CC=gcc CXX=g++ F90=gfortran FC=gfortran --enable-static --with-cuda
Build using the maximum number of physical cores:
$ n=`cat /proc/cpuinfo | grep "cpu cores" | uniq | awk '{print $NF}'`
$ make -j $n
$ sudo make install
$ make distclean
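The "cpu cores" field used above is a per-socket value. On a multi-socket machine, a rough alternative is to count unique (physical id, core id) pairs instead. This is a minimal sketch, assuming the usual x86 /proc/cpuinfo layout; the function name physical_cores is hypothetical.

```shell
#!/bin/sh
# Count physical cores across all sockets from a /proc/cpuinfo-style file.
# Each unique (physical id, core id) pair is counted as one physical core.
physical_cores() {
    awk -F: '
        /^physical id/ { pkg = $2 }
        /^core id/     { seen[pkg "," $2] = 1 }
        END { n = 0; for (k in seen) n++; print n }
    ' "${1:-/proc/cpuinfo}" 2>/dev/null
}

n=$(physical_cores)
# Fall back to the logical CPU count if the fields are missing (e.g. some VMs).
[ "$n" -gt 0 ] 2>/dev/null || n=$(getconf _NPROCESSORS_ONLN)
echo "make -j $n"
```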
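3. Set up the paths in environment variables. /etc/environment is not a shell script and does not expand $PATH, so put the exports in a profile script such as /etc/profile.d/openmpi.sh:
$ sudo nano /etc/profile.d/openmpi.sh
export PATH="/usr/local/openmpi/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/openmpi/lib:$LD_LIBRARY_PATH"
Save and quit, then log in again.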
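After logging in again, one way to confirm that the Open MPI directories are actually on the search paths is a small check like the following. It is only a sketch; the helper name in_path_list is hypothetical, and the two directories are the ones from the --prefix used above.

```shell
#!/bin/sh
# Return success if directory $2 is an entry of the colon-separated list $1.
in_path_list() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

for pair in "PATH:/usr/local/openmpi/bin" "LD_LIBRARY_PATH:/usr/local/openmpi/lib"; do
    var=${pair%%:*}; dir=${pair#*:}
    eval "val=\${$var:-}"          # read the variable whose name is in $var
    if in_path_list "$val" "$dir"; then
        echo "$var contains $dir"
    else
        echo "$var is missing $dir"
    fi
done
```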
4. Test OpenMPI
(a) Prepare a C program, testOMPI.c, which is a parallel version of a C program for calculating the sum of inverse factorials.
(b) Compile the program with:
$ mpicc testOMPI.c -o testOMPI
(c) Test the code on the dell-m4800 node:
$ mpirun -H dell-m4800 -np 4 testOMPI 10000000 -->
Result: 2.7182818285
Time elapsed: 16.366959ms
(d) Test the code on the remote dell-t7500 node:
$ mpirun -H dell-t7500 -np 4 testOMPI 10000000 -->
Result: 2.7182818285
Time elapsed: 22.429943ms
(e) Test the code on the two nodes:
$ mpirun --mca btl_tcp_if_include "10.42.0.0/16" -H dell-m4800,dell-t7500 -np 8 testOMPI 10000000 -->
Result: 2.7182818285
Time elapsed: 13.143063ms
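Instead of listing the nodes with -H on every run, the same two hosts can be kept in a hostfile. The sketch below writes one; the file name hosts.txt, the helper name write_hostfile, and the choice of 4 slots per node (matching the -np 4 runs above) are assumptions.

```shell
#!/bin/sh
# Write an Open MPI hostfile: one "host slots=N" line per node.
# 4 slots per node is an assumption matching the single-node -np 4 runs.
write_hostfile() {
    out="$1"; shift
    : > "$out"                      # create or truncate the file
    for host in "$@"; do
        printf '%s slots=4\n' "$host" >> "$out"
    done
}

write_hostfile hosts.txt dell-m4800 dell-t7500
cat hosts.txt
# Then, on the cluster:
#   mpirun --mca btl_tcp_if_include "10.42.0.0/16" --hostfile hosts.txt -np 8 testOMPI 10000000
```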
(C) Install nVidia GPU Driver and Cuda-7.5 SDK
1. Check the VGA hardware information
$ lspci | grep VGA -->
01:00.0 VGA compatible controller: NVIDIA Corporation GK106GLM [Quadro K2100M] (rev a1)
2. Install nVidia GPU driver and Cuda-7.5:
(a) Clean up all pre-installed nVidia-drivers
$ sudo apt-get purge "nvidia-*" "libcuda1-*"
$ sudo apt-get autoremove && sudo apt-get update
$ sudo apt-get install dkms build-essential linux-headers-generic
$ sudo add-apt-repository --remove ppa:xorg-edgers/ppa
(b) Successful installation of the nvidia driver depends on the Linux kernel. Thus, when you update the Linux kernel, you may need to re-install the nvidia driver.
First, check your current Linux kernel:
$ uname -r --> 4.4.0-24-generic
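Because the driver module is built against the running kernel, it can help to confirm that the matching kernel headers are present before installing. A minimal sketch, assuming the usual Ubuntu layout where /lib/modules/<release>/build points at the headers; the helper name headers_present and its optional root-prefix argument are hypothetical.

```shell
#!/bin/sh
# Success if headers for kernel release $1 appear to be installed.
# $2 is an optional root prefix, only useful for trying this on a scratch tree.
headers_present() {
    [ -e "${2:-}/lib/modules/$1/build" ]
}

rel=$(uname -r)
if headers_present "$rel"; then
    echo "headers found for $rel"
else
    echo "no headers for $rel; try: sudo apt-get install linux-headers-$rel"
fi
```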
(c) Install and check the Nvidia graphics driver.
Install the Nvidia-361 driver on Ubuntu 16.04
I). Launch a web browser and go to http://www.ubuntuupdates.org/package/core/xenial/restricted/proposed/nvidia-352
Click the red "APT INSTALL" button and follow the instructions.
II). Check the installed nvidia graphics card driver
$ cd /usr/lib
$ ls -lh | grep nvidia
-rw-r--r-- 1 root root 1.7M Thursday 5 22:35 libnvidia-gtk2.so.361.42
-rw-r--r-- 1 root root 1.7M Thursday 5 22:35 libnvidia-gtk3.so.361.42
lrwxrwxrwx 1 root root 53 Saturday 2 10:37 libvdpau_nvidia.so -> /etc/alternatives/x86_64-linux-gnu_libvdpau_nvidia.so
drwxr-xr-x 2 root root 4.0K Saturday 1 15:58 nvidia
drwxr-xr-x 6 root root 4.0K Saturday 2 10:37 nvidia-361
drwxr-xr-x 2 root root 4.0K Saturday 2 10:37 nvidia-361-prime
III). Reboot
$ sudo shutdown -r now
IV). Check the nvidia settings.
Use the CLI to query which graphics driver is in use:
$ prime-select query
nvidia
Select nvidia driver:
$ sudo prime-select nvidia
Info: the current GL alternatives in use are: ['nvidia-361', 'nvidia-361']
Info: the current EGL alternatives in use are: ['nvidia-361', 'nvidia-361']
Info: the nvidia profile is already in use.
(d) Once your NVIDIA driver is set, install CUDA 7.5 from the ".run" installer:
I). Prepare a clean environment for installing CUDA
$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo shutdown -r now
II). After the reboot, install the relevant dependencies:
$ sudo apt-get install ca-certificates-java default-jre default-jre-headless fonts-dejavu-extra freeglut3 freeglut3-dev java-common libatk-wrapper-java libatk-wrapper-java-jni libdrm-dev libgl1-mesa-dev libglu1-mesa-dev libgnomevfs2-0 libgnomevfs2-common libice-dev libpthread-stubs0-dev libsctp1 libsm-dev libx11-dev libx11-doc libx11-xcb-dev libxau-dev libxcb-dri2-0-dev libxcb-dri3-dev libxcb-glx0-dev libxcb-present-dev libxcb-randr0-dev libxcb-render0-dev libxcb-shape0-dev libxcb-sync-dev libxcb-xfixes0-dev libxcb1-dev libxdamage-dev libxdmcp-dev libxext-dev libxfixes-dev libxi-dev libxmu-dev libxmu-headers libxshmfence-dev libxt-dev libxxf86vm-dev lksctp-tools mesa-common-dev x11proto-damage-dev x11proto-dri2-dev x11proto-fixes-dev x11proto-gl-dev x11proto-input-dev x11proto-kb-dev x11proto-xext-dev x11proto-xf86vidmode-dev xorg-sgml-doctools xtrans-dev libgles2-mesa-dev nvidia-modprobe build-essential
III). Download "cuda_7.5.18_linux.run" from https://developer.nvidia.com/cuda-downloads
On Ubuntu 16.04, it is necessary to install CUDA manually from the ".run" file. Dependencies that would normally be pulled in by a packaged CUDA install have to be added by hand, as in the previous step.
$ cd ~/Downloads
$ sudo bash cuda_7.5.18_linux.run --override
The "--override" is needed so you don't get the error, "Toolkit: Installation Failed. Using unsupported Compiler."
--> Do you accept the previously read EULA? (accept/decline/quit): accept
You are attempting to install on an unsupported configuration. Do you wish to continue? ((y)es/(n)o) [ default is no ]: y
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 352.39? ((y)es/(n)o/(q)uit): n
Install the CUDA 7.5 Toolkit? ((y)es/(n)o/(q)uit): y
Enter Toolkit Location [ default is /usr/local/cuda-7.5 ]:
Do you want to install a symbolic link at /usr/local/cuda? ((y)es/(n)o/(q)uit): y
Install the CUDA 7.5 Samples? ((y)es/(n)o/(q)uit): y
Enter CUDA Samples Location [ default is /home/kinghorn ]: /usr/local/cuda-7.5
Installing the CUDA Toolkit in /usr/local/cuda-7.5 ...
Finished copying samples.
= Summary =
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-7.5
Samples: Installed in /usr/local/cuda-7.5
Add a couple of system files.
$ sudo nano /etc/profile.d/cuda.sh
Add 'export PATH=$PATH:/usr/local/cuda/bin' to the file and save it. The cuda.sh script is then sourced each time you log in.
$ sudo nano /etc/ld.so.conf.d/cuda.conf
Add '/usr/local/cuda/lib64' into the file and save it. Run:
$ sudo ldconfig
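The two files can also be written without an editor. The sketch below does a dry run into a scratch directory so the result can be inspected first; the helper name write_cuda_env and its root-prefix argument are hypothetical.

```shell
#!/bin/sh
# Write the two CUDA config files under an optional root prefix.
# With no argument the real /etc is targeted, which needs root.
write_cuda_env() {
    root="${1:-}"
    mkdir -p "$root/etc/profile.d" "$root/etc/ld.so.conf.d"
    # Single quotes keep $PATH literal so it is expanded at login, not now.
    printf '%s\n' 'export PATH=$PATH:/usr/local/cuda/bin' > "$root/etc/profile.d/cuda.sh"
    printf '%s\n' '/usr/local/cuda/lib64' > "$root/etc/ld.so.conf.d/cuda.conf"
}

# Dry run into a scratch directory to inspect the result.
scratch=$(mktemp -d)
write_cuda_env "$scratch"
cat "$scratch/etc/profile.d/cuda.sh" "$scratch/etc/ld.so.conf.d/cuda.conf"
```

If the output looks right, the same two lines can be copied into the real files as root, followed by "sudo ldconfig".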
Note: gcc-6 will not work with nvcc. Force cuda to work with gcc 5.
$ sudo nano /usr/local/cuda/include/host_config.h
Go to line 115 and comment out the #error directive:
// #error -- unsupported GNU version! gcc versions later than 4.9 are not supported!
$ sudo nano /usr/local/cuda/samples/common/findgllib.mk
Go to line 61 and change nvidia-352 to nvidia-361:
UBUNTU_PKG_NAME = "nvidia-361"
$ sudo nano /usr/lib/gcc/x86_64-linux-gnu/5/include/x86intrin.h
comment out: // #include
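The host_config.h edit above can also be made non-interactively by matching the text of the #error line rather than relying on its line number. This is a sketch only: the helper name disable_gcc_check is hypothetical, and the three-line demo file is a simplified stand-in, not the real header.

```shell
#!/bin/sh
# Comment out the "unsupported GNU version" #error line in the given file,
# keeping a .bak copy. Matching on the text avoids depending on line 115.
disable_gcc_check() {
    f="$1"
    cp "$f" "$f.bak" &&
    sed 's|^\([[:space:]]*#error -- unsupported GNU version.*\)$|// \1|' "$f.bak" > "$f"
}

# Demonstration on a scratch file; the real target would be
# /usr/local/cuda/include/host_config.h (run with sudo).
demo=$(mktemp)
printf '%s\n' '#if __GNUC__ > 4' \
    '#error -- unsupported GNU version! gcc versions later than 4.9 are not supported!' \
    '#endif' > "$demo"
disable_gcc_check "$demo"
cat "$demo"
```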
3. Test the Cuda-7.5 SDK
(a) Copy cuda programs to the user directory and compile the programs:
$ cp -a /usr/local/cuda/samples ~/cuda_samples
$ cd ~/cuda_samples/
Specify GLPATH=/usr/lib on the make line ($n is the core count from step (B).2; recompute it if you are in a new shell):
$ GLPATH=/usr/lib make -j $n
(b) After compilation, test the programs with
$ cd ~/cuda_samples/bin/x86_64/linux/release/
$ ./deviceQuery
$ ./nbody -benchmark -numbodies=128000
--> Compute 3.0 CUDA device: [Quadro K2100M]
number of bodies = 128000
128000 bodies, total time for 10 iterations: 9678.305 ms
= 16.929 billion interactions per second
= 338.572 single-precision GFLOP/s at 20 flops per interaction
$ ./particles -benchmark
--> grid: 64 x 64 x 64 = 262144 cells
particles: 16384
Run 16384 particles simulation for 300 iterations...
particles, Throughput = 8682.0585 KParticles/s, Time = 0.00189 s, Size = 16384 particles, NumDevsUsed = 1, Workgroup = 0
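The nbody figures above can be cross-checked by hand: each iteration evaluates bodies x bodies pairwise interactions, and the benchmark counts 20 flops per interaction. A small sketch of that arithmetic, using the numbers reported above (the helper name nbody_rate is hypothetical):

```shell
#!/bin/sh
# nbody evaluates bodies^2 pairwise interactions per iteration.
# Args: number of bodies, iterations, elapsed time in milliseconds.
nbody_rate() {
    awk -v n="$1" -v it="$2" -v ms="$3" 'BEGIN {
        rate = n * n * it / (ms / 1000)          # interactions per second
        printf "%.3f billion interactions/s\n", rate / 1e9
        printf "%.3f GFLOP/s at 20 flops per interaction\n", rate * 20 / 1e9
    }'
}

nbody_rate 128000 10 9678.305
```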