... | @@ -98,9 +98,11 @@ NOTE: Make sure you download the cuda-samples repository that matches the cuda v |

## Mellanox Drivers - MLNX OFED

For the Mellanox PCIe card, you will need to install the drivers specific to the aarch64 architecture and Tegra kernel. The information at https://gitlab.ras.byu.edu/alpaca/wiki/-/wikis/Unix-Networking under Installing MLNX OFED still applies, with some minor adjustments to the commands. The driver ISO file can be found at https://network.nvidia.com/products/infiniband-drivers/linux/mlnx_ofed/. For the University Node, you will need to install version 23.10-3.2.2.0 of the driver; the drop-down menu should include this version. As for the change in commands, with the NIC installed, run the following as sudo to mount the image and then run the installer:

```
mount -o ro,loop <.iso> /mnt
cd /mnt/
./mlnxofedinstall --without-dkms --add-kernel-support --kernel <kernel version> --without-fw-update --force --enable-gds
```

If at any point this command fails, you'll have to debug why it didn't complete. It will most likely be due to conflicting system packages. The `--force` flag should resolve this, but if it does not, take caution in removing other packages, as CUDA might depend on some of them.

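The `<kernel version>` placeholder in the install command is not spelled out above. A minimal sketch, assuming the driver should be built for the kernel that is currently running, is to take the value from `uname -r`. The helper name `build_ofed_install_cmd` is hypothetical; it only prints the command so that it can be reviewed before being run as sudo from `/mnt`:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MLNX OFED): prints the installer command
# from this section with the kernel version filled in. Nothing is installed.
build_ofed_install_cmd() {
    local kernel_version="$1"
    printf './mlnxofedinstall --without-dkms --add-kernel-support --kernel %s --without-fw-update --force --enable-gds\n' "$kernel_version"
}

# Assumption: the target kernel is the one currently booted.
build_ofed_install_cmd "$(uname -r)"
```

If the driver has to target a different kernel than the running one, pass that version string to the helper instead of `$(uname -r)`.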
... | @@ -193,3 +195,31 @@ Save then reboot the system. Post reboot you will want to check whether this cha |

If this doesn't return what is placed in the APPEND key above, then one of the previous steps was done incorrectly. If it does match, this process is complete.

## Hashpipe

Hashpipe is the pipeline firmware running on the Orin that enables data acquisition. For the University Node, the shared libraries and executables are provided. The following section shows how to start Hashpipe for data acquisition.

### Starting Hashpipe

To run Hashpipe you must be the superuser (su). In addition, you will need to set the `LD_LIBRARY_PATH` environment variable to point to the shared libraries required by the executable.

As su, change directory to the bin/ directory and run the following on the command line:

```
export LD_LIBRARY_PATH=/<hashpipe_root_dir>/lib/aarch64-linux-gnu
```

To avoid library lookup problems, it is recommended to always use an absolute path.

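As a hedged sketch of the absolute-path recommendation, the function below refuses a relative root and checks that the library directory exists before exporting the variable. The function name `set_hashpipe_lib_path` is hypothetical and not part of Hashpipe; the `lib/aarch64-linux-gnu` suffix is taken from the export line above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hashpipe): exports LD_LIBRARY_PATH only
# when the given Hashpipe root is an absolute path and the library
# directory underneath it exists.
set_hashpipe_lib_path() {
    local root="$1"
    case "$root" in
        /*) ;;  # absolute path, accepted
        *)
            echo "error: '$root' is not an absolute path" >&2
            return 1
            ;;
    esac
    local lib_dir="$root/lib/aarch64-linux-gnu"
    if [ ! -d "$lib_dir" ]; then
        echo "error: '$lib_dir' does not exist" >&2
        return 1
    fi
    export LD_LIBRARY_PATH="$lib_dir"
}
```

Call it in the same shell that will start Hashpipe, passing the real Hashpipe root directory as the only argument; the exported value does not carry over to other terminals.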

Once the environment variable is set you can run Hashpipe. The only command that needs to be run is the following:

```
./hashpipe -p libtest_rx_hashpipe.so -I 0 -o IBVPKTSZ=8298 -o IBVIFACE=eth0 -o IBVSNIFF=1 -o DATADIR=<path_for_data> -o FILENAM=<file_name>.bin -c 0 ibvpkt_thread -c 1 disk_async_t
```

The only parts of this command that should be changed are the values inside the `<>`. `DATADIR` is the path where the data will be written, and `FILENAM` is the name of the file that will be written.

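To make the two substitutions explicit, here is a small sketch that fills in the `DATADIR` and `FILENAM` values and prints the resulting command. The helper name `build_hashpipe_cmd` and the example values `/data/capture` and `run01` are hypothetical; every other option is copied unchanged from the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hashpipe): prints the Hashpipe command
# from this section with the data directory and file name filled in.
# Only the two placeholder values change.
build_hashpipe_cmd() {
    local data_dir="$1"   # directory the data will be written to
    local file_name="$2"  # output file name, without the .bin extension
    printf './hashpipe -p libtest_rx_hashpipe.so -I 0 -o IBVPKTSZ=8298 -o IBVIFACE=eth0 -o IBVSNIFF=1 -o DATADIR=%s -o FILENAM=%s.bin -c 0 ibvpkt_thread -c 1 disk_async_t\n' "$data_dir" "$file_name"
}

# Example values only; substitute the real data path and file name.
build_hashpipe_cmd /data/capture run01
```

Printing the command first gives a chance to check the data path and file name before running it as su from the bin/ directory.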
### Monitoring Hashpipe

To see what Hashpipe is doing, it is useful to check specific status values in the pipeline, such as the data rate or the number of disk writes that have occurred. To monitor Hashpipe, open a separate terminal in the same bin/ directory and enter the following command:

```
watch -n 1 "./hashpipe_check_status -I 0 -v | fold -w 80"
```