# Orin Board Setup and Installation

To set up the Orin board with the proper drivers and packages, I took the following steps.

## System Flashing

Using a host machine running Ubuntu 22.04, I installed the NVIDIA SDK Manager and flashed the Orin board with the latest Orin board SDK image. The flash must be done on a host machine that is running Ubuntu natively. We tried VMs and Docker, but neither was able to complete the flash process. For what we are doing I only installed the CUDA drivers and runtime/utils, which is the minimal set of packages in the flash. Pay attention to which version is being installed, as it matters for the later steps. The following websites were used as instructions for the SDK Manager, and most if not all of their instructions were followed.

You will need a USB-C to USB-A serial cable to complete the flashing process. The instructions on how to set this up are also found in the provided links.

https://developer.nvidia.com/embedded/learn/jetson-agx-orin-devkit-user-guide/tw

NOTE: For flashing you will need to put the Orin board into recovery mode, which is also covered in the how-to-install web link.

## Post Flash

After flashing, confirm that CUDA and its drivers are loaded and up to date and that CUDA can run. The current version runs CUDA 12.2 with its matching runtime library, CUDART. You can check this by downloading the CUDA samples from NVIDIA, or by going to where CUDA was installed and running the `deviceQuery` executable included with the install.

NOTE: Make sure you download the version of the cuda-samples repository that matches the CUDA version you are running. This is done by checking out the Git tag for that version.

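As a minimal sketch of the tag selection described in the NOTE above: the helper below maps a CUDA version to a cuda-samples tag. It assumes the repository names its tags `v<major>.<minor>` (for example `v12.2`), which may not hold for every release. The function name `cuda_samples_tag` is hypothetical, and the clone and build commands in the trailing comments assume that the samples at that tag build with `make`.

```shell
#!/bin/bash
# Map a CUDA version such as "12.2" or "12.2.140" to a cuda-samples Git tag.
# Assumption: tags are named "v<major>.<minor>".
cuda_samples_tag() {
    version="$1"
    # Keep only the major.minor part, e.g. "12.2.140" -> "12.2"
    major_minor="$(echo "$version" | cut -d. -f1,2)"
    echo "v${major_minor}"
}

# Example usage (left commented out so nothing is downloaded by accident):
#   tag="$(cuda_samples_tag 12.2)"
#   git clone --branch "$tag" --depth 1 https://github.com/NVIDIA/cuda-samples.git
#   cd cuda-samples/Samples/1_Utilities/deviceQuery
#   make && ./deviceQuery
```

If `deviceQuery` lists the Orin GPU and reports a CUDA runtime version that matches the flashed image, the CUDA install can be considered working.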
## Mellanox Drivers - MLNX OFED

For the Mellanox PCIe card you will need to install the drivers built specifically for the aarch64 architecture and the Tegra kernel. All of the information at https://gitlab.ras.byu.edu/alpaca/wiki/-/wikis/Unix-Networking under Installing MLNX OFED still applies, with some minor adjustments to the commands. The driver ISO file can be found at https://network.nvidia.com/products/infiniband-drivers/linux/mlnx_ofed/. For the University Node you will need to install version 23.10-3.2.2.0 of the driver, which should be listed in the drop-down menu. With the NIC installed, run the following adjusted commands as sudo to mount the image and start the install:

`mount -o ro,loop <.iso> /mnt`
`cd /mnt/`
`./mlnxofedinstall --without-dkms --add-kernel-support --kernel <kernel version> --without-fw-update --force --enable-gds`

If this command fails at any point, you will have to debug why it didn't complete. The most likely cause is conflicting system packages. The `--force` flag should resolve this, but if it does not, be cautious about removing other packages, as CUDA might depend on some of them.

After the command reports that it has successfully installed the drivers and MLNX_OFED dependencies, reboot the system.

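For the install step above, the `<kernel version>` placeholder is most likely the version of the running Tegra kernel. The following sketch assumes that `uname -r` gives the right value, which the original instructions do not state. The helper name `ofed_install_cmd` is hypothetical. It only prints the command so that it can be reviewed before being run as root from the mounted image.

```shell
#!/bin/bash
# Print the mlnxofedinstall command line for a given kernel version.
# Assumption: <kernel version> is the running kernel, as reported by `uname -r`.
ofed_install_cmd() {
    # Use the first argument if given, otherwise fall back to the running kernel
    kernel="${1:-$(uname -r)}"
    echo "./mlnxofedinstall --without-dkms --add-kernel-support --kernel ${kernel} --without-fw-update --force --enable-gds"
}

# Example usage: print the command for the running kernel
#   ofed_install_cmd
```

Printing the command first makes it easy to confirm that the kernel string ends in `-tegra` before committing to a long install.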
## Post Driver Install

After the reboot, check that everything started correctly and that the MLNX NIC is recognized along with its ports. This can be done by running the following command-line tools:

`sudo dpkg -l | grep -i mlnx`
`sudo lsmod | grep mlx`
`sudo ibv_devinfo`

These commands will check:
1. If the Mellanox packages were installed
2. If the kernel modules are present
3. If the InfiniBand library/devices are working

If the `ibv_devinfo` command returns no devices even though the kernel modules are installed and loaded, you might need to physically reinstall the NIC in the PCIe slot and reboot.

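The three checks above can be wrapped in a small pass/fail helper. This is only a sketch. The function name `run_check` is hypothetical, and the example for the third check assumes that `ibv_devinfo` exits with a non-zero status when no devices are found.

```shell
#!/bin/bash
# Run a command quietly and report PASS or FAIL based on its exit status.
run_check() {
    description="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS: ${description}"
    else
        echo "FAIL: ${description}"
        return 1
    fi
}

# Example usage on the Orin board (run as root or with sudo):
#   run_check "Mellanox packages installed" sh -c 'dpkg -l | grep -qi mlnx'
#   run_check "mlx kernel modules loaded"   sh -c 'lsmod | grep -q mlx'
#   run_check "InfiniBand devices visible"  ibv_devinfo
```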
From this point the NIC should be addressable on a network. To set the device up for the University Node and data streaming, several system services need to be installed.

## 'network-config' service

This service configures the NIC. The configuration can be changed to best fit the user's needs, but instructions for doing so are not provided in this document.

To create the service, run the following:

`sudo touch /etc/systemd/system/network-config.service`

Then, with your favorite text editor, enter the following into the file that you just created:

```
[Unit]
Description=Network Configuration Service
After=network.target

[Service]
Type=oneshot
ExecStart=/bin/bash /usr/local/bin/network-config.sh
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```

Next, create the network-config.sh script that will be run by the service:

`sudo touch /usr/local/bin/network-config.sh`

Again in your text editor, enter the following into the script:

```
#!/bin/bash

# Static address and jumbo frames on the NIC
ip address add 192.168.2.100/24 dev eth0
ip link set eth0 mtu 9000
# Reserve 32 huge pages of 1 GiB each on NUMA node 0
echo 32 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
# Offloads, RX ring size, and interrupt coalescing
ethtool -K eth0 tso off
ethtool -K eth0 gro on
ethtool -G eth0 rx 8192
ethtool -K eth0 lro on
ethtool -C eth0 rx-usecs 0 rx-frames 128
ethtool -K eth0 receive-hashing off
ethtool -K eth0 rx-udp-gro-forwarding on
# Raise the maximum socket send buffer size to 64 MiB
sysctl -w net.core.wmem_max=67108864
```

Finally, run the following commands in order:

```
sudo chmod +x /usr/local/bin/network-config.sh
sudo systemctl daemon-reload
sudo systemctl enable network-config.service
sudo systemctl start network-config.service
```

After all of the above steps have been completed, the NIC will be configured, and the configuration will be reapplied every time the system boots.

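To confirm that the service actually applied its settings, one option is to check the service state and read back the MTU. The sketch below assumes the interface is named `eth0`, as in the script above. The helper name `mtu_from_ip_link` is hypothetical. It parses the output of `ip link show`, so it can be tried on captured output as well.

```shell
#!/bin/bash
# Extract the MTU value from `ip link show <dev>` output read on stdin.
mtu_from_ip_link() {
    sed -n 's/.* mtu \([0-9][0-9]*\) .*/\1/p' | head -n 1
}

# Example usage on the Orin board:
#   systemctl is-enabled network-config.service   # expect "enabled"
#   systemctl is-active network-config.service    # expect "active" (oneshot with RemainAfterExit=yes)
#   ip link show eth0 | mtu_from_ip_link          # expect 9000
#   cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
```

The last command reads back the number of 1 GiB huge pages the kernel was actually able to reserve, which can be lower than the requested 32 if memory is fragmented.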
## Boot configurations

For the data pipeline to run, we need isolated CPUs and HugeTLB pages for correct memory allocation. This requires changing the boot parameters of the system.

With your text editor, open the file `/boot/extlinux/extlinux.conf`. Follow the instructions given in the comments of that file for creating a backup, then create a new entry above the comments as follows:

```
LABEL primary
      MENU LABEL primary kernel
      LINUX /boot/Image
      INITRD /boot/initrd
      APPEND ${cbootargs} root=PARTUUID=6d40c020-58ba-48ec-ac31-606f57963602 rw rootwait rootfstype=ext4 mminit_loglevel=4 console=ttyTCU0,115200 console=ttyAMA0,115200 firmware_class.path=/etc/firmware fbcon=map:0 net.ifnames=0 nospectre_bhb video=efifb:off console=tty0 nv-auto-config isolcpus=0 default_hugepagesz=1G hugepagesz=1G iommu=on
```

Save the file, then reboot the system. After the reboot, check whether the change took effect by looking at the kernel command line that was actually used at boot. To do this, enter the following on the command line:

`cat /proc/cmdline`

If the output does not contain what was placed in the APPEND key above, then one of the previous steps was done incorrectly. If it does match, this process is complete.

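Comparing the long command line by eye is error-prone, so the check can be scripted. The sketch below prints any expected parameter that is missing from a given kernel command line. The helper name `missing_boot_params` is hypothetical, and the parameters in the example are the four added to the APPEND line above.

```shell
#!/bin/bash
# Print each expected parameter that does not appear as a whole word
# in the given kernel command line. Prints nothing if all are present.
missing_boot_params() {
    cmdline="$1"
    shift
    for param in "$@"; do
        # Pad with spaces so "hugepagesz=1G" does not match inside
        # "default_hugepagesz=1G"
        case " ${cmdline} " in
            *" ${param} "*) ;;
            *) echo "${param}" ;;
        esac
    done
}

# Example usage on the Orin board:
#   missing_boot_params "$(cat /proc/cmdline)" \
#       isolcpus=0 default_hugepagesz=1G hugepagesz=1G iommu=on
```

An empty result means all of the listed parameters were picked up at boot.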