Apply for an account using the signup link. On the “Personal Information” panel, click “Join Existing Project” and fill in our project name, “AOS-UFMG-DCC831”. We will approve your application shortly.
We are going to use m510 machines, each equipped with an eight-core Intel Xeon D-1548 2.0 GHz CPU, 64 GB of ECC memory, 256 GB of NVMe flash storage, and a dual-port Mellanox ConnectX-3 10 Gbps NIC. You can find the detailed hardware description and how the machines are interconnected here, and current availability here. In our experience, m510 machines have ample availability most of the time, but start the assignment early so that availability issues do not make you miss the deadline.
The Mellanox ConnectX-3 requires the MLX4 poll mode driver library (librte_pmd_mlx4) so that DPDK can poll packets directly from the NIC. See the detailed documentation here. To enable the mlx4 driver, you first need to install Mellanox OFED (OpenFabrics Enterprise Distribution) on the machines:
$ sudo apt-get update
$ sudo apt-get install libnuma-dev libnl-3-dev libnl-route-3-dev
$ wget http://content.mellanox.com/ofed/MLNX_OFED-4.6-1.0.1.1/MLNX_OFED_LINUX-4.6-1.0.1.1-ubuntu18.04-x86_64.tgz
$ tar -xvzf MLNX_OFED_LINUX-4.6-1.0.1.1-ubuntu18.04-x86_64.tgz
$ cd MLNX_OFED_LINUX-4.6-1.0.1.1-ubuntu18.04-x86_64
$ sudo ./mlnxofedinstall --upstream-libs --dpdk
$ sudo /etc/init.d/openibd restart
$ ibv_devinfo
If the installation succeeds, ibv_devinfo will show that two ports are available on the machine. On an m510 machine, the first port (port 0 in DPDK) is used for the public (inter-cluster) connection, and the second port (port 1 in DPDK) is used for the private (intra-cluster) connection. To measure the latency between the two machines, we are going to use port 1 of the NIC.
Now that you have OFED installed, you are ready to build the DPDK library.
$ git clone https://github.com/DPDK/dpdk
$ cd dpdk
$ git checkout releases
$ vim config/common_base (set CONFIG_RTE_LIBRTE_MLX4_PMD=y on line 366)
$ make config T=x86_64-native-linuxapp-gcc
$ make -j16
$ echo 1024 | sudo tee /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
Congratulations! Your DPDK library is now ready to use. Before going on, you are encouraged to compile and play with DPDK’s sample applications in the examples folder of DPDK. Their documentation is here. When you compile your own code, don’t forget to link against dpdk/build/lib and add dpdk/build/include to your include path.
Warning: Your DPDK configuration (and any code you develop on the CloudLab machines) will be deleted when the experiment ends (~16 hours by default). Consider storing a copy of your code on GitHub (using a private repo) or on a personal machine. CloudLab can also make a disk image of your machine (a type of snapshot) to save your progress in installing the packages described above. Note that your home directory will not be saved in disk images, but you can place data to be included in the image in /opt.
In this assignment, you will build a server that can respond to ICMP echoes using DPDK. Since DPDK works directly with Ethernet frames, you will have to manually parse and modify the IP and ICMP headers of echo requests. DPDK uses the struct rte_mbuf data structure to store packet buffers. The programming guide explains this data structure here.
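For example, the headers of a received frame can be addressed in place through the mbuf’s data pointer. The short sketch below is only an illustration: it assumes a DPDK 19.x-era checkout (where the header structs are named rte_ether_hdr and rte_ipv4_hdr), an IPv4 packet, and a single-segment mbuf, and the helper name ipv4_header is ours, not DPDK’s.

#include <rte_mbuf.h>
#include <rte_ether.h>
#include <rte_ip.h>

/* Return a pointer to the IPv4 header inside a received mbuf.
 * rte_pktmbuf_mtod points at the start of the packet data (the Ethernet
 * header); the IPv4 header follows the 14-byte Ethernet header. */
static inline struct rte_ipv4_hdr *
ipv4_header(struct rte_mbuf *m)
{
        struct rte_ether_hdr *eth = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
        return (struct rte_ipv4_hdr *)(eth + 1);
}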
If you find it hard to get started, use the simple L2 forwarding example (examples/skeleton) we discussed in the tutorial as a starting point. You will need to change the port initialization code, since you will only use port 1 in this experiment (remember that port 0 is needed for Linux and ssh). You also have to modify lcore_main() to incorporate your packet parsing and crafting logic into its run-to-completion loop; a sketch that puts these pieces together appears after the hints below. You might find the following macros and functions useful; you can find them in DPDK’s API documentation.
rte_pktmbuf_alloc
rte_pktmbuf_mtod
rte_cpu_to_be_16
rte_be_to_cpu_16
rte_cpu_to_be_32
rte_be_to_cpu_32
RTE_IPV4
rte_is_same_ether_addr
rte_ether_addr_copy
rte_eth_rx_burst
rte_eth_tx_burst
Hint: It may be easier to modify each received packet buffer in place before sending it, rather than creating a new packet buffer.
Hint: IP and ICMP checksums must be updated if you modify the packet. DPDK provides several functions to help calculate checksums.
Hint: See RFC 792, ICMP echo, for more details about how ping works.
Hint: Make sure your Ethernet header contains the right MAC addresses.
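Putting the skeleton loop and the hints above together, one possible shape for the per-burst logic is sketched below. This is only a sketch under assumptions, not a reference solution: it assumes a DPDK 19.x-era checkout (names such as rte_ether_hdr, rte_ipv4_hdr, rte_icmp_hdr, RTE_ETHER_TYPE_IPV4, RTE_IP_ICMP_ECHO_REQUEST), single-segment mbufs, 20-byte IP headers (no options), and no fragmentation, and it does minimal error handling.

#include <netinet/in.h>          /* IPPROTO_ICMP */
#include <rte_byteorder.h>
#include <rte_ethdev.h>
#include <rte_ether.h>
#include <rte_icmp.h>
#include <rte_ip.h>
#include <rte_mbuf.h>

#define ECHO_PORT  1    /* port 1: the private (intra-cluster) NIC port */
#define BURST_SIZE 32

/* One iteration of the run-to-completion loop: receive a burst on port 1,
 * turn every ICMP echo request into an echo reply in place, and send it back. */
static void
echo_burst(void)
{
        struct rte_mbuf *bufs[BURST_SIZE];
        uint16_t nb_rx = rte_eth_rx_burst(ECHO_PORT, 0, bufs, BURST_SIZE);

        for (uint16_t i = 0; i < nb_rx; i++) {
                struct rte_ether_hdr *eth =
                        rte_pktmbuf_mtod(bufs[i], struct rte_ether_hdr *);

                if (eth->ether_type != rte_cpu_to_be_16(RTE_ETHER_TYPE_IPV4)) {
                        rte_pktmbuf_free(bufs[i]);
                        continue;
                }

                struct rte_ipv4_hdr *ip = (struct rte_ipv4_hdr *)(eth + 1);
                struct rte_icmp_hdr *icmp = (struct rte_icmp_hdr *)(ip + 1);

                if (ip->next_proto_id != IPPROTO_ICMP ||
                    icmp->icmp_type != RTE_IP_ICMP_ECHO_REQUEST) {
                        rte_pktmbuf_free(bufs[i]);
                        continue;
                }

                /* Ethernet: the reply goes back to the requester's MAC. */
                struct rte_ether_addr requester = eth->s_addr;
                rte_ether_addr_copy(&eth->d_addr, &eth->s_addr);
                rte_ether_addr_copy(&requester, &eth->d_addr);

                /* IP: swap source and destination, then refresh the header checksum. */
                uint32_t tmp = ip->src_addr;
                ip->src_addr = ip->dst_addr;
                ip->dst_addr = tmp;
                ip->hdr_checksum = 0;
                ip->hdr_checksum = rte_ipv4_cksum(ip);

                /* ICMP: flip the type and recompute the checksum over the
                 * ICMP header plus payload (total_length minus the IP header). */
                icmp->icmp_type = RTE_IP_ICMP_ECHO_REPLY;
                icmp->icmp_cksum = 0;
                uint16_t icmp_len =
                        rte_be_to_cpu_16(ip->total_length) - sizeof(*ip);
                icmp->icmp_cksum = (uint16_t)~rte_raw_cksum(icmp, icmp_len);

                if (rte_eth_tx_burst(ECHO_PORT, 0, &bufs[i], 1) == 0)
                        rte_pktmbuf_free(bufs[i]);
        }
}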
For simplicity, we recommend building an echo server in DPDK, and then using the standard Linux network stack on the other machine to send ICMP echo packets to it (acting as a client), rather than building a separate client in DPDK. You can generate ping packets on the client machine as follows:
$ sudo ifconfig eno1d1 192.168.1.2 netmask 255.255.255.0
$ sudo arp -s 192.168.1.3 [your server's eno1d1 MAC address]
$ sudo ping -f 192.168.1.3
If successful, your echo server will respond to each ping, and you’ll see ping statistics and no packet loss. You may find that tcpdump is a useful tool for debugging whether the client is receiving your server’s responses.
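For example, a standard ICMP capture on the client (using the interface configured above) shows both the outgoing requests and any replies coming back:
$ sudo tcpdump -i eno1d1 icmp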
Now that you have a working server that can respond to ICMP ping requests, you are asked to measure the full time your software and DPDK use to process these requests. You have to figure out the correct code position to add timing API calls.
Hint: You might have to instrument part of the DPDK mlx4 driver.
For an accurate time measurement, you can read the CPU's time stamp counter (TSC) to get the elapsed cycles and calculate the elapsed time as cycles/freq. The sample code below shows how to do that in DPDK.
uint64_t hz = rte_get_timer_hz();
uint64_t begin = rte_rdtsc_precise();
// Do something
uint64_t elapsed_cycles = rte_rdtsc_precise() - begin;
uint64_t microseconds = elapsed_cycles * 1000000 / hz;
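As an illustration only (deciding where the measurement boundaries should go is part of the exercise), the sketch below wraps a per-packet handler with the TSC calls and reports an average. handle_icmp_request() is a hypothetical name standing in for your own processing function.

#include <inttypes.h>
#include <stdio.h>
#include <rte_cycles.h>
#include <rte_mbuf.h>

void handle_icmp_request(struct rte_mbuf *m);   /* hypothetical: your per-packet logic */

static uint64_t total_cycles;
static uint64_t total_pkts;

/* Accumulate the cycles spent in one invocation of the handler. */
static inline void
time_one_request(struct rte_mbuf *m)
{
        uint64_t begin = rte_rdtsc_precise();
        handle_icmp_request(m);
        total_cycles += rte_rdtsc_precise() - begin;
        total_pkts++;
}

/* Print the average processing time in microseconds. */
static void
report_latency(void)
{
        if (total_pkts == 0)
                return;
        double us = (double)total_cycles * 1e6 /
                    ((double)rte_get_timer_hz() * (double)total_pkts);
        printf("average: %.3f us over %" PRIu64 " packets\n", us, total_pkts);
}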
If you want to learn more, this Intel white paper is a good reference. You’re free to use other methods to measure or infer your system’s performance, but make sure to describe your approach.
Part II: Try building and using the latest version of Shenango to send a ping to your DPDK server. See tests/test_ping.c. How does its latency compare to Linux?
** Configurations required to build Shenango on m510 in CloudLab:
$ sudo apt-get install libnuma-dev libaio1 libaio-dev uuid-dev libcunit1 libcunit1-doc libcunit1-dev libmnl-dev
Set CONFIG_MLX4=y in build/config.
In iokernel/dpdk.c, change the port to dp.port = 1.
You should also include your plan for Part III.
Part III: Propose a modification to Shenango and implement it. For example, try lthreads instead of pthreads, propose a solution that lets cores sleep, or propose using multiple IOVisors. It has to be something new.
You should submit a final report following the SBRC2022 template: single column, up to 14 pages. If you do good work, you can submit it to SBRC2022. More info at:
https://www.sbrc2021.facom.ufu.br/?page_id=75