Friday, September 22, 2006

FreeBSD Device Polling

Not all of us work with the latest, greatest hardware. If we use open source software, we often find ourselves running it on old hardware. I have a mix of equipment in my lab and I frequently see what I can do with it.

In this post I'd like to talk about some simple network performance measurement testing. Some of this is based on the book Network Performance Toolkit: Using Open Source Testing Tools. I don't presume that any of this is definitive, novel, or particularly helpful for all readers. I welcome constructive ideas for improvements.

For the purposes of this post, I'd like to get a sense of the network throughput between two hosts, asa633 and poweredge.

This is asa633's dmesg output:

FreeBSD 6.1-RELEASE-p6 #0: Wed Sep 20 20:02:56 EDT 2006
root@kbld.taosecurity.com:/usr/obj/usr/src/sys/GENERIC.SECURITY
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel Celeron (631.29-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x686 Stepping = 6
Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory = 334233600 (318 MB)
avail memory = 317620224 (302 MB)

This is poweredge's dmesg output:

FreeBSD 6.1-RELEASE-p6 #0: Wed Sep 20 20:02:56 EDT 2006
root@kbld.taosecurity.com:/usr/obj/usr/src/sys/GENERIC.SECURITY
ACPI APIC Table: <DELL PE2300 >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Pentium III/Pentium III Xeon/Celeron (498.75-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x673 Stepping = 3
Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory = 536862720 (511 MB)
avail memory = 515993600 (492 MB)

Neither system has any tuning applied.

Each box has the following relevant interfaces.

asa633:/root# ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
options=b<RXCSUM,TXCSUM,VLAN_MTU>
inet6 fe80::204:23ff:feb1:64e2%em0 prefixlen 64 scopeid 0x3
inet 172.16.6.1 netmask 0xffffff00 broadcast 172.16.6.255
ether 00:04:23:b1:64:e2
media: Ethernet autoselect (1000baseSX )
status: active
asa633:/root# ifconfig em1
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
options=b<RXCSUM,TXCSUM,VLAN_MTU>
inet6 fe80::20e:cff:feba:e726%em1 prefixlen 64 scopeid 0x4
inet 172.16.7.1 netmask 0xffffff00 broadcast 172.16.7.255
ether 00:0e:0c:ba:e7:26
media: Ethernet autoselect (1000baseTX )
status: active

poweredge:/root# ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
options=b<RXCSUM,TXCSUM,VLAN_MTU>
inet6 fe80::204:23ff:feab:964%em0 prefixlen 64 scopeid 0x2
inet 172.16.6.2 netmask 0xffffff00 broadcast 172.16.6.255
ether 00:04:23:ab:09:64
media: Ethernet autoselect (1000baseSX )
status: active
poweredge:/root# ifconfig em1
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
options=b<RXCSUM,TXCSUM,VLAN_MTU>
inet6 fe80::207:e9ff:fe11:a0a0%em1 prefixlen 64 scopeid 0x4
inet 172.16.7.2 netmask 0xffffff00 broadcast 172.16.7.255
ether 00:07:e9:11:a0:a0
media: Ethernet autoselect (1000baseTX )
status: active

The 172.16.6.0/24 interfaces are connected directly via fiber. The 172.16.7.0/24 interfaces are connected directly via copper.

With this setup, let's use Iperf to transmit and receive traffic.

Poweredge runs the server, but let's show the client first.

asa633:/root# iperf -c 172.16.6.2 -t 60 -i 5
------------------------------------------------------------
Client connecting to 172.16.6.2, TCP port 5001
TCP window size: 32.5 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.6.1 port 52453 connected with 172.16.6.2 port 5001
[ 3] 0.0- 5.0 sec 82.1 MBytes 138 Mbits/sec
[ 3] 5.0-10.0 sec 83.4 MBytes 140 Mbits/sec
[ 3] 10.0-15.0 sec 83.6 MBytes 140 Mbits/sec
[ 3] 15.0-20.0 sec 83.6 MBytes 140 Mbits/sec
[ 3] 20.0-25.0 sec 83.5 MBytes 140 Mbits/sec
[ 3] 25.0-30.0 sec 84.2 MBytes 141 Mbits/sec
[ 3] 30.0-35.0 sec 85.4 MBytes 143 Mbits/sec
[ 3] 35.0-40.0 sec 85.7 MBytes 144 Mbits/sec
[ 3] 40.0-45.0 sec 86.8 MBytes 146 Mbits/sec
[ 3] 45.0-50.0 sec 88.8 MBytes 149 Mbits/sec
[ 3] 50.0-55.0 sec 90.6 MBytes 152 Mbits/sec
[ 3] 55.0-60.0 sec 91.6 MBytes 154 Mbits/sec
[ 3] 0.0-60.0 sec 1.01 GBytes 144 Mbits/sec

Here is the server's view.

poweredge:/root# iperf -s -B 172.16.6.2
------------------------------------------------------------
Server listening on TCP port 5001
Binding to local address 172.16.6.2
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[ 4] local 172.16.6.2 port 5001 connected with 172.16.6.1 port 52453
[ 4] 0.0-60.0 sec 1.01 GBytes 144 Mbits/sec
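
As a side note on reading these reports: the figures above are consistent with Iperf counting transfer sizes in binary units (1 MByte = 2^20 bytes) and rates in decimal units (1 Mbit = 10^6 bits). The following is a minimal Python sketch of that conversion, not anything Iperf itself exposes; the function name is my own.

```python
# Assumption: transfer sizes are binary (2**20 bytes per MByte) and
# rates are decimal (10**6 bits per Mbit), which matches the report lines.
MBYTE = 2**20
GBYTE = 2**30


def mbits_per_sec(num_bytes, seconds):
    """Convert a byte count over an interval to decimal megabits per second."""
    return num_bytes * 8 / seconds / 1e6


# First 5-second interval of the client report: 82.1 MBytes.
first_interval = mbits_per_sec(82.1 * MBYTE, 5.0)

# Whole 60-second run: 1.01 GBytes (a rounded figure, so the result is approximate).
whole_run = mbits_per_sec(1.01 * GBYTE, 60.0)

print(f"first interval: {first_interval:.1f} Mbits/sec")
print(f"whole run:      {whole_run:.1f} Mbits/sec")
```

The first interval works out to roughly 138 Mbits/sec, matching the report. The whole-run figure lands slightly above 144 because the 1.01 GBytes total is itself rounded.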

That's interesting. These boxes averaged 144 Mbps. While the tests were running I captured top output. First, the client asa633:

last pid: 840; load averages: 0.24, 0.10, 0.03 up 0+01:03:10 15:48:51
27 processes: 2 running, 25 sleeping
CPU states: 2.7% user, 0.0% nice, 47.1% system, 49.4% interrupt, 0.8% idle
Mem: 8876K Active, 5784K Inact, 17M Wired, 9040K Buf, 273M Free
Swap: 640M Total, 640M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
840 root 2 102 0 2768K 1696K RUN 0:04 0.00% iperf

Now the server, poweredge.

last pid: 716; load averages: 0.36, 0.12, 0.04 up 0+00:53:13 15:49:10
34 processes: 2 running, 32 sleeping
CPU states: 2.6% user, 0.0% nice, 39.0% system, 56.9% interrupt, 1.5% idle
Mem: 31M Active, 8768K Inact, 20M Wired, 12M Buf, 434M Free
Swap: 1024M Total, 1024M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
716 root 3 106 0 2956K 1792K RUN 0:07 0.00% iperf
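
If you want to compare these snapshots without eyeballing them, the CPU states line is easy to pull apart. This is a small Python sketch under the assumption that the line keeps the "NN.N% name" layout shown above; the helper name is hypothetical.

```python
import re


def parse_cpu_states(line):
    """Return {state: percent} from a top 'CPU states:' line."""
    # Each field looks like "49.4% interrupt"; capture the number and the label.
    pairs = re.findall(r"([\d.]+)%\s+(\w+)", line)
    return {name: float(pct) for pct, name in pairs}


client = parse_cpu_states(
    "CPU states: 2.7% user, 0.0% nice, 47.1% system, 49.4% interrupt, 0.8% idle"
)
server = parse_cpu_states(
    "CPU states: 2.6% user, 0.0% nice, 39.0% system, 56.9% interrupt, 1.5% idle"
)

for host, states in (("asa633", client), ("poweredge", server)):
    print(f"{host}: {states['interrupt']}% interrupt, {states['idle']}% idle")
```

Both hosts spend roughly half of their CPU time servicing interrupts and have almost no idle time left.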

Those seem like high interrupt percentages. Before making changes to see if we can improve the situation, let's run Iperf in bidirectional mode. That sends traffic from the client to server and server to client simultaneously.

Here is the client's view.

asa633:/root# iperf -c 172.16.6.2 -d -t 60 -i 5
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.16.6.2, TCP port 5001
TCP window size: 32.5 KByte (default)
------------------------------------------------------------
[ 5] local 172.16.6.1 port 64827 connected with 172.16.6.2 port 5001
[ 4] local 172.16.6.1 port 5001 connected with 172.16.6.2 port 61729
[ 5] 0.0- 5.0 sec 33.8 MBytes 56.8 Mbits/sec
[ 4] 0.0- 5.0 sec 52.1 MBytes 87.5 Mbits/sec
[ 5] 5.0-10.0 sec 41.2 MBytes 69.2 Mbits/sec
[ 4] 5.0-10.0 sec 44.0 MBytes 73.8 Mbits/sec
[ 5] 10.0-15.0 sec 44.2 MBytes 74.2 Mbits/sec
[ 4] 10.0-15.0 sec 43.2 MBytes 72.5 Mbits/sec
[ 5] 15.0-20.0 sec 41.7 MBytes 70.0 Mbits/sec
[ 4] 15.0-20.0 sec 46.0 MBytes 77.1 Mbits/sec
[ 4] 20.0-25.0 sec 44.5 MBytes 74.7 Mbits/sec
[ 5] 20.0-25.0 sec 43.4 MBytes 72.8 Mbits/sec
[ 5] 25.0-30.0 sec 40.7 MBytes 68.3 Mbits/sec
[ 4] 25.0-30.0 sec 47.7 MBytes 80.0 Mbits/sec
[ 5] 30.0-35.0 sec 44.4 MBytes 74.6 Mbits/sec
[ 4] 30.0-35.0 sec 44.5 MBytes 74.7 Mbits/sec
[ 5] 35.0-40.0 sec 40.7 MBytes 68.3 Mbits/sec
[ 4] 35.0-40.0 sec 48.9 MBytes 82.1 Mbits/sec
[ 5] 40.0-45.0 sec 44.3 MBytes 74.3 Mbits/sec
[ 4] 40.0-45.0 sec 45.7 MBytes 76.6 Mbits/sec
[ 4] 45.0-50.0 sec 46.8 MBytes 78.5 Mbits/sec
[ 5] 45.0-50.0 sec 43.4 MBytes 72.8 Mbits/sec
[ 5] 50.0-55.0 sec 42.6 MBytes 71.6 Mbits/sec
[ 4] 50.0-55.0 sec 48.4 MBytes 81.2 Mbits/sec
[ 5] 55.0-60.0 sec 45.3 MBytes 75.9 Mbits/sec
[ 5] 0.0-60.0 sec 506 MBytes 70.7 Mbits/sec
[ 4] 55.0-60.0 sec 46.0 MBytes 77.2 Mbits/sec
[ 4] 0.0-60.0 sec 558 MBytes 78.0 Mbits/sec

Here is the server's view.

poweredge:/root# iperf -s -B 172.16.6.2
------------------------------------------------------------
Server listening on TCP port 5001
Binding to local address 172.16.6.2
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
bind failed: Address already in use
[ 4] local 172.16.6.2 port 5001 connected with 172.16.6.1 port 64827
------------------------------------------------------------
Client connecting to 172.16.6.1, TCP port 5001
Binding to local address 172.16.6.2
TCP window size: 32.5 KByte (default)
------------------------------------------------------------
[ 6] local 172.16.6.2 port 61729 connected with 172.16.6.1 port 5001
[ 6] 0.0-60.0 sec 558 MBytes 78.0 Mbits/sec
[ 4] 0.0-60.0 sec 506 MBytes 70.7 Mbits/sec

Throughput in each direction is about half the previous result, which makes sense because the two hosts are now sending data in both directions at once.
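
To check that reasoning, here is a quick Python sketch that adds the two directions together and compares the sum with the one-way result. The numbers are copied from the reports above; the variable names are mine.

```python
# Final summary rates from the Iperf reports, in Mbits/sec.
one_way = 144.0           # unidirectional TCP test
client_to_server = 70.7   # stream [5] in the bidirectional test
server_to_client = 78.0   # stream [4] in the bidirectional test

aggregate = client_to_server + server_to_client

print(f"aggregate: {aggregate:.1f} Mbits/sec")
print(f"ratio to one-way result: {aggregate / one_way:.2f}")
```

The combined rate is within a few percent of the one-way figure, which suggests the hosts are limited by how much traffic they can process in total rather than by the direction of the flow.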

Here is a snapshot of asa633's top output.

last pid: 868; load averages: 0.34, 0.16, 0.08 up 0+01:09:33 15:55:14
27 processes: 2 running, 25 sleeping
CPU states: 1.2% user, 0.0% nice, 43.0% system, 54.7% interrupt, 1.2% idle
Mem: 8916K Active, 5848K Inact, 18M Wired, 10M Buf, 272M Free
Swap: 640M Total, 640M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
852 root 2 101 0 3112K 1852K RUN 0:10 0.00% iperf

Here is poweredge.

last pid: 739; load averages: 0.49, 0.19, 0.10 up 0+00:59:47 15:55:44
34 processes: 2 running, 32 sleeping
CPU states: 1.9% user, 0.0% nice, 36.3% system, 61.8% interrupt, 0.0% idle
Mem: 31M Active, 8772K Inact, 20M Wired, 12M Buf, 434M Free
Swap: 1024M Total, 1024M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
722 root 3 103 0 3120K 1836K RUN 0:19 0.00% iperf
507 mysql 5 20 0 57280K 26256K kserel 0:07 0.00% mysqld

Again, high interrupts. Let's try a unidirectional UDP test.

asa633:/root# iperf -c 172.16.6.2 -u -t 60 -i 5 -b 500M
------------------------------------------------------------
Client connecting to 172.16.6.2, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 9.00 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.6.1 port 61919 connected with 172.16.6.2 port 5001
[ 3] 0.0- 5.0 sec 131 MBytes 220 Mbits/sec
[ 3] 5.0-10.0 sec 132 MBytes 221 Mbits/sec
[ 3] 10.0-15.0 sec 131 MBytes 221 Mbits/sec
[ 3] 15.0-20.0 sec 131 MBytes 220 Mbits/sec
[ 3] 20.0-25.0 sec 131 MBytes 220 Mbits/sec
[ 3] 25.0-30.0 sec 131 MBytes 220 Mbits/sec
[ 3] 30.0-35.0 sec 131 MBytes 220 Mbits/sec
[ 3] 35.0-40.0 sec 132 MBytes 221 Mbits/sec
[ 3] 40.0-45.0 sec 132 MBytes 221 Mbits/sec
[ 3] 45.0-50.0 sec 132 MBytes 221 Mbits/sec
[ 3] 50.0-55.0 sec 132 MBytes 221 Mbits/sec
[ 3] 0.0-60.0 sec 1.54 GBytes 221 Mbits/sec
[ 3] Sent 1125481 datagrams
[ 3] Server Report:
[ 3] 0.0-60.3 sec 793 MBytes 110 Mbits/sec 15.711 ms 560027/1125479 (50%)
[ 3] 0.0-60.3 sec 1 datagrams received out-of-order

Here is the server's view.

poweredge:/root# iperf -s -u -B 172.16.6.2
------------------------------------------------------------
Server listening on UDP port 5001
Binding to local address 172.16.6.2
Receiving 1470 byte datagrams
UDP buffer size: 41.1 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.6.2 port 5001 connected with 172.16.6.1 port 61919
[ 3] 0.0-60.3 sec 793 MBytes 110 Mbits/sec 15.712 ms 560027/1125479 (50%)
[ 3] 0.0-60.3 sec 1 datagrams received out-of-order
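
The server report packs several numbers into one line, so here is a rough Python sketch that rederives the loss percentage and received throughput from the datagram counts. It assumes every datagram carries 1470 bytes, as the Iperf banner states, and uses the same binary-MByte, decimal-Mbit convention the reports appear to follow.

```python
DATAGRAM_BYTES = 1470   # from the "Sending 1470 byte datagrams" banner
sent = 1125479          # total datagrams in the server report
lost = 560027           # datagrams the server never saw
elapsed = 60.3          # seconds, from the server report

received = sent - lost
loss_pct = 100 * lost / sent
received_bytes = received * DATAGRAM_BYTES
received_mbytes = received_bytes / 2**20
goodput_mbits = received_bytes * 8 / elapsed / 1e6

print(f"received datagrams: {received}")
print(f"loss: {loss_pct:.1f}%")
print(f"received: {received_mbytes:.0f} MBytes at {goodput_mbits:.0f} Mbits/sec")
```

The results line up with the report: about half of the datagrams were lost, and the server received roughly 793 MBytes at about 110 Mbits/sec.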

Check out the interrupt levels. First, the client, which shows the iperf process working hard to generate packets.

last pid: 914; load averages: 0.64, 0.34, 0.18 up 0+01:20:43 16:06:24
27 processes: 2 running, 25 sleeping
CPU states: 5.4% user, 0.0% nice, 75.5% system, 19.1% interrupt, 0.0% idle
Mem: 8956K Active, 6288K Inact, 18M Wired, 10M Buf, 271M Free
Swap: 640M Total, 640M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
914 root 2 20 0 2764K 1752K ksesig 0:16 80.33% iperf

On the server, however, the interrupt level is high. Packets are being lost, as we saw in the server report earlier.

last pid: 767; load averages: 0.79, 0.42, 0.21 up 0+01:10:51 16:06:48
34 processes: 2 running, 32 sleeping
CPU states: 4.1% user, 0.0% nice, 35.2% system, 60.3% interrupt, 0.4% idle
Mem: 31M Active, 8776K Inact, 20M Wired, 12M Buf, 434M Free
Swap: 1024M Total, 1024M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
767 root 3 110 0 2944K 1760K RUN 0:17 0.00% iperf

Let's see if device polling improves any of these numbers.

Using the technique explained here, I create this kernel:

kbld:/root# cat /usr/src/sys/i386/conf/GENERIC.POLLING
include GENERIC
options DEVICE_POLLING

Now I boot asa633 and poweredge using that kernel.

FreeBSD 6.1-RELEASE-p6 #0: Sun Sep 17 17:09:24 EDT 2006
root@kbld.taosecurity.com:/usr/obj/usr/src/sys/GENERIC.POLLING

Enabling polling gives access to a set of new sysctl knobs.

asa633:/root# sysctl -a | grep poll
kern.polling.burst: 5
kern.polling.burst_max: 150
kern.polling.each_burst: 5
kern.polling.idle_poll: 0
kern.polling.user_frac: 50
kern.polling.reg_frac: 20
kern.polling.short_ticks: 0
kern.polling.lost_polls: 0
kern.polling.pending_polls: 0
kern.polling.residual_burst: 0
kern.polling.handlers: 0
kern.polling.enable: 0
kern.polling.phase: 0
kern.polling.suspect: 0
kern.polling.stalled: 0
kern.polling.idlepoll_sleeping: 1
hw.nve_pollinterval: 0
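
If you want to inspect these knobs from a script, the output is simple to parse. Below is a minimal Python sketch that reads a few of the lines shown above into a dictionary. The descriptions in the comments are my reading of polling(4) and should be treated as assumptions; the helper name is hypothetical.

```python
# A subset of the sysctl output shown above.
SYSCTL_OUTPUT = """\
kern.polling.burst: 5
kern.polling.burst_max: 150
kern.polling.each_burst: 5
kern.polling.idle_poll: 0
kern.polling.user_frac: 50
kern.polling.handlers: 0
kern.polling.enable: 0
"""


def parse_sysctl(text):
    """Turn 'name: value' lines into a {name: int} dictionary."""
    knobs = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:  # skip anything that is not a 'name: value' pair
            knobs[name] = int(value)
    return knobs


knobs = parse_sysctl(SYSCTL_OUTPUT)

# Assumption from polling(4): handlers counts devices registered for polling,
# and user_frac is the percentage of CPU time reserved for userland.
print("devices registered for polling:", knobs["kern.polling.handlers"])
print("CPU share reserved for userland:", knobs["kern.polling.user_frac"], "%")
```

At this point no interface has registered for polling, which is what I would expect before touching ifconfig.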

You don't need to change the value of kern.polling.enable. In fact, trying to set it generates an error: "kern.polling.enable is deprecated. Use ifconfig(8)."

Instead, use ifconfig polling.

asa633:/root# ifconfig em0 polling
asa633:/root# ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
options=4b<RXCSUM,TXCSUM,VLAN_MTU,POLLING>
inet6 fe80::204:23ff:feb1:64e2%em0 prefixlen 64 scopeid 0x3
inet 172.16.6.1 netmask 0xffffff00 broadcast 172.16.6.255
ether 00:04:23:b1:64:e2
media: Ethernet autoselect (1000baseSX )
status: active

I enable polling on both boxes' em0 interfaces.

Here are the test results. First, the client.

asa633:/root# iperf -c 172.16.6.2 -t 60 -i 5
------------------------------------------------------------
Client connecting to 172.16.6.2, TCP port 5001
TCP window size: 32.5 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.6.1 port 62829 connected with 172.16.6.2 port 5001
[ 3] 0.0- 5.0 sec 90.5 MBytes 152 Mbits/sec
[ 3] 5.0-10.0 sec 128 MBytes 214 Mbits/sec
[ 3] 10.0-15.0 sec 125 MBytes 209 Mbits/sec
[ 3] 15.0-20.0 sec 105 MBytes 176 Mbits/sec
[ 3] 20.0-25.0 sec 83.7 MBytes 140 Mbits/sec
[ 3] 25.0-30.0 sec 76.7 MBytes 129 Mbits/sec
[ 3] 30.0-35.0 sec 78.1 MBytes 131 Mbits/sec
[ 3] 35.0-40.0 sec 121 MBytes 203 Mbits/sec
[ 3] 40.0-45.0 sec 126 MBytes 212 Mbits/sec
[ 3] 45.0-50.0 sec 115 MBytes 192 Mbits/sec
[ 3] 50.0-55.0 sec 91.9 MBytes 154 Mbits/sec
[ 3] 55.0-60.0 sec 77.9 MBytes 131 Mbits/sec
[ 3] 0.0-60.0 sec 1.19 GBytes 170 Mbits/sec

The server:

poweredge:/root# iperf -s -B 172.16.6.2
------------------------------------------------------------
Server listening on TCP port 5001
Binding to local address 172.16.6.2
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[ 4] local 172.16.6.2 port 5001 connected with 172.16.6.1 port 62829
[ 4] 0.0-60.0 sec 1.19 GBytes 170 Mbits/sec

Compare that to the previous test result without device polling.

[ 4] 0.0-60.0 sec 1.01 GBytes 144 Mbits/sec

We get better throughput here, but not the amazing improvement we'll see with UDP (later).
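
The averages hide how differently the two runs behaved from interval to interval. Here is a short Python sketch that summarizes the per-interval client rates from both TCP tests. The lists are copied from the client reports; the helper name is my own.

```python
from statistics import mean

# Per-interval client rates in Mbits/sec, copied from the two TCP reports.
without_polling = [138, 140, 140, 140, 140, 141, 143, 144, 146, 149, 152, 154]
with_polling = [152, 214, 209, 176, 140, 129, 131, 203, 212, 192, 154, 131]


def summarize(rates):
    """Return (minimum, mean, maximum) for a list of interval rates."""
    return min(rates), mean(rates), max(rates)


for label, rates in (("interrupts", without_polling), ("polling", with_polling)):
    low, avg, high = summarize(rates)
    print(f"{label:>10}: min {low}, mean {avg:.0f}, max {high}")

gain_pct = 100 * (mean(with_polling) - mean(without_polling)) / mean(without_polling)
print(f"mean improvement: {gain_pct:.0f}%")
```

With polling the mean rises by roughly 18 percent, but the rate swings between 129 and 214 Mbits/sec, whereas the interrupt-driven run stayed between 138 and 154.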

The interrupt percentages are much better. Here's the client.

last pid: 693; load averages: 0.22, 0.07, 0.05 up 0+00:12:48 16:23:54
27 processes: 2 running, 25 sleeping
CPU states: 7.4% user, 0.0% nice, 59.1% system, 32.7% interrupt, 0.8% idle
Mem: 8928K Active, 5680K Inact, 17M Wired, 8928K Buf, 273M Free
Swap: 640M Total, 640M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
693 root 2 105 0 2768K 1684K RUN 0:08 0.00% iperf

Check out the server!

last pid: 633; load averages: 0.40, 0.16, 0.10 up 0+00:12:33 16:24:20
34 processes: 2 running, 32 sleeping
CPU states: 1.1% user, 0.0% nice, 27.0% system, 0.4% interrupt, 71.5% idle
Mem: 31M Active, 9168K Inact, 21M Wired, 13M Buf, 433M Free
Swap: 1024M Total, 1024M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
632 root 3 103 0 2956K 1824K RUN 0:12 0.00% iperf

That's amazing.

Let's try a dual test. The client:

asa633:/root# iperf -c 172.16.6.2 -d -t 60 -i 5
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 172.16.6.2, TCP port 5001
TCP window size: 32.5 KByte (default)
------------------------------------------------------------
[ 5] local 172.16.6.1 port 58192 connected with 172.16.6.2 port 5001
[ 4] local 172.16.6.1 port 5001 connected with 172.16.6.2 port 57738
[ 5] 0.0- 5.0 sec 54.8 MBytes 91.9 Mbits/sec
[ 4] 0.0- 5.0 sec 75.3 MBytes 126 Mbits/sec
[ 5] 5.0-10.0 sec 56.5 MBytes 94.8 Mbits/sec
[ 4] 5.0-10.0 sec 75.3 MBytes 126 Mbits/sec
[ 5] 10.0-15.0 sec 55.7 MBytes 93.4 Mbits/sec
[ 4] 10.0-15.0 sec 76.3 MBytes 128 Mbits/sec
[ 5] 15.0-20.0 sec 48.0 MBytes 80.5 Mbits/sec
[ 4] 15.0-20.0 sec 85.4 MBytes 143 Mbits/sec
[ 5] 20.0-25.0 sec 43.7 MBytes 73.3 Mbits/sec
[ 4] 20.0-25.0 sec 89.1 MBytes 150 Mbits/sec
[ 5] 25.0-30.0 sec 46.3 MBytes 77.7 Mbits/sec
[ 4] 25.0-30.0 sec 83.6 MBytes 140 Mbits/sec
[ 5] 30.0-35.0 sec 50.7 MBytes 85.1 Mbits/sec
[ 4] 30.0-35.0 sec 80.8 MBytes 136 Mbits/sec
[ 5] 35.0-40.0 sec 56.1 MBytes 94.2 Mbits/sec
[ 4] 35.0-40.0 sec 75.1 MBytes 126 Mbits/sec
[ 5] 40.0-45.0 sec 56.1 MBytes 94.2 Mbits/sec
[ 4] 40.0-45.0 sec 76.4 MBytes 128 Mbits/sec
[ 5] 45.0-50.0 sec 48.9 MBytes 82.0 Mbits/sec
[ 4] 45.0-50.0 sec 84.4 MBytes 142 Mbits/sec
[ 5] 50.0-55.0 sec 43.9 MBytes 73.6 Mbits/sec
[ 4] 50.0-55.0 sec 91.0 MBytes 153 Mbits/sec
[ 4] 0.0-60.0 sec 979 MBytes 137 Mbits/sec
[ 5] 55.0-60.0 sec 44.6 MBytes 74.8 Mbits/sec
[ 5] 0.0-60.0 sec 605 MBytes 84.6 Mbits/sec

The server:

poweredge:/root# iperf -s -B 172.16.6.2
------------------------------------------------------------
Server listening on TCP port 5001
Binding to local address 172.16.6.2
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
bind failed: Address already in use
[ 4] local 172.16.6.2 port 5001 connected with 172.16.6.1 port 58192
------------------------------------------------------------
Client connecting to 172.16.6.1, TCP port 5001
Binding to local address 172.16.6.2
TCP window size: 32.5 KByte (default)
------------------------------------------------------------
[ 6] local 172.16.6.2 port 57738 connected with 172.16.6.1 port 5001
[ 6] 0.0-60.0 sec 979 MBytes 137 Mbits/sec
[ 4] 0.0-60.0 sec 605 MBytes 84.6 Mbits/sec

Compare those results with their non-device polling counterparts.

[ 6] 0.0-60.0 sec 558 MBytes 78.0 Mbits/sec
[ 4] 0.0-60.0 sec 506 MBytes 70.7 Mbits/sec

Here's the client top excerpt:

last pid: 697; load averages: 0.21, 0.15, 0.09 up 0+00:16:02 16:27:08
27 processes: 2 running, 25 sleeping
CPU states: 5.1% user, 0.0% nice, 54.1% system, 40.9% interrupt, 0.0% idle
Mem: 9012K Active, 5688K Inact, 17M Wired, 8928K Buf, 272M Free
Swap: 640M Total, 640M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
697 root 2 108 0 3112K 1852K RUN 0:08 0.00% iperf

Server top excerpt:

last pid: 637; load averages: 0.43, 0.21, 0.12 up 0+00:15:44 16:27:31
34 processes: 2 running, 32 sleeping
CPU states: 5.2% user, 0.0% nice, 56.3% system, 6.7% interrupt, 31.7% idle
Mem: 31M Active, 9168K Inact, 21M Wired, 13M Buf, 433M Free
Swap: 1024M Total, 1024M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
637 root 3 105 0 3120K 1868K RUN 0:15 0.00% iperf



Let's try the UDP tests again with device polling enabled. Here's the client side.

asa633:/root# iperf -c 172.16.6.2 -u -t 60 -i 5 -b 500M
------------------------------------------------------------
Client connecting to 172.16.6.2, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 9.00 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.6.1 port 54045 connected with 172.16.6.2 port 5001
[ 3] 0.0- 5.0 sec 172 MBytes 289 Mbits/sec
[ 3] 5.0-10.0 sec 173 MBytes 290 Mbits/sec
[ 3] 10.0-15.0 sec 173 MBytes 290 Mbits/sec
[ 3] 15.0-20.0 sec 173 MBytes 290 Mbits/sec
[ 3] 20.0-25.0 sec 173 MBytes 290 Mbits/sec
[ 3] 25.0-30.0 sec 174 MBytes 291 Mbits/sec
[ 3] 30.0-35.0 sec 173 MBytes 291 Mbits/sec
[ 3] 35.0-40.0 sec 173 MBytes 291 Mbits/sec
[ 3] 40.0-45.0 sec 173 MBytes 291 Mbits/sec
[ 3] 45.0-50.0 sec 173 MBytes 291 Mbits/sec
[ 3] 50.0-55.0 sec 170 MBytes 284 Mbits/sec
[ 3] 0.0-60.0 sec 2.02 GBytes 290 Mbits/sec
[ 3] Sent 1478220 datagrams
[ 3] Server Report:
[ 3] 0.0-60.0 sec 1.94 GBytes 277 Mbits/sec 0.056 ms 62312/1478219 (4.2%)
[ 3] 0.0-60.0 sec 1 datagrams received out-of-order

Here's the server side.

poweredge:/root# iperf -s -u -B 172.16.6.2
------------------------------------------------------------
Server listening on UDP port 5001
Binding to local address 172.16.6.2
Receiving 1470 byte datagrams
UDP buffer size: 41.1 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.6.2 port 5001 connected with 172.16.6.1 port 54045
[ 3] 0.0-60.0 sec 1.94 GBytes 277 Mbits/sec 0.056 ms 62312/1478219 (4.2%)
[ 3] 0.0-60.0 sec 1 datagrams received out-of-order

Compare that to the results from the test without device polling.

[ 3] 0.0-60.3 sec 793 MBytes 110 Mbits/sec 15.712 ms 560027/1125479 (50%)

Because so few packets were dropped, throughput was much higher for UDP.
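
Another way to look at the same data is in packets per second, since per-packet overhead is what the receiver is struggling with. This Python sketch derives the delivered datagram rate for both UDP runs from the server reports; the dictionary layout and function name are mine.

```python
# Datagram counts and durations from the two UDP server reports.
runs = {
    "interrupts": {"sent": 1125479, "lost": 560027, "seconds": 60.3},
    "polling": {"sent": 1478219, "lost": 62312, "seconds": 60.0},
}


def delivered_pps(run):
    """Datagrams per second that actually reached the Iperf server."""
    return (run["sent"] - run["lost"]) / run["seconds"]


for label, run in runs.items():
    loss_pct = 100 * run["lost"] / run["sent"]
    print(f"{label:>10}: {delivered_pps(run):.0f} datagrams/sec, {loss_pct:.1f}% lost")

speedup = delivered_pps(runs["polling"]) / delivered_pps(runs["interrupts"])
print(f"delivered rate improved by a factor of {speedup:.1f}")
```

By this measure poweredge went from accepting roughly 9,400 datagrams per second to roughly 23,600, about two and a half times as many.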

Here's the client top excerpt:

last pid: 705; load averages: 1.16, 0.59, 0.29 up 0+00:21:27 16:32:33
27 processes: 2 running, 25 sleeping
CPU states: 9.0% user, 0.0% nice, 80.5% system, 10.5% interrupt, 0.0% idle
Mem: 8952K Active, 5684K Inact, 17M Wired, 8928K Buf, 273M Free
Swap: 640M Total, 640M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
705 root 2 20 0 2764K 1752K ksesig 0:24 89.08% iperf

Server top excerpt:

last pid: 650; load averages: 0.48, 0.25, 0.16 up 0+00:21:01 16:32:48
34 processes: 2 running, 32 sleeping
CPU states: 7.9% user, 0.0% nice, 88.8% system, 0.4% interrupt, 3.0% idle
Mem: 31M Active, 9168K Inact, 21M Wired, 13M Buf, 433M Free
Swap: 1024M Total, 1024M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
650 root 3 120 0 2944K 1792K RUN 0:25 0.00% iperf

The incredibly low interrupt load explains why far fewer packets were dropped.

The only downside to device polling may be that your NIC might not support it. Check man 4 polling. This is one of the reasons I like to use Intel NICs -- they are bound to be supported and they perform well.
