Saturday, September 23, 2006

Throughput Testing Through a Bridge

In my earlier posts I've discussed throughput testing. Now I'm going to introduce an inline system as a bridge. You could imagine that this system might be a firewall, or run Snort in inline mode. For the purposes of this post, however, we're just going to see what effect the bridge has on throughput between a client and server.

This is the new system. It's called cel600, and it's running the same GENERIC.POLLING kernel mentioned earlier.

FreeBSD 6.1-RELEASE-p6 #0: Sun Sep 17 17:09:24 EDT 2006
root@kbld.taosecurity.com:/usr/obj/usr/src/sys/GENERIC.POLLING
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel Celeron (598.19-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x686 Stepping = 6
Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory = 401260544 (382 MB)
avail memory = 383201280 (365 MB)

This system has two dual NICs in it. em0 and em1 are Gigabit fiber, and em2 and em3 are Gigabit copper.

cel600:/root# ifconfig em0
em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
options=48<VLAN_MTU,POLLING>
inet6 fe80::204:23ff:feb1:7f22%em0 prefixlen 64 scopeid 0x1
ether 00:04:23:b1:7f:22
media: Ethernet autoselect (1000baseSX <full-duplex>)
status: active
cel600:/root# ifconfig em1
em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
options=48<VLAN_MTU,POLLING>
inet6 fe80::204:23ff:feb1:7f23%em1 prefixlen 64 scopeid 0x2
ether 00:04:23:b1:7f:23
media: Ethernet autoselect (1000baseSX <full-duplex>)
status: active
cel600:/root# ifconfig em2
em2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
options=48<VLAN_MTU,POLLING>
inet6 fe80::204:23ff:fec5:4e80%em2 prefixlen 64 scopeid 0x3
ether 00:04:23:c5:4e:80
media: Ethernet autoselect (1000baseTX <full-duplex>)
status: active
cel600:/root# ifconfig em3
em3: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
options=48<VLAN_MTU,POLLING>
inet6 fe80::204:23ff:fec5:4e81%em3 prefixlen 64 scopeid 0x4
ether 00:04:23:c5:4e:81
media: Ethernet autoselect (1000baseTX <full-duplex>)
status: active

I configure them in /etc/rc.conf this way:

ifconfig_em0="polling up"
ifconfig_em1="polling up"
ifconfig_em2="polling up"
ifconfig_em3="polling up"
cloned_interfaces="bridge0 bridge1"
ifconfig_bridge0="addm em0 addm em1 monitor up"
ifconfig_bridge1="addm em2 addm em3 monitor up"

The end result is two bridge interfaces.

bridge0: flags=48043<UP,BROADCAST,RUNNING,MULTICAST,MONITOR> mtu 1500
ether ac:de:48:e5:e7:69
priority 32768 hellotime 2 fwddelay 15 maxage 20
member: em1 flags=3<LEARNING,DISCOVER>
member: em0 flags=3<LEARNING,DISCOVER>
cel600:/root# ifconfig bridge1
bridge1: flags=48043<UP,BROADCAST,RUNNING,MULTICAST,MONITOR> mtu 1500
ether ac:de:48:0c:26:66
priority 32768 hellotime 2 fwddelay 15 maxage 20
member: em3 flags=3<LEARNING,DISCOVER>
member: em2 flags=3<LEARNING,DISCOVER>

Notice these two pseudo-interfaces are both in MONITOR mode. That was set automatically.

With the bridge in place, I can conduct throughput tests.

Here is the client's view.

asa633:/root# iperf -c 172.16.6.2 -t 60 -i 5
------------------------------------------------------------
Client connecting to 172.16.6.2, TCP port 5001
TCP window size: 32.5 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.6.1 port 57355 connected with 172.16.6.2 port 5001
[ 3] 0.0- 5.0 sec 55.9 MBytes 93.9 Mbits/sec
[ 3] 5.0-10.0 sec 51.6 MBytes 86.6 Mbits/sec
[ 3] 10.0-15.0 sec 72.3 MBytes 121 Mbits/sec
[ 3] 15.0-20.0 sec 54.6 MBytes 91.6 Mbits/sec
[ 3] 20.0-25.0 sec 61.4 MBytes 103 Mbits/sec
[ 3] 25.0-30.0 sec 75.4 MBytes 127 Mbits/sec
[ 3] 30.0-35.0 sec 60.2 MBytes 101 Mbits/sec
[ 3] 35.0-40.0 sec 47.8 MBytes 80.2 Mbits/sec
[ 3] 40.0-45.0 sec 74.7 MBytes 125 Mbits/sec
[ 3] 45.0-50.0 sec 59.0 MBytes 99.0 Mbits/sec
[ 3] 50.0-55.0 sec 54.0 MBytes 90.6 Mbits/sec
[ 3] 55.0-60.0 sec 76.8 MBytes 129 Mbits/sec
[ 3] 0.0-60.0 sec 744 MBytes 104 Mbits/sec

Here is the server's view.

poweredge:/root# iperf -s -B 172.16.6.2
------------------------------------------------------------
Server listening on TCP port 5001
Binding to local address 172.16.6.2
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[ 4] local 172.16.6.2 port 5001 connected with 172.16.6.1 port 57355
[ 4] 0.0-60.0 sec 744 MBytes 104 Mbits/sec

Compare that with the straight-through result, repeated here. The bridge cut TCP throughput over fiber from 170 Mbits/sec to 104 Mbits/sec.

[ 4] 0.0-60.0 sec 1.19 GBytes 170 Mbits/sec

Of interest during the test is the interrupt load on the bridge, shown here in a top excerpt.

last pid: 728; load averages: 0.00, 0.09, 0.06 up 0+00:06:36 17:58:40
22 processes: 1 running, 21 sleeping
CPU states: 0.4% user, 0.0% nice, 0.4% system, 17.1% interrupt, 82.1% idle
Mem: 7572K Active, 4776K Inact, 16M Wired, 8912K Buf, 339M Free
Swap: 768M Total, 768M Free

Let's try the UDP test. First, the client view.

asa633:/root# iperf -c 172.16.6.2 -u -t 60 -i 5 -b 500M
------------------------------------------------------------
Client connecting to 172.16.6.2, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 9.00 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.6.1 port 51356 connected with 172.16.6.2 port 5001
[ 3] 0.0- 5.0 sec 169 MBytes 284 Mbits/sec
[ 3] 5.0-10.0 sec 169 MBytes 284 Mbits/sec
[ 3] 10.0-15.0 sec 171 MBytes 287 Mbits/sec
[ 3] 15.0-20.0 sec 171 MBytes 287 Mbits/sec
[ 3] 20.0-25.0 sec 171 MBytes 287 Mbits/sec
[ 3] 25.0-30.0 sec 171 MBytes 287 Mbits/sec
[ 3] 30.0-35.0 sec 171 MBytes 287 Mbits/sec
[ 3] 35.0-40.0 sec 172 MBytes 288 Mbits/sec
[ 3] 40.0-45.0 sec 172 MBytes 288 Mbits/sec
[ 3] 45.0-50.0 sec 172 MBytes 288 Mbits/sec
[ 3] 50.0-55.0 sec 172 MBytes 288 Mbits/sec
[ 3] 0.0-60.0 sec 2.00 GBytes 287 Mbits/sec
[ 3] Sent 1463703 datagrams
[ 3] Server Report:
[ 3] 0.0-60.0 sec 1.93 GBytes 276 Mbits/sec 0.014 ms 53386/1463702 (3.6%)
[ 3] 0.0-60.0 sec 1 datagrams received out-of-order

Now the server view.

poweredge:/root# iperf -s -u -B 172.16.6.2
------------------------------------------------------------
Server listening on UDP port 5001
Binding to local address 172.16.6.2
Receiving 1470 byte datagrams
UDP buffer size: 41.1 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.6.2 port 5001 connected with 172.16.6.1 port 51356
[ 3] 0.0-60.0 sec 1.93 GBytes 276 Mbits/sec 0.014 ms 53386/1463702 (3.6%)
[ 3] 0.0-60.0 sec 1 datagrams received out-of-order
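
The client and server figures above can be cross-checked with a little arithmetic. The sketch below, in Python, recomputes the offered rate from the datagram count and the loss percentage from the server report. The function names are mine, and it assumes iperf counts only the 1470-byte UDP payload and reports decimal megabits (10^6 bits), which the numbers above appear to bear out.

```python
# Cross-check the fiber UDP numbers reported by iperf above.
# Assumption: rates count only the 1470-byte payload, in units of 10**6 bits.

def offered_mbps(datagrams: int, payload_bytes: int, seconds: float) -> float:
    """Rate the client pushed toward the server, in Mbits/sec."""
    return datagrams * payload_bytes * 8 / seconds / 1e6

def loss_percent(lost: int, total: int) -> float:
    """Share of datagrams the server never saw."""
    return 100.0 * lost / total

sent = offered_mbps(1463703, 1470, 60.0)   # client: "Sent 1463703 datagrams"
lost = loss_percent(53386, 1463702)        # server: 53386/1463702
delivered = sent * (1 - lost / 100.0)

print(f"offered   {sent:.1f} Mbits/sec")
print(f"lost      {lost:.1f} %")
print(f"delivered {delivered:.1f} Mbits/sec")
```

This works out to roughly 287 Mbits/sec offered and 276 Mbits/sec delivered, in line with what the client and server each reported.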

Here's the result from the straight-through test.

[ 3] 0.0-60.0 sec 1.94 GBytes 277 Mbits/sec 0.056 ms 62312/1478219 (4.2%)

The results are almost identical: 276 Mbits/sec through the bridge versus 277 Mbits/sec straight through.

Here is the bridge's interrupt load as shown in a top excerpt.

last pid: 751; load averages: 0.00, 0.03, 0.04 up 0+00:10:20 18:02:24
22 processes: 1 running, 21 sleeping
CPU states: 0.0% user, 0.0% nice, 0.4% system, 19.8% interrupt, 79.8% idle
Mem: 7564K Active, 4788K Inact, 16M Wired, 8928K Buf, 339M Free
Swap: 768M Total, 768M Free

With the Gigabit fiber tests done, let's look at Gigabit copper.

First, a TCP test as seen by the client.

asa633:/root# iperf -c 172.16.7.2 -t 60 -i 5
------------------------------------------------------------
Client connecting to 172.16.7.2, TCP port 5001
TCP window size: 32.5 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.7.1 port 58824 connected with 172.16.7.2 port 5001
[ 3] 0.0- 5.0 sec 76.3 MBytes 128 Mbits/sec
[ 3] 5.0-10.0 sec 76.3 MBytes 128 Mbits/sec
[ 3] 10.0-15.0 sec 76.8 MBytes 129 Mbits/sec
[ 3] 15.0-20.0 sec 76.6 MBytes 129 Mbits/sec
[ 3] 20.0-25.0 sec 76.8 MBytes 129 Mbits/sec
[ 3] 25.0-30.0 sec 75.4 MBytes 127 Mbits/sec
[ 3] 30.0-35.0 sec 76.3 MBytes 128 Mbits/sec
[ 3] 35.0-40.0 sec 76.1 MBytes 128 Mbits/sec
[ 3] 40.0-45.0 sec 76.5 MBytes 128 Mbits/sec
[ 3] 45.0-50.0 sec 75.4 MBytes 126 Mbits/sec
[ 3] 50.0-55.0 sec 76.7 MBytes 129 Mbits/sec
[ 3] 55.0-60.0 sec 76.4 MBytes 128 Mbits/sec
[ 3] 0.0-60.0 sec 916 MBytes 128 Mbits/sec

Here is the server's view.

poweredge:/root# iperf -s -B 172.16.7.2
------------------------------------------------------------
Server listening on TCP port 5001
Binding to local address 172.16.7.2
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[ 4] local 172.16.7.2 port 5001 connected with 172.16.7.1 port 58824
[ 4] 0.0-60.0 sec 916 MBytes 128 Mbits/sec

That is better than the result for fiber from above.

[ 4] 0.0-60.0 sec 744 MBytes 104 Mbits/sec

It's not as good as the result for straight-through copper.

[ 4] 0.0-60.0 sec 1.16 GBytes 166 Mbits/sec

The bridge's interrupt load looked slightly lower than during the fiber TCP test (16.7% versus 17.1%).

last pid: 754; load averages: 0.00, 0.01, 0.02 up 0+00:13:48 18:05:52
22 processes: 1 running, 21 sleeping
CPU states: 0.0% user, 0.0% nice, 0.4% system, 16.7% interrupt, 82.9% idle
Mem: 7560K Active, 4792K Inact, 16M Wired, 8928K Buf, 339M Free
Swap: 768M Total, 768M Free

Finally, UDP copper tests. Here is the client view.

asa633:/root# iperf -c 172.16.7.2 -u -t 60 -i 5 -b 500M
------------------------------------------------------------
Client connecting to 172.16.7.2, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 9.00 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.7.1 port 62131 connected with 172.16.7.2 port 5001
[ 3] 0.0- 5.0 sec 129 MBytes 217 Mbits/sec
[ 3] 5.0-10.0 sec 129 MBytes 217 Mbits/sec
[ 3] 10.0-15.0 sec 129 MBytes 217 Mbits/sec
[ 3] 15.0-20.0 sec 129 MBytes 217 Mbits/sec
[ 3] 20.0-25.0 sec 129 MBytes 217 Mbits/sec
[ 3] 25.0-30.0 sec 129 MBytes 216 Mbits/sec
[ 3] 30.0-35.0 sec 129 MBytes 216 Mbits/sec
[ 3] 35.0-40.0 sec 129 MBytes 216 Mbits/sec
[ 3] 40.0-45.0 sec 129 MBytes 216 Mbits/sec
[ 3] 45.0-50.0 sec 129 MBytes 216 Mbits/sec
[ 3] 50.0-55.0 sec 129 MBytes 216 Mbits/sec
[ 3] 0.0-60.0 sec 1.51 GBytes 216 Mbits/sec
[ 3] Sent 1103828 datagrams
[ 3] Server Report:
[ 3] 0.0-60.0 sec 1.46 GBytes 209 Mbits/sec 0.047 ms 35057/1103827 (3.2%)
[ 3] 0.0-60.0 sec 1 datagrams received out-of-order

Here is the server view.

poweredge:/root# iperf -s -u -B 172.16.7.2
------------------------------------------------------------
Server listening on UDP port 5001
Binding to local address 172.16.7.2
Receiving 1470 byte datagrams
UDP buffer size: 41.1 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.7.2 port 5001 connected with 172.16.7.1 port 62131
[ 3] 0.0-60.0 sec 1.46 GBytes 209 Mbits/sec 0.047 ms 35057/1103827 (3.2%)
[ 3] 0.0-60.0 sec 1 datagrams received out-of-order

Let's compare that to the fiber UDP test from above.

[ 3] 0.0-60.0 sec 1.93 GBytes 276 Mbits/sec 0.014 ms 53386/1463702 (3.6%)

This time the result is much worse than UDP over fiber: 209 Mbits/sec versus 276 Mbits/sec.

When I tested UDP over crossover copper, this was the result.

[ 3] 0.0-60.0 sec 1.86 GBytes 267 Mbits/sec 0.024 ms 40962/1401730 (2.9%)

The top excerpt is about the same as the fiber UDP test.

last pid: 754; load averages: 0.01, 0.01, 0.01 up 0+00:16:21 18:08:25
22 processes: 1 running, 21 sleeping
CPU states: 0.0% user, 0.0% nice, 0.0% system, 17.1% interrupt, 82.9% idle
Mem: 7564K Active, 4788K Inact, 16M Wired, 8928K Buf, 339M Free
Swap: 768M Total, 768M Free

It's not really feasible to draw any solid conclusions from these tests. They're basically good for getting a ballpark feel for the capabilities of a given architecture, but you need to repeat them multiple times to get some confidence in the results.

If you want built-in repeatability and confidence testing, try Netperf.
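
Short of repeating the runs, the per-interval readings already printed above give a rough feel for how steady each TCP test was. Here is a minimal Python sketch, with the caveat that twelve 5-second intervals inside one run are not independent repeated runs, so treat the output as a hint rather than a confidence measure. The `summarize` helper is my own name, not part of iperf or Netperf.

```python
import statistics

# Per-interval TCP throughput (Mbits/sec) copied from the two bridged
# client runs shown earlier in this post.
fiber = [93.9, 86.6, 121, 91.6, 103, 127, 101, 80.2, 125, 99.0, 90.6, 129]
copper = [128, 128, 129, 129, 129, 127, 128, 128, 128, 126, 129, 128]

def summarize(samples):
    """Return (mean, sample standard deviation) for a list of readings."""
    return statistics.mean(samples), statistics.stdev(samples)

for name, samples in (("fiber", fiber), ("copper", copper)):
    mean, sd = summarize(samples)
    print(f"{name:6s} mean {mean:6.1f}  stdev {sd:5.1f}  spread {100 * sd / mean:4.1f}%")
```

On these numbers the fiber TCP run wanders by roughly 17 Mbits/sec around its mean while the copper run stays within about 1 Mbits/sec, which is one more reason to rerun the fiber test before trusting it.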

With these results, however, I have some idea of what I can expect from this particular hardware setup, namely a bridge between a client sending data to a server.

  • TCP over fiber: about 104 Mbps

  • UDP over fiber: about 276 Mbps

  • TCP over copper: about 128 Mbps

  • UDP over copper: about 209 Mbps


Rounding down, and acting conservatively, I would feel this setup could handle somewhere around 100 Mbps (aggregated) over fiber and around 125 Mbps over copper. Note this says nothing about any software running on the bridge and its ability to do whatever function it is designed to perform. This is just a throughput estimate.
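
To see how much of each figure is the bridge's doing, the bridged results can be set beside the straight-through results quoted earlier in this post. The Python sketch below does only that subtraction. The values are the rounded Mbits/sec from the iperf server reports, and `percent_drop` is a helper name of my own.

```python
# (straight-through, bridged) throughput in Mbits/sec, taken from the iperf
# server reports quoted in this post.
results = {
    "TCP over fiber":  (170, 104),
    "UDP over fiber":  (277, 276),
    "TCP over copper": (166, 128),
    "UDP over copper": (267, 209),
}

def percent_drop(direct: float, bridged: float) -> float:
    """Throughput lost by adding the bridge, as a percentage of the direct rate."""
    return 100.0 * (direct - bridged) / direct

for test, (direct, bridged) in results.items():
    drop = percent_drop(direct, bridged)
    print(f"{test:16s} {direct:4d} -> {bridged:4d}  ({drop:4.1f}% lower)")
```

By this measure the bridge costs TCP over fiber the most (close to 39%), costs both copper tests a bit over 20%, and makes almost no difference to UDP over fiber. As with everything else here, these are single runs.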

In my next related posts I'll introduce bypass switches and see how they influence this process.

I'll also rework the configuration into straight-through, bridged, and switched modes to test latency using ping.

3 comments:

Shirkdog said...

I tried to send traffic as fast as possible over straight-through GIG copper from an OpenBSD box to a Linux box; the fastest I could send was 350Mb.

Just like you have mentioned, your hardware is the main bottleneck, and this was a horribly configured install on a Dual Xeon box. I am going to try to set up a FreeBSD box and see if I can do the configuration (as was told to me), with a ramdrive and get gig speed :-)

Carlos said...

Hi Richard. I'm reading your book "The TAO of Network Security Monitoring". It's an excellent book, and it opened my mind about monitoring. Now I'm in chapter 9; I'll give you my comments when I finish the whole book.
I have 2 questions about this blog...
First, I use FreeBSD firewalls with IPFW (not in bridge). Do you recommend the use of "polling" on all the interfaces (inside, outside, DMZ)?
Second, is it a good idea to add another NIC to this firewall and use it to capture all data like you say in your book? I also run squid to cache content and some rules.

REGARDS

Richard Bejtlich said...

Hi Carlos,

I recommend using polling if you need the extra performance it provides. Otherwise there's no need.

I don't recommend capturing data on a firewall. Use a separate sensor for monitoring duties. In an emergency, you might collect on the firewall.