Troubleshooting NSM Virtualization Problems with Linux and VirtualBox

I spent a chunk of the day troubleshooting a network security monitoring (NSM) problem. I thought I would share the problem and my investigation in the hopes that it might help others. The specifics are probably less important than the general approach.

It began with ja3. You may know ja3 as a set of Zeek scripts developed by the Salesforce engineering team to profile client and server TLS parameters.

I was reviewing Zeek logs captured by my Corelight appliance and by one of my lab sensors running Security Onion. I had coverage of the same endpoint in both sensors.

I noticed that the SO Zeek logs did not have ja3 hashes in the ssl.log entries. Both sensors did have ja3s hashes. My first thought was that SO was somehow misconfigured to not record ja3 hashes. I quickly dismissed that, because it made no sense. Besides, verifying that intuition would have required me to start troubleshooting near the top of the software stack.
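One quick way to quantify that imbalance is to count how many ssl.log records carry each hash. The sketch below is only an illustration: it assumes Zeek's tab-separated log format with a `#fields` header line and the `ja3` and `ja3s` column names, and the function name `count_ja3_fields` is made up. It will not work on JSON-formatted logs.

```shell
# Count ssl.log records that carry a ja3 (client) or ja3s (server) hash.
# Assumes Zeek TSV output: a "#fields" header names the columns and
# unset values are written as "-".
count_ja3_fields() {
  awk -F '\t' '
    $1 == "#fields" {
      # Header column i describes data column i-1.
      for (i = 2; i <= NF; i++) col[$i] = i - 1
      next
    }
    /^#/ { next }   # skip the other metadata lines
    {
      total++
      if (("ja3" in col) && $(col["ja3"]) != "-" && $(col["ja3"]) != "") ja3++
      if (("ja3s" in col) && $(col["ja3s"]) != "-" && $(col["ja3s"]) != "") ja3s++
    }
    END { printf "records=%d ja3=%d ja3s=%d\n", total, ja3, ja3s }
  '
}

# Example (the file name is an assumption about where the log lives):
# count_ja3_fields < ssl.log
```

A sensor that sees only the server side of each session would be expected to report a ja3s count close to the record count and a ja3 count near zero.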

I decided to start at the bottom, or close to the bottom. I had a sinking suspicion that, for some reason, Zeek was only seeing traffic sent from remote systems, and not traffic originating from my network. That would account for the creation of ja3s hashes, for traffic sent by remote systems, but not ja3 hashes, as Zeek was not seeing traffic sent by local clients.

I was running SO in VirtualBox 6.0.4 on Ubuntu 18.04. I started sniffing TCP network traffic on the SO monitoring interface using Tcpdump. As I feared, it didn't look right. I ran a new capture with filters for ICMP and a remote IP address. On another system I tried pinging the remote IP address. Sure enough, I only saw ICMP echo replies, and no ICMP echoes. Oddly, I also saw doubles and triples of some of the ICMP echo replies. That worried me, because unpredictable behavior like that could indicate some sort of software problem.
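The one-sided capture described above can be checked mechanically rather than by eye. This is a minimal sketch, assuming tcpdump's usual text output in which each line contains either "ICMP echo request" or "ICMP echo reply"; the helper name `count_icmp_directions` is hypothetical.

```shell
# Summarize tcpdump text output (read from stdin) by ICMP direction.
count_icmp_directions() {
  awk '
    /ICMP echo request/ { req++ }
    /ICMP echo reply/   { rep++ }
    END { printf "requests=%d replies=%d\n", req, rep }
  '
}

# Example (needs root and a live monitoring interface; enp0s8 and 1.1.1.1
# are the values used elsewhere in this post):
# sudo tcpdump -n -c 20 -i enp0s8 icmp and host 1.1.1.1 | count_icmp_directions
```

A healthy sensor should report roughly equal counts while a ping is running. A result with zero requests and a nonzero reply count matches the symptom I saw.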

My next step was to "get under" the VM guest and determine if the VM host could see traffic properly. I ran Tcpdump on the Ubuntu 18.04 host on the monitoring interface and repeated my ICMP tests. It saw everything properly. That meant I did not need to bother checking the switch span port that was feeding traffic to the VirtualBox system.

It seemed I had a problem somewhere between the VM host and guest. On the same VM host I was also running an instance of RockNSM. I ran my ICMP tests on the RockNSM VM and, sadly, I got the same one-sided traffic as seen on SO.

Now I was worried. If the problem had only been present in SO, then I could fix SO. If the problem was present in both SO and RockNSM, then the problem had to be with VirtualBox -- and I might not be able to fix it.

I reviewed my configurations in VirtualBox, ensuring that the "Promiscuous Mode" under the Advanced options was set to "Allow All". At this point I worried that there was a bug in VirtualBox. I did some Google searches and reviewed some forum posts, but I did not see anyone reporting issues with sniffing traffic inside VMs. Still, my use case might have been weird enough to not have been reported.
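The same setting can also be read from the command line instead of the GUI. The sketch below rests on an assumption about the text that `VBoxManage showvminfo` prints: that each enabled adapter appears on a line starting with "NIC n:" and containing "Promisc Policy: value". The helper name and the VM name are hypothetical.

```shell
# Print "NIC <n>: <policy>" for every adapter line read from stdin.
# Lines for disabled adapters carry no policy and are skipped.
nic_promisc_policies() {
  sed -n 's/^NIC \([0-9][0-9]*\):.*Promisc Policy: \([a-z-]*\).*/NIC \1: \2/p'
}

# Example ("so-sensor" is a made-up VM name):
# VBoxManage showvminfo "so-sensor" | nic_promisc_policies
#
# The policy can be changed on a powered-off VM with, for the second adapter:
# VBoxManage modifyvm "so-sensor" --nicpromisc2 allow-all
```

Any monitoring adapter reporting something other than allow-all would be worth fixing before looking deeper.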

I decided to try a different approach. I wondered if running VirtualBox with elevated privileges might make a difference. I did not want to take ownership of my user VMs, so I decided to install a new VM and run it with elevated privileges.

Let me stop here to note that I am breaking one of the rules of troubleshooting. I'm introducing two new variables, when I should have introduced only one. I should have built a new VM but run it with the same user privileges with which I was running the existing VMs.

I decided to install a minimal edition of Debian 9, with VirtualBox running via sudo. When I started the VM and sniffed traffic on the monitoring port, lo and behold, my ICMP tests revealed both sides of the traffic as I had hoped. Unfortunately, from this I erroneously concluded that running VirtualBox with elevated privileges was the answer to my problems.

I took ownership of the SO VM in my elevated VirtualBox session, started it, and performed my ICMP tests. Womp womp. Still broken.

I realized I needed to separate the two variables that I had entangled, so I stopped VirtualBox, and changed ownership of the Debian 9 VM to my user account. I then ran VirtualBox with user privileges, started the Debian 9 VM, and ran my ICMP tests. Success again! Apparently elevated privileges had nothing to do with my problem.

By now I was glad I had not posted anything to any user forums describing my problem and asking for help. There was something about the monitoring interface configurations in both SO and RockNSM that prevented them from seeing both sides of the traffic and produced the weird doubles and triples.

I started my SO VM again and looked at the script that configured the interfaces. I commented out all the entries below the management interface as shown below.

$ cat /etc/network/interfaces

# This configuration was created by the Security Onion setup script.
#
# The original network interface configuration file was backed up to:
# /etc/network/interfaces.bak.
#
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# loopback network interface
auto lo
iface lo inet loopback

# Management network interface
auto enp0s3
iface enp0s3 inet static
  address 192.168.40.76
  gateway 192.168.40.1
  netmask 255.255.255.0
  dns-nameservers 192.168.40.1
  dns-domain localdomain

#auto enp0s8
#iface enp0s8 inet manual
#  up ip link set $IFACE promisc on arp off up
#  down ip link set $IFACE promisc off down
#  post-up ethtool -G $IFACE rx 4096; for i in rx tx sg tso ufo gso gro lro; do ethtool -K $IFACE $i off; done
#  post-up echo 1 > /proc/sys/net/ipv6/conf/$IFACE/disable_ipv6

#auto enp0s9
#iface enp0s9 inet manual
#  up ip link set $IFACE promisc on arp off up
#  down ip link set $IFACE promisc off down
#  post-up ethtool -G $IFACE rx 4096; for i in rx tx sg tso ufo gso gro lro; do ethtool -K $IFACE $i off; done
#  post-up echo 1 > /proc/sys/net/ipv6/conf/$IFACE/disable_ipv6

I rebooted the system and brought the enp0s8 interface up manually using this command:

$ sudo ip link set enp0s8 promisc on arp off up

Fingers crossed, I ran my ICMP sniffing tests, and voila, I saw what I needed -- traffic in both directions, without doubles or triples no less.
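To confirm the absence of doubles and triples without reading every line, the capture can be scanned for repeated packets. This is a rough sketch with a hypothetical helper name. It assumes tcpdump's text format, treats two lines as duplicates when everything after the timestamp matches, and counts copies flagged "truncated-ip" together with the intact packet, which is my reading of the symptom rather than an established rule.

```shell
# List ICMP echo packets that appear more than once in tcpdump text output.
find_duplicate_icmp() {
  awk '
    /ICMP echo (request|reply)/ {
      # Build a key from every field except the timestamp in $1.
      key = ""
      for (i = 2; i <= NF; i++) key = key " " $i
      # Fold truncated copies into the same key as the intact packet.
      sub(/truncated-ip - [0-9]+ bytes missing! /, "", key)
      seen[key]++
    }
    END { for (k in seen) if (seen[k] > 1) printf "%dx%s\n", seen[k], k }
  '
}

# Example (needs root; interface and address as used in this post):
# sudo tcpdump -n -c 20 -i enp0s8 icmp and host 1.1.1.1 | find_duplicate_icmp
```

No output means no packet was seen more than once in the sample.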

So, there appears to be some sort of problem with the way SO and RockNSM set parameters for their monitoring interfaces, at least as far as they interact with VirtualBox 6.0.4 on Ubuntu 18.04. You can see in the network script that SO disables a bunch of NIC options. I imagine one or more of them is the culprit, but I didn't have time to work through them individually.
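For anyone who does have time to work through the options individually, the sketch below prints one command per setting from the SO interfaces file so that each can be applied on its own. It is a dry run that only echoes the commands; the function name is made up, and the settings are copied from the post-up lines shown earlier.

```shell
# Print, without running them, the NIC tuning commands from the SO
# interfaces file, one setting per line, for the given interface.
candidate_commands() {
  iface=$1
  echo "ethtool -G $iface rx 4096"
  for feature in rx tx sg tso ufo gso gro lro; do
    echo "ethtool -K $iface $feature off"
  done
}

# Example:
# candidate_commands enp0s8
```

The idea would be to run one printed command with sudo, repeat the ICMP test, and note whether the one-sided traffic or the duplicates return, changing only one variable per test.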

I tried taking a look at the network script in RockNSM, but it runs CentOS, and I'll be darned if I can figure out where to look. I'm sure it's there somewhere, but I didn't have the time to figure out where.

The moral of the story is that I should have immediately checked after installation that both SO and RockNSM were seeing both sides of the traffic I expected them to see. I had taken that for granted for many previous deployments, but something broke recently and I don't know exactly what. My workaround will hopefully hold for now, but I need to take a closer look at the NIC options because I may have introduced another fault.

A second moral is to be careful of changing two or more variables when troubleshooting. When you do that you might fix a problem, but not know what change fixed the issue.

Comments

Derek said…
Richard,

RockNSM does indeed tune the network interface. It happens in `/usr/sbin/ifup-local`. I'd like to see if we can be a bit smarter in the future to avoid missing traffic in vbox. Can you tell me a bit more about the situation? I assume you have the VM running on a NIC in bridge mode? Can you share what driver the VM is using to access the NIC? You can get that information using `ethtool -i ethX`. Actually, if you could share the full output of that command, it could offer some clues as to what's going on here.

Thanks!

- Derek (aka dcode)

[rn23@simplerockbuild ~]$ sudo tcpdump -n -i enp0s8 icmp and host 1.1.1.1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp0s8, link-type EN10MB (Ethernet), capture size 262144 bytes
14:00:32.886702 IP 1.1.1.1 > 192.168.40.37: ICMP echo reply, id 20777, seq 1, length 64
14:00:32.901158 IP truncated-ip - 34 bytes missing! 1.1.1.1 > 192.168.40.37: ICMP echo reply, id 20777, seq 1, length 64
14:00:32.949939 IP truncated-ip - 21 bytes missing! 1.1.1.1 > 192.168.40.37: ICMP echo reply, id 20777, seq 1, length 64
14:00:33.890188 IP 1.1.1.1 > 192.168.40.37: ICMP echo reply, id 20777, seq 2, length 64
14:00:34.890410 IP 1.1.1.1 > 192.168.40.37: ICMP echo reply, id 20777, seq 3, length 64
14:00:35.625365 IP truncated-ip - 4 bytes missing! 1.1.1.1 > 192.168.40.37: ICMP echo reply, id 20777, seq 3, length 64
14:00:35.892822 IP 1.1.1.1 > 192.168.40.37: ICMP echo reply, id 20777, seq 4, length 64

Thanks for replying Derek! Yes, bridge mode. I ran my ICMP test and you can see one set of duplicates in the output.

[rn23@simplerockbuild ~]$ ip a

1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever

2: enp0s3: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:38:60:2e brd ff:ff:ff:ff:ff:ff
inet 192.168.40.88/24 brd 192.168.40.255 scope global noprefixroute dynamic enp0s3
valid_lft 86377sec preferred_lft 86377sec

3: enp0s8: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:f4:6b:f6 brd ff:ff:ff:ff:ff:ff

4: enp0s9: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:39:9a:5b brd ff:ff:ff:ff:ff:ff

[rn23@simplerockbuild ~]$ sudo tcpdump -n -i enp0s8 icmp and host 1.1.1.1

[sudo] password for rn23:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp0s8, link-type EN10MB (Ethernet), capture size 262144 bytes

13:56:38.710928 IP 1.1.1.1 > 192.168.40.42: ICMP echo reply, id 1, seq 1, length 40
13:56:39.712030 IP 1.1.1.1 > 192.168.40.42: ICMP echo reply, id 1, seq 2, length 40
13:56:39.730583 IP 1.1.1.1 > 192.168.40.42: ICMP echo reply, id 1, seq 2, length 40
13:56:40.717075 IP 1.1.1.1 > 192.168.40.42: ICMP echo reply, id 1, seq 3, length 40
13:56:41.719322 IP 1.1.1.1 > 192.168.40.42: ICMP echo reply, id 1, seq 4, length 40
^C
5 packets captured
11 packets received by filter
0 packets dropped by kernel

[rn23@simplerockbuild ~]$ ethtool -i enp0s8

driver: e1000
version: 7.3.21-k8-NAPI
firmware-version:
expansion-rom-version:
bus-info: 0000:00:08.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
