New Honeynet Project Challenge

I saw that the Honeynet Project announced a new Scan of the Month last week. The evidence consists of Apache logs, Linux syslogs, Snort logs, and IPTables firewall logs. Here are examples.

From the Apache access log:

- - [13/Mar/2005:04:05:47 -0500]
"POST /_vti_bin/_vti_aut/fp30reg.dll HTTP/1.1" 404 1063 "-" "-"

From the /var/log/messages syslog:

Mar 13 22:50:53 combo sshd(pam_unix)[9356]:
authentication failure; logname= uid=0 euid=0
tty=NODEVssh ruser= user=root

From the Snort logs, apparently captured via syslog:

Feb 25 12:21:33 bastion snort: [1:483:5] ICMP PING CyberKit 2.2 Windows
[Classification: Misc activity] [Priority: 3]: {ICMP} ->

Finally, from the IPTables logs:

Feb 25 12:11:24 bridge kernel: INBOUND TCP: IN=br0
PHYSIN=eth0 OUT=br0 PHYSOUT=eth1
SRC= DST= LEN=64 TOS=0x00
PREC=0x00 TTL=47 ID=17159 DF

Let's see how much data is in each of the logs. I used 'wc' to count lines in each of the sets of logs.

  • Apache: 7,620

  • Syslog: 3,925

  • Snort: 69,039

  • IPTables: 179,752

  • Total: 260,336
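For readers without 'wc' handy, counts like these could be reproduced with a short Python sketch. It is only an illustration: the function names `count_lines` and `count_log_sets` are made up here, and the caller supplies the file paths for each set of logs.

```python
def count_lines(path):
    """Count newline characters in a file, which is what `wc -l` reports."""
    count = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large logs are never loaded whole.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            count += chunk.count(b"\n")
    return count


def count_log_sets(log_sets):
    """Map each log-set name to the summed line count of its files.

    `log_sets` is a dict such as {"Apache": [path, ...], ...}; the names
    and paths are whatever the caller chooses.
    """
    totals = {name: sum(count_lines(p) for p in paths)
              for name, paths in log_sets.items()}
    totals["Total"] = sum(totals.values())
    return totals
```

Calling `count_log_sets` with one list of files per log type returns a dictionary shaped like the list above, with a final "Total" entry.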

So, we have over 260,000 lines of log entries to review. This seems fairly crazy to me. As an NSM practitioner who advocates collecting session and full content data, I am often criticized by those who consider it too difficult or expensive to collect such forms of network evidence. This Scan of the Month presents the alternative -- working through line after line of text-based log entries. Now what is more expensive, in terms of time and resources?

You might say I would have the same problem analyzing this intrusion using NSM techniques. You might believe Snort would yield the same number of alerts whether configured to emit text-based records via syslog or alerts for presentation by Sguil.

I guarantee I could determine if the system was compromised, and by how many parties, faster using NSM techniques than manual log analysis.

I would also know exactly what network traffic the intruder launched against the target, regardless of whether or not it triggered a Snort alert. I would not have to look at text-based IPTables representations of packet movement. I could instead look at session data, which summarizes the thousands of packets in a flow into a single record.
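To make the idea of a session record concrete, here is a rough Python sketch that collapses per-packet IPTables log lines into one record per source/destination/port/protocol tuple. It is a simplification under stated assumptions: the sample lines are invented (documentation-range addresses), it assumes the usual `PROTO=`, `SPT=`, and `DPT=` fields follow the fields shown in the excerpt above, and it tracks each direction separately, whereas real session-data tools also record timing and pair both directions of a conversation.

```python
import re
from collections import defaultdict

FIELD = re.compile(r"(\w+)=(\S*)")
REQUIRED = ("SRC", "DST", "PROTO", "LEN")

# Invented sample lines for illustration only; not taken from the challenge data.
SAMPLE_LINES = [
    "Feb 25 12:11:24 bridge kernel: INBOUND TCP: IN=br0 PHYSIN=eth0 "
    "SRC=192.0.2.10 DST=198.51.100.5 LEN=64 TTL=47 PROTO=TCP SPT=4312 DPT=80",
    "Feb 25 12:11:25 bridge kernel: INBOUND TCP: IN=br0 PHYSIN=eth0 "
    "SRC=192.0.2.10 DST=198.51.100.5 LEN=40 TTL=47 PROTO=TCP SPT=4312 DPT=80",
    "Feb 25 12:11:30 bridge kernel: INBOUND TCP: IN=br0 PHYSIN=eth0 "
    "SRC=192.0.2.77 DST=198.51.100.5 LEN=48 TTL=52 PROTO=TCP SPT=2201 DPT=22",
]


def parse_fields(line):
    """Return the KEY=VALUE pairs of one log line as a dict."""
    fields = {}
    for key, value in FIELD.findall(line):
        # Keep the first occurrence: UDP lines repeat LEN, and the
        # first one is the IP-level length.
        fields.setdefault(key, value)
    return fields


def summarize_flows(lines):
    """Collapse per-packet lines into one record per 5-tuple."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for line in lines:
        f = parse_fields(line)
        if not all(k in f for k in REQUIRED) or not f["LEN"].isdigit():
            continue  # skip lines that lack the fields we need
        # ICMP lines carry no ports, so fall back to a placeholder.
        key = (f["SRC"], f.get("SPT", "-"), f["DST"], f.get("DPT", "-"), f["PROTO"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += int(f["LEN"])
    return dict(flows)
```

Running `summarize_flows(SAMPLE_LINES)` turns three packet lines into two records, which is the same kind of reduction, on a tiny scale, that session data provides over packet-by-packet firewall logs.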

I believe the winner of this SotM will end up being a Perl or Awk wizard who can parse the logs efficiently to reduce the number of lines to be analyzed.
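As an illustration of that kind of reduction, the sketch below tallies Snort syslog alerts by signature so that tens of thousands of lines shrink to one row per distinct alert. It uses Python rather than Perl or Awk, the regular expression is an assumption based only on the alert format shown in the excerpt above, and `count_alerts` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches e.g. "snort: [1:483:5] ICMP PING CyberKit 2.2 Windows [Classification: ..."
# and captures the gid:sid:rev triple plus the alert message.
ALERT = re.compile(r"snort(?:\[\d+\])?: \[(\d+:\d+:\d+)\] (.*?) \[")


def count_alerts(lines):
    """Count alerts per (signature id, message); skip lines that do not match."""
    counts = Counter()
    for line in lines:
        match = ALERT.search(line)
        if match:
            counts[match.groups()] += 1
    return counts
```

Feeding the function an open log file and then calling `most_common()` on the result lists the noisiest signatures first, which gives a quick view of where to start reading.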

This is still a useful challenge. If there is any data available at all after a compromise, it is often in the form of Web logs, syslogs, and so on. It is important to know how to interpret such evidence, if that is all there is to analyze. Still -- imagine the possibilities when NSM-based evidence is collected!


Anonymous said…

You might want to try using SEC (Simple Event Correlator) for this log parsing task. One of my colleagues used this tool for a similar situation (during our SANS GIAC course to parse all the snort logs they distribute). Worked great!


Here's a great write-up by Jim Brown on using SEC:

And Chris's posted practical, if you'd like to see how he used it to parse his Snort logs.

ACK, thank you.
