Monday, April 18, 2005

New Honeynet Project Challenge

I saw that the Honeynet Project announced a new Scan of the Month last week. The evidence consists of Apache logs, Linux syslogs, Snort logs, and IPTables firewall logs. Here are examples.

From the Apache access log:

210.116.59.164 - - [13/Mar/2005:04:05:47 -0500]
"POST /_vti_bin/_vti_aut/fp30reg.dll HTTP/1.1" 404 1063 "-" "-"


From the /var/log/messages syslog:

Mar 13 22:50:53 combo sshd(pam_unix)[9356]:
authentication failure; logname= uid=0 euid=0
tty=NODEVssh ruser= rhost=h-67-103-15-70.nycmny83.covad.net user=root


From the Snort logs, apparently captured via syslog:

Feb 25 12:21:33 bastion snort: [1:483:5] ICMP PING CyberKit 2.2 Windows
[Classification: Misc activity] [Priority: 3]: {ICMP}
70.81.243.88 -> 11.11.79.100


Finally, from the IPTables logs:

Feb 25 12:11:24 bridge kernel: INBOUND TCP: IN=br0
PHYSIN=eth0 OUT=br0 PHYSOUT=eth1
SRC=220.228.136.38 DST=11.11.79.83 LEN=64 TOS=0x00
PREC=0x00 TTL=47 ID=17159 DF
PROTO=TCP SPT=1629 DPT=139 WINDOW=44620 RES=0x00 SYN URGP=0


Let's see how much data is in each of the logs. I used 'wc' to count lines in each of the sets of logs.

  • Apache: 7,620

  • Syslog: 3,925

  • Snort: 69,039

  • IPTables: 179,752

  • Total: 260,336


So, we have over 260,000 lines of log entries to review. This seems fairly crazy to me. As a NSM practitioner who advocates collecting session and full content data, I am often criticized by those who consider it too difficult or expensive to collect such forms of network evidence. This Scan of the Month presents the alternative -- working though line after line of text-based log entries. Now what is more expensive, in terms of time and resources?

You might say I would have the same problem analyzing this intrusion using NSM techniques. You might believe Snort would yield the same number of alerts whether configured to emit text-based records via syslog or alerts for presentation by Sguil.

I guarantee I could determine if the system was compromised, and by how many parties, faster using NSM techniques than manual log analysis.

I would also know exactly what network traffic the intruder launched against the target, regardless of whether or not it triggered a Snort alert. I would not have to look at text-based IPTables representations of packet movement. I could instead look at session data, which summarizes the thousands of packets in a flow into a single record.

I believe the winner of this SotM will end up being a Perl or Awk wizard who can parse the logs efficiently to reduce the number of lines to be analyzed.

This is still a useful challenge. If there is any data available at all after a compromise, it is often in the form of Web logs, syslogs, and so on. It is important to know how to interpret such evidence, if that is all there is to analyze. Still -- imagine the possibilities when NSM-based evidence is collected!

2 comments:

Geoff Sanders said...

Richard:

You might want to try using SEC (Simple Event Correlator) for this log parsing task. One of my colleagues used this tool for a similar situation (during our SANS GIAC course to parse all the snort logs they distribute). Worked great!

SEC: http://kodu.neti.ee/~risto/sec/

Here's a great write up by Jim Brown on using SEC : http://sixshooter.v6.thrupoint.net/SEC-examples/article.html

And Chris' Posted practical if you'd like to see how he used it to parse his snort logs.

http://www.giac.org/certified_professionals/practicals/gcia/0650.php

Richard Bejtlich said...

ACK, thank you.