Wednesday, January 31, 2007

I Am Not Anti-Log

Some of you who rely on various system and application logs might take exception to my emphasis on interpreting network traffic. You might think I am "anti-log." That is absolutely not true. I will demonstrate a case that shows I appreciate logs in certain situations.

Last night I was analyzing alert data collected from one of the customers I monitor. One of the Snort alerts I saw (a bleeding-exploit.rules entry) indicated BLEEDING-EDGE EXPLOIT Possible MSIE VML Exploit. This did not look promising, especially since I was not flooded with these events. In other words, if I had seen 100, I would not be 100 times more worried than if I saw only one alert. The fact that I was investigating a single alert made me think this signature might be deadly accurate.

I am not going to walk through the entire investigation for this event. Suffice it to say I wanted to know if the victim system was truly exploited. I eventually found myself looking at transcripts of traffic and the traffic itself. I duplicated part of the activity on my own system so I could show you what that might look like without revealing client data.
No, I did not visit a dating site for fun. Neither did my client. Prior to this I saw nothing indicating those sorts of interests, so I'm guessing this was an unintentional case.

The point is that it's much easier to understand a victim's Web browsing (if that's the crucial aspect of the investigation) if Web proxy logs are available, like these:

1170223023.601 388 192.168.2.5 TCP_MISS/404 558
GET http://back88008800.com/ -
DIRECT/81.95.146.166 text/html
1170223024.219 318 192.168.2.5 TCP_MISS/404 569
GET http://back88008800.com/favicon.ico -
DIRECT/81.95.146.166 text/html
1170223028.897 390 192.168.2.5 TCP_MISS/200 797
GET http://back88008800.com/dating.html -
DIRECT/81.95.146.166 text/html
1170223061.677 344 192.168.2.5 TCP_REFRESH_HIT/304 240
GET http://back88008800.com/dating.html -
DIRECT/81.95.146.166 -
1170223062.070 355 192.168.2.5 TCP_MISS/200 1946
GET http://back88008800.com/script.js -
DIRECT/81.95.146.166 application/x-javascript
1170223062.329 123 192.168.2.5 TCP_MISS/302 438
GET http://www.worlddatinghere.com/? -
DIRECT/63.218.226.67 text/html
1170223062.463 392 192.168.2.5 TCP_MISS/302 696
GET http://81.95.146.133/sutra/in.cgi? -
DIRECT/81.95.146.133 text/html
1170223062.802 339 192.168.2.5 TCP_MISS/200 4084
GET http://81.95.146.133/sp/sp2/index.php -
DIRECT/81.95.146.133 text/html

These are my personal Squid Web cache logs, which tracked my investigation. I like these logs because they cut right to the heart of the matter, namely what sites were visited as part of this event.

While analyzing this case I also had access to session data, like this.

Session data is great because it shows me everything that happened, regardless of whether it involved a logging application (like a Web proxy) or not. However, to get at the details I would need to generate transcripts, like this.

My point is that sometimes it's helpful to work with an application-specific log, like a Squid Web proxy log, instead of rebuilding everything from traffic.

Speaking of Squid, I found that the default /etc/logrotate.d/squid entry which controls /var/log/access.log rotation, contains this:

#
# Logrotate fragment for squid.
#
/var/log/squid/*.log {
daily
compress
delaycompress
rotate 2
missingok
nocreate
sharedscripts
prerotate
test ! -x /usr/sbin/sarg-maint || /usr/sbin/sarg-maint
endscript
postrotate
test ! -e /var/run/squid.pid || /usr/sbin/squid -k rotate
endscript
}

I decided to change the "rotate 2" to "rotate 30" to give me 30 days of logs. Remember this is my own network's setting, where I was duplicating my client's experience for your blog reading enjoyment. As far as my client was concerned, I did not find any evidence of compromise after checking my session and full content data for suspicious post-alert activity.

8 comments:

Anonymous said...

Good post, however, I don't understand why you would of liked to of seen a lot of alerts instead of just one. Can you help me to understand why?

Anonymous said...

Richard,

What can you tell us about that utility that looks like it's written in Tcl shown in blog_log_case_transcript.jpg?

Richard Bejtlich said...

I said I was worried because I saw one alert, not one hundred. 100 indicates a loose signature, while 1 indicates a tight signature.

Richard Bejtlich said...

The transcript was created using Tcpflow, which Sguil calls to rebuild the full content from the session or alert of interest.

Anonymous said...

That makes sense, thanks Richard.

Sometwo said...

The value is not in the data, but in the structure
Years ago, some clever person calculated that if you reduce human beings to their components of carbon, hydrogen, oxygen, and nitrogen atoms, they would be worth only 97 cents. However droll this assessment, it's misleading. People aren't composed of mere isolated collections of atoms. Our atoms combine into enzymes, proteins, hormones, and many other substances that would cost millions of dollars per ounce on the pharmaceutical market. The precise structure of these combinations of atoms is what gives them greater value. By analogy, database structure makes possible the interpretation of seemingly meaningless data. The structure brings to the surface patterns, trends, and tendencies in the data. Unstructured data -- like uncombined atoms -- has little or no value.
-- SQL For Dummies 6th edition

I found this an interesting quote I read today. The more structure the data, the more valuable it is. Flat text files such as logs have little or no structure compared to NSM data when put into a database, and thus aren't as valuable.

Anonymous said...

..it's log, it's log. It's big, it's heavy, it's wood...

Sorry.. just couldn't fight the urge.

Anonymous said...
This comment has been removed by a blog administrator.