Note: I call this post Operational Traffic Intelligence System Woes because I want it to apply to detecting and resisting intrusions. As I mentioned earlier, hardly anyone builds real intrusion detection systems. So-called "IDS" are really attack indication systems. I also dislike the term intrusion prevention system ("IPS"), since anything that seeks to resist intrusion could be considered an "IPS." Most available "IPS" are firewalls in the sense that anything that denies activity is a policy enforcement system. I use the term traffic intelligence system (TIS) to describe any network-centric product which inspects traffic for detection or resistance purposes. That includes products with the popular labels "firewall," "IPS," and "IDS."
Three main criticisms can be made against TIS. I could point to many references but since this is a blog post I'll save that heavy lifting for something I write for publication.
- Failure to Understand the Environment: This problem is as old as dirt and will never be solved. The root of the issue is that in any network of even minimal size, it is too difficult for the TIS to properly model the states of all parties. The TIS can never be sure how a target will decipher and process traffic sent by an intruder, and vice versa. This situation leaves enough room for attackers to drive a Mack truck through, e.g., they can fragment at the IP / TCP / SMB / DCE-RPC levels and confuse just about every TIS available, while the target happily processes what it receives. Products that gather as much context as possible about targets improve the situation, but there are no perfect solutions.
- Analyst Information Overload: This problem is only getting worse. As attackers devise various ways to exploit targets, TIS vendors try to identify and/or deny malicious activity. For example, Snort's signature base is rapidly approaching 10,000 rules. (It's important to realize Snort is not just a signature-based IDS/IPS. I'll explain why in a future Snort Report.) The information overload problem means it's becoming increasingly difficult (if not already impossible) for security analysts to understand all of the attack types they might encounter while inspecting TIS alerts. SIM/SEM/SIEM vendors try to mitigate this problem by correlating events, but at the end of the day I want to know why a product is asking me to investigate an alert. That requires drilling down to the individual alert level and understanding what is happening.
- Lack of Supporting Details: The vast majority of TIS continue to be alert-centric. This is absolutely crippling for a security analyst. I am convinced that the vast majority of TIS developers never use their products on operational networks supporting real clients with contemporary security problems. If they did, developers would quickly realize their products do not provide the level of detail needed to figure out what is happening.
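The first criticism above can be made concrete with a toy example. The following is a minimal sketch, not how any particular TIS or operating system works: it assumes only two overlap policies ("first" and "last") and uses made-up fragments, whereas real stacks resolve overlaps in more varied ways. It shows how the same fragments can reassemble into different payloads depending on who is doing the reassembly.

```python
def reassemble(fragments, policy):
    """Reassemble (offset, data) fragments, given in arrival order, into bytes.

    policy "first": a byte already received is kept when a later fragment overlaps it.
    policy "last":  a later fragment overwrites any byte it overlaps.
    These two policies are a simplification chosen for illustration.
    """
    if policy not in ("first", "last"):
        raise ValueError("policy must be 'first' or 'last'")
    buffer = {}
    for offset, data in fragments:
        for i, byte in enumerate(data):
            position = offset + i
            if policy == "first" and position in buffer:
                continue  # keep the earlier byte
            buffer[position] = byte
    return bytes(buffer[p] for p in sorted(buffer))


# Hypothetical fragments: the third one overlaps bytes 5..7 of the first.
fragments = [
    (0, b"GET /foo"),
    (8, b".cgi"),
    (5, b"bar"),
]

tis_view = reassemble(fragments, "first")    # what a "first wins" inspector sees
target_view = reassemble(fragments, "last")  # what a "last wins" target processes

print(tis_view)
print(target_view)
```

If the TIS guesses the wrong policy for a given target, it inspects one request while the target acts on another.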
In brief, we have TIS that don't/can't fully understand their environment, reporting more alerts than an analyst can understand, while not providing enough detail to satisfy operational investigations. I did not even include usability as a critical aspect of this issue.
How does this apply to MARS? It appears that MARS (like other SIM/SEM/SIEM) believes that the "more is better" approach is the way to address the lack of context. The idea is that collecting as many input sources as possible will result in a system that understands the environment. This works to a certain limited point, but what is really needed is comprehensive knowledge of a target's existence, operating system, applications, and configuration. That level of information is not available, so I was left with inspecting 209 "red" severity MARS alerts for the last 7 days (3099 yellow, 1672 green). Those numbers also indicate information overload to me. All I really want to know is which of those alerts represent intrusions. MARS (and honestly, most products) can't answer that question.
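To put those counts in perspective, here is a quick back-of-the-envelope calculation using only the numbers quoted above. It works out to roughly 711 alerts per day, about 30 of them red.

```python
# Alert counts over the last 7 days, as quoted in the post.
alerts = {"red": 209, "yellow": 3099, "green": 1672}
days = 7

total = sum(alerts.values())
per_day = total / days
red_per_day = alerts["red"] / days
red_share = alerts["red"] / total

print(f"total alerts: {total}")
print(f"alerts per day: {per_day:.0f}")
print(f"red alerts per day: {red_per_day:.0f}")
print(f"red share of all alerts: {red_share:.1%}")
```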
The way I am usually forced to determine if I should worry about TIS alerts is manual inspection. The open source project Sguil provides session and full content data -- independent of any alert -- that lets me know a lot about activity directly or indirectly related to an event of interest. With MARS and the like, I can basically query for other alerts. Theoretically NetFlow can be collected, but the default configuration is to collect NetFlow for statistical purposes while discarding the individual records.
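The kind of pivot I mean can be sketched as follows. This is only an illustration of the idea, not Sguil's schema or query interface: the `Session` record, the `related_sessions` helper, the 30-minute window, and the addresses are all assumptions made up for the example. The point is that session records exist whether or not anything alerted, so they can be searched around the time of an alert for any activity involving the same hosts.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Session:
    """Hypothetical session record; real session data carries more fields."""
    start: datetime
    src: str
    dst: str
    dst_port: int
    bytes_sent: int


def related_sessions(sessions, alert_time, alert_src, alert_dst,
                     window=timedelta(minutes=30)):
    """Return sessions touching either alert host within +/- window of the alert."""
    hosts = {alert_src, alert_dst}
    return [
        s for s in sessions
        if abs(s.start - alert_time) <= window
        and (s.src in hosts or s.dst in hosts)
    ]


# Made-up data using documentation address ranges.
alert_time = datetime(2007, 1, 1, 12, 0)
sessions = [
    Session(datetime(2007, 1, 1, 11, 50), "198.51.100.7", "192.0.2.10", 445, 1200),
    Session(datetime(2007, 1, 1, 12, 5), "192.0.2.10", "203.0.113.9", 6667, 5400),
    Session(datetime(2007, 1, 1, 12, 10), "192.0.2.55", "192.0.2.1", 53, 80),
    Session(datetime(2007, 1, 1, 15, 0), "198.51.100.7", "192.0.2.10", 445, 900),
]

hits = related_sessions(sessions, alert_time, "198.51.100.7", "192.0.2.10")
for s in hits:
    print(s.start, s.src, "->", s.dst, s.dst_port, s.bytes_sent)
```

In this made-up data the second hit is the interesting one: an outbound connection from the alert's target shortly after the alert, which no alert-only query would surface.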
If I want to see full content, the closest I can get is this sort of ASCII rendition of a packet excerpt. That is ridiculous; it was state-of-the-art in 1996 to take a binary protocol (say SMB -- not shown here but common) and display the packet excerpt in ASCII. That level of detail gives the analyst almost nothing useful as far as incident validation or escalation.
(Is there an alternative? Sure, with Sguil we extract the entire session from Libpcap and provide it in Ethereal/Wireshark, or display all of the content in ASCII if requested by the analyst.)
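As a rough illustration of that extraction step, a session can be pulled from a larger capture with a BPF filter built from the session's endpoints. The helper below is a hypothetical sketch, not Sguil's code; the function name and the example addresses are assumptions.

```python
def session_bpf(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Build a BPF filter string matching both directions of one session."""
    return (
        f"{proto} and host {src_ip} and host {dst_ip} "
        f"and port {src_port} and port {dst_port}"
    )


# Made-up endpoints using documentation address ranges.
bpf = session_bpf("198.51.100.7", 40000, "192.0.2.10", 445)
print(bpf)
```

The resulting string could be handed to something like `tcpdump -r capture.pcap -w session.pcap '<filter>'` (file names are placeholders), and the smaller file opened in Ethereal/Wireshark.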
The bottom line is that I am at a loss regarding what I am going to tell my client. They spent a lot of money deploying a Cisco SDN but their investigative capabilities as provided by MARS are insufficient for incident analysis and escalation. I'm considering recommending augmentation with a separate product that collects full content and session data, then using MARS as a tip-off for investigation using those alternative data sources.
Are you stuck with similar products? How do you handle the situation? Several of you posted ideas earlier, and I would appreciate hearing more.
Copyright 2007 Richard Bejtlich