Thursday, November 06, 2008

Defining Security Event Correlation

This is my final post discussing security event correlation (SEC) for now. (When I say SEC I do not mean the Simple Event Correlator [SEC] tool.)

Previously I looked at some history regarding SEC, showing that the ways people thought about SEC really lacked rigor. Before describing my definition of SEC, I'd like to state what I think SEC is not. So, in my opinion -- you may disagree -- SEC is not:

  1. Collection (of data sources): Simply putting all of your log sources in a central location is not correlation.

  2. Normalization (of data sources): Converting your log sources into a common format, while perhaps necessary for correlation (according to some), is not correlation.

  3. Prioritization (of events): Deciding what events you most care about is not correlation.

  4. Suppression (via thresholding): Deciding not to see certain events is not correlation.

  5. Accumulation (via simple incrementing counters): Some people consider a report that one has 100 messages of the same type to be correlation. If that is really correlation, I think your standards are too low. Counting is not correlation.

  6. Centralization (of policies): Applying a single policy to multiple messages, while useful, is not correlation itself.

  7. Summarization (via reports): Generating a report -- again helpful -- by itself is not correlation. It's counting and sorting.

  8. Administration (of software): Configuring systems is definitely not correlation.

  9. Delegation (of tasks): Telling someone to take action based on the above data is not correlation.

So what is correlation? In my last post I cited Greg Shipley, who said if the engine sees A and also sees B or C, then it will go do X. That seems closer to what I consider security event correlation. SEC has a content component (what happened) and a temporal component (when did it happen). Using those two elements you can accomplish what Greg says.

I'd like to offer the following definition, while being open to other ideas:

Security event correlation is the process of applying criteria to data inputs, generally of a conditional ("if-then") nature, in order to generate actionable data outputs.
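
To make the definition concrete, here is a minimal sketch of Greg's "A and also B or C, then X" rule in Python. It is only an illustration, not how any particular product works: the Event class, the correlate function, and the 60-second window are all assumptions of mine. The event name stands in for the content component and the timestamp for the temporal component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str         # content component: what happened
    timestamp: float  # temporal component: when it happened (seconds)


def correlate(events, window=60.0):
    """Emit action "X" for each A that has a B or C within `window` seconds.

    The rule shape and the default window are illustrative assumptions.
    """
    triggers = [e for e in events if e.name == "A"]
    companions = [e for e in events if e.name in ("B", "C")]
    actions = []
    for a in triggers:
        # Conditional ("if-then") criterion applied to the data inputs.
        if any(abs(c.timestamp - a.timestamp) <= window for c in companions):
            actions.append(("X", a.timestamp))  # actionable data output
    return actions


events = [
    Event("A", 0.0),
    Event("B", 30.0),   # within 60 seconds of the first A
    Event("A", 500.0),  # no B or C nearby, so no action
    Event("D", 510.0),
]
print(correlate(events))  # [('X', 0.0)]
```

Note that the second A produces nothing. A counter would report "two A events"; the rule reports only the one that met the condition.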

So what about the nine elements I listed? They all seem important. Sure, but they are not correlation. They are functions of a Security Information and Event Management (SIEM) program, with correlation as one component. So, add correlation as item 10, and I think those 10 elements encompass SIEM well. This point is crucial:

SIEM is an operation, not a tool.

You can buy a SIEM tool but you can't buy a SIEM operation. You have to build a SIEM operation, and you may (or may not) use a SIEM tool to assist you.

Wait, didn't Raffy say SIM is dead? I'll try to respond to that soon. For now let me say that the guiding principle for my own operation is the following:

Not just more data; the right data -- fast, flexible, and functional.


djb said...


Thanks for sharing your insights publicly. I often share and advocate many of the observations you document in your blog. I am interested to know what you think about deep packet inspection as it relates to Security Event Correlation.

From GCN

Leveraging deep packet inspection
Deep packet inspection applications offer agency IT managers improved tools to monitor and secure agency networks.

Seth Hall said...

Exactly! Thanks for this post; hopefully it will help start some more discussion within the industry about what correlation really means.

And, extremely important in my mind:
Not just more data; the right data -- fast, flexible, and functional.

I really like the "Google" approach of just throwing everything in a big pile and then searching there. Like you mentioned though, if you're going to have a big pile you might as well put the best data possible into it. That seems to be the general direction that your thinking is headed.

Toady said...

My actual definition for IDS correlation is the following:
"Transform one or several alerts into an attack".

What do we want to get from those IDSs? Attacks.

Sometimes correlation occurs when alerts are enriched from an informational source, sometimes one alert is enough to discover an attack, sometimes you need to follow a sequence of steps, and sometimes it is triggered by an event storm.

I gave a presentation for Cansecwest 2008 on the subject. Slides are available here:

There are a lot of things to say about the correlation (buzz?)word. Sec is one side of the answer.

Richard Bejtlich said...

DPI is nothing but a buzzword to repackage old ideas.

Anonymous said...

My take is that correlation implies cross-device analysis, which is meant to eliminate the "island view" problem that plagues the IDS/IPS with so many false positives. Using this logic, prioritization implies correlating data from various sources:
Vulnerability Scans - V
Target Asset Criticality - T
History of the Attack source - H

For example, a typical formula to calculate the priority P from the criticality of an attack, C, would be:

P = C x V x T x H, where:
C = [0 thru 1]
V = [0 thru 1]
T = [0.5 thru 1.5]
H = [1 thru 1.5]

We can see that, relative to C, P can be decreased only by the vulnerability data (V below 1: the detected attack is not relevant) and by the target value (T below 1: the target value is low); H can only raise it. What is the meaning of this formula?
If the SIEM is used without Vulnerability data and Asset management the false positive issue is going to be accentuated. A SIEM that is not fed with intelligence is worse than IDS.
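
The formula above can be sketched in a few lines of Python. This is a hypothetical illustration only: the priority function and its range checks are assumptions built from the ranges listed in this comment, not the scoring logic of any real SIEM.

```python
def priority(c, v, t, h):
    """Compute P = C x V x T x H, rejecting inputs outside the stated ranges.

    c: criticality of the attack, 0 thru 1
    v: vulnerability relevance from scan data, 0 thru 1
    t: target asset criticality, 0.5 thru 1.5
    h: history of the attack source, 1 thru 1.5
    """
    ranges = {
        "C": (c, 0.0, 1.0),
        "V": (v, 0.0, 1.0),
        "T": (t, 0.5, 1.5),
        "H": (h, 1.0, 1.5),
    }
    for name, (value, low, high) in ranges.items():
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return c * v * t * h


# Same attack criticality, three different contexts.
print(priority(0.8, 0.0, 1.5, 1.5))  # 0.0: target is not vulnerable
print(round(priority(0.8, 1.0, 0.5, 1.0), 2))  # low-value target, unknown source
print(round(priority(0.8, 1.0, 1.5, 1.5), 2))  # critical target, known-bad source
```

With no vulnerability or asset data, the only honest inputs are the neutral values V = 1 and T = 1, and P simply collapses to C x H. At that point the SIEM adds almost nothing to the raw IDS alert.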

Roman said...

Anonymous: A SIEM that is not fed with intelligence is worse than IDS.

Actually, I believe the point is that a SIEM that is not manned by (an) intelligence is worthless. Humans correlate information into actionable items; SIEMs can only provide data to that human. If the SIEM isn't providing the right data to that human, the SIEM is worthless. If you don't have a human handling the correlation, you don't have correlation.

Compare this with creating intelligence (the military kind; not the "intelligence" inside the SIEM): you have collector platforms that pull in all kinds of data, and there are technical processes that munge and massage it a bit, but until an intelligence analyst gets ahold of that information and correlates/processes it, it remains information. Once correlated and processed, it becomes intelligence, and thus is now actionable (as long as your analysts are getting the job done).