Tuesday, July 29, 2008

Security Operations: Do You CAER?

Security operations can be reduced to four main tasks. A mature security operation performs all four tasks proactively.

  1. Collection is the process of acquiring the evidence necessary to identify activity of interest.

  2. Analysis is the process of investigating evidence to identify suspicious and malicious activity that could qualify as incidents.

  3. Escalation is the process of reporting incidents to responsible parties.

  4. Resolution is the process of handling incidents to recover to a desired state.


The goal of every mature security operation is to reduce the mean time to resolution, i.e., to accomplish all four tasks as quickly and efficiently as possible.
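To make "mean time to resolution" concrete, here is a minimal sketch of how it might be measured, assuming each incident record carries one timestamp per CAER task. The field names (`collected`, `analyzed`, `escalated`, `resolved`) and the sample incidents are hypothetical, invented for illustration only.

```python
from datetime import datetime, timedelta

# Hypothetical stage names, one timestamp per CAER task.
STAGES = ["collected", "analyzed", "escalated", "resolved"]

def stage_durations(incident):
    """Return the time elapsed between consecutive stages of one incident."""
    return {
        f"{start}->{end}": incident[end] - incident[start]
        for start, end in zip(STAGES, STAGES[1:])
    }

def mean_time_to_resolution(incidents):
    """Average the collected-to-resolved interval over all incidents."""
    total = sum((i["resolved"] - i["collected"] for i in incidents), timedelta())
    return total / len(incidents)

# Invented sample data, not taken from any real operation.
incidents = [
    {
        "collected": datetime(2008, 7, 1, 9, 0),
        "analyzed": datetime(2008, 7, 1, 11, 0),
        "escalated": datetime(2008, 7, 1, 12, 0),
        "resolved": datetime(2008, 7, 1, 17, 0),
    },
    {
        "collected": datetime(2008, 7, 2, 8, 0),
        "analyzed": datetime(2008, 7, 2, 14, 0),
        "escalated": datetime(2008, 7, 2, 15, 0),
        "resolved": datetime(2008, 7, 3, 12, 0),
    },
]

print(mean_time_to_resolution(incidents))
for name, elapsed in stage_durations(incidents[0]).items():
    print(name, elapsed)
```

Breaking the interval down by stage is one way to see which of the four tasks is consuming the most time, which is usually more actionable than the overall average alone.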

As has been noted elsewhere (e.g., Anton Chuvakin's Blog), some organizations that aren't even performing collection yet view achieving that first step as their end goal. ("Whew, we got the logs. Check!") Collecting evidence is no easy task, to be sure. Increasingly, the logs generated by security devices are less relevant to real investigations. Application-level data can sometimes be the only way to understand what is happening, yet programmers are not yet instrumenting their applications for security by building visibility in.

Various regulatory frameworks are beginning to drive recalcitrant organizations further into security operations by requiring analysis and not just collection. Besides meeting legal requirements, it should be obvious that identifying security failures as early as possible reduces the ultimate cost of resolving those problems, just as purging bugs from software early in the development process is cheaper than developing patches for software in the field. Competent analysis is probably the most difficult aspect of security operations. Understanding applications, the environment, and attack models is increasingly difficult, and the human resources to perform this task well are seldom inexpensive or willing to relocate in large numbers.

Assuming one has the capability to do decent enough analysis to discover trouble, knowing whom to notify (escalation) becomes the next step. In a large organization this is no trivial task. Simply performing asset inventory, naming responsible parties, and establishing incident response procedures is a project unto itself. Worse, none of these details are static. Any system which depends upon administrators to manually enumerate their networks, data, systems, applications, and personal information will become stale within days. These processes should be automated, so a human incident handler can escalate without wasting time tracking down missing information.
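As a rough sketch of what automated escalation lookup could look like, the snippet below maps an IP address seen in an incident to a responsible contact by choosing the most specific matching network. The inventory contents, the contact addresses, and the function name `responsible_party` are all hypothetical. In a real deployment the inventory would be refreshed automatically from authoritative sources, not typed in by hand as it is here.

```python
import ipaddress

# Hypothetical inventory mapping networks to responsible contacts.
# Assumed to be regenerated automatically; hard-coded only for illustration.
INVENTORY = {
    "10.0.0.0/8": "corp-it@example.com",
    "10.20.0.0/16": "web-team@example.com",
    "10.20.30.0/24": "payments-team@example.com",
}

def responsible_party(ip, inventory=INVENTORY):
    """Return the contact for the most specific network containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    matches = []
    for cidr, contact in inventory.items():
        network = ipaddress.ip_network(cidr)
        if addr in network:
            matches.append((network.prefixlen, contact))
    if not matches:
        return None  # No owner on record: the inventory has a gap.
    # The longest prefix is the most specific owner.
    return max(matches)[1]

print(responsible_party("10.20.30.5"))
print(responsible_party("192.168.1.1"))
```

Returning `None` for an unknown address, instead of falling back to a default contact, makes inventory gaps visible so they can be fixed before an incident handler runs into them.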

Finally we come to resolution. Several problems arise here. First, the "responsible" party may deny the incident, despite evidence to the contrary. Although providing evidence may help, in some cases the "responsible" party may ignore the incident handler while quietly recovering from the event. Second, the "responsible" party may ignore the incident. The person may simply not care at all, or not care enough to direct the resources needed to resolve the incident. Third, the responsible party may want to resolve the incident, but may not be politically or technically capable of doing so. All three cases justify giving incident handlers the authority and knowledge to guide incident resolution as needed, to include deploying an augmentation team in serious cases.

Note that an organization may be forced to do escalation and resolution even when it does no collection or analysis. External parties, like law enforcement, the military, customers, regulators, and peers, are frequently informing organizations that they have been compromised. This is a very poor situation, because a victim not doing independent collection and analysis has few options when the cops come knocking. Usually administrators must scramble to salvage whatever data might exist, wasting time as log sources (which may not even exist) are located. Resources to make sense of the data are lacking, so the victim is helpless. Management unwilling to support a security operation is going to be flummoxed when confronted by a serious incident. More time will be wasted. The only winner is the intruder.

Avoiding this situation requires us to fully CAER. It's cheaper, faster, and, I believe, at some point will be demanded by the government and the market anyway.

3 comments:

James Turnbull said...

I am not sure you can call that Security Operations.

Where does enforcement fit? For us SecOps is also the management of security controls like endpoints, firewalls, gateways, access and authorisation, etc, etc.

I'd call what you describe "Security Response" or some kind of CERT function.

Matt Franz said...

Funny how *everything* (if you stare hard enough) ends up looking like the Intelligence Cycle.

Anonymous said...

I was very surprised to see someone inventing a similar process, but with a different end. I was thinking about integrating the system life cycle of Information Systems into intelligence operations weeks ago. Did you have Information Systems in mind when you thought about this?