Thursday, January 10, 2008

Defensible Network Architecture 2.0

Four years ago when I wrote The Tao of Network Security Monitoring I introduced the term defensible network architecture. I expanded on the concept in my second book, Extrusion Detection. When I first presented the idea, I said that a defensible network is an information architecture that is monitored, controlled, minimized, and current. In my opinion, a defensible network architecture gives you the best chance to resist intrusion, since perfect intrusion prevention is impossible.

I'd like to expand on that idea with Defensible Network Architecture 2.0. I believe these themes would be suitable for a strategic, multi-year program at any organization that commits itself to better security. You may notice the contrast with the Self-Defeating Network and the similarities to my Security Operations Fundamentals. I roughly order the elements from least likely to most likely to encounter resistance from stakeholders.

A Defensible Network Architecture is an information architecture that is:

  1. Monitored. The easiest and cheapest way to begin developing DNA in an existing enterprise is to deploy Network Security Monitoring sensors capturing session data (at an absolute minimum), full content data (if you can get it), and statistical data. If you can access other data sources, like firewall/router/IPS/DNS/proxy/whatever logs, begin working that angle too. Save the tougher data types (those that require reconfiguring assets and buying mammoth databases) until much later. This needs to be a quick win with the data in the hands of a small, centralized group. You should always start by monitoring first, as Bruce Schneier proclaimed so well in 2001.

  2. Inventoried. This means knowing what you host on your network. If you've started monitoring you can acquire a lot of this information passively. This is new to DNA 2.0 because I had assumed it was already being done. Fat chance!

  3. Controlled. Now that you know how your network is operating and what is on it, you can start implementing network-based controls. Take this any way you wish -- ingress filtering, egress filtering, network admission control, network access control, proxy connections, and so on. The idea is you transition from an "anything goes" network to one where the activity is authorized in advance, if possible. This step marks the first time where stakeholders might start complaining.

  4. Claimed. Now you are really going to reach out and touch a stakeholder. Claimed means identifying asset owners and developing policies, procedures, and plans for the operation of that asset. Feel free to swap this item with the previous. In my experience it is usually easier to start introducing control before making people take ownership of systems. This step is a prerequisite for performing incident response. We can detect intrusions in the first step. We can only work with an asset owner to respond when we know who owns the asset and how we can contain and recover it.

  5. Minimized. This step is the first to directly impact the configuration and posture of assets. Here we work with stakeholders to reduce the attack surface of their network devices. You can apply this idea to clients, servers, applications, network links, and so on. By reducing attack surface area you improve your ability to perform all of the other steps, but you can't really implement minimization until you know who owns what.

  6. Assessed. This is a vulnerability assessment process to identify weaknesses in assets. You could easily place this step before minimization. Some might argue that it pays to begin with an assessment, but the first question is going to be: "What do we assess?" I think it might be easier to start disabling unnecessary services first, but you may not know what's running on the machines without assessing them. Also consider performing an adversary simulation to test your overall security operations. Assessment is the step where you decide if what you've done so far is making any difference.

  7. Current. Current means keeping your assets configured and patched such that they can resist known attacks by addressing known vulnerabilities. It's easy to disable functionality no one needs. However, upgrades can sometimes break applications. That's why this step is last. It's the final piece in DNA 2.0.
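
Step 2 says that once monitoring is in place, much of the inventory can be acquired passively. A minimal sketch of that idea follows, assuming session records have already been exported from a sensor into simple Python objects. The `Session` record, its field names, and the `build_inventory` helper are hypothetical and do not correspond to any particular NSM tool's output format. The sketch also assumes each record represents a session the destination actually answered.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical session record; field names are illustrative only."""
    src_ip: str
    dst_ip: str
    dst_port: int
    proto: str


def build_inventory(sessions, internal_cidrs):
    """Map each internal host to the (proto, port) services it was seen serving.

    Internal hosts seen only as clients appear with an empty service set.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in internal_cidrs]

    def is_internal(ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in networks)

    inventory = {}
    for s in sessions:
        if is_internal(s.src_ip):
            # Record the host even if it never acts as a server.
            inventory.setdefault(s.src_ip, set())
        if is_internal(s.dst_ip):
            inventory.setdefault(s.dst_ip, set()).add((s.proto, s.dst_port))
    return inventory


if __name__ == "__main__":
    observed = [
        Session("10.0.0.5", "10.0.0.10", 443, "tcp"),
        Session("203.0.113.7", "10.0.0.10", 22, "tcp"),
        Session("10.0.0.5", "198.51.100.9", 80, "tcp"),
    ]
    for host, services in sorted(build_inventory(observed, ["10.0.0.0/8"]).items()):
        print(host, sorted(services))
```

An inventory built this way only covers hosts that generated traffic past a sensor, so it is a starting point to reconcile against whatever asset records already exist, not a replacement for them.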


So, there's DNA 2.0 -- MICCMAC (pronounced "mick-mack"). You may notice the Federal government is adopting parts of this approach, as mentioned in my post Feds Plan to Reduce, then Monitor. I prefer to at least get some monitoring going first, since even incomplete instrumentation tells you what is happening. Minimization based on opinion instead of fact is likely to be ugly.

Did I miss anything?

9 comments:

Anonymous said...

Great article. I work for a VERY large organisation. Step 4 would be the most difficult to carry out of all the steps you mention here. Do you have any suggestions on dividing this up into easier to manage segments?

Eric Appelboom said...

Hi Richard.
I would add reported on.
Would be nice if you could make a PDF poster with the different processes.

Eric Appelboom

dre said...

Network visibility is so great. I really do like it. Yet I see problems, and will continue to question it. I'm losing my religion.

A monitored network can mean a lot of things to a lot of different people. For some it means Sourcefire RNA, Tenable PVS, and Lancope on every network segment. Others go for a cheaper solution using snort, Argus, Netstate, and Ourmon. For those that manage good PKI, they can get away with running ssldump (or Bluecoat) to sniff inside encrypted packets.

I'm shifting away from this, but not away from visibility. With tools like UHooker, Echo Mirage, ProcMon/FileMon/RegMon/etc... in addition to my old standbys (strace, ktrace, truss, etc) - I think GPO's and cfengines can gather up tons of visibility information. We are starting to see this in DLP products, where they are getting as close to the applications as possible.

Doing any of this at a large organization is quite difficult, especially if thick clients are in use. Also difficult is dealing with enormous amounts of network traffic that you don't want to convert to less meaningful data. Sampled Netflow comes to mind right away to be the choice example of what I'm talking about. For full-visibility, you don't want a sample of the data. You don't want aggregated data. Argus (or Niksun) are great for aggregating data, and keeping Ourmon graphs around for years is not a big deal compared to keeping full packet captures, but most organizations have to draw a line somewhere.

This is where the battle of taps and mirror ports also comes into play. After all these years, I see no reference architecture, no blueprints, and no guidelines for packet capture infrastructure. To me, this means that capture-based network visibility is dead. Let's move to DLP agents, and/or system/library call tracing and be done with it. I hate to be a dick, but Richard - I really think you need to put up or shut up.

I've always been a proponent of Netflow (or sFlow, cflow, etc) in addition to Ourmon without sampling, even on large capacity links. I like snort, but I don't like expense. Hence, I'm ok with sampling and ERSPAN for packets a snort engine would be fed. Bro and Prelude-IDS also have their place for medium/high assurance parts of critical networks.

What I would rather see is Samhain, the99lb, Zeppoo, rkhunter, and chkrootkit on every Linux image. I want IceSword and RKUnhooker for every Windows image. In addition to some cross-platform rpcapd's loaded via GPO's and cfengines, I also want some UHooker and strace dumps of every process. Make it twice as nice on web and database servers. Bring it all back to OSSEC and Zenoss. Monitor snort alerts with Cerebus and have the snort logs dumped as snaplen 0 pcap output. UltraVNC access with proper ssh keys should somehow be enabled on any box (Linux, Windows, et al) with the flick of the wrist and some DNSSEC magic.

Tell everybody how it is and say it like you mean it. Sorry to say it, but being an operator for a large installation (as opposed to the noisy consultant you used to be) has turned you soft.

Tiago said...

Hi Richard,

I would put the Inventoried step before Monitored.
I think that first we need to know what we have and then monitor the network.

Monitoring after the inventory then makes it possible to find assets that the inventory missed.
What do you think about this?

Anonymous said...

Great list of what needs to be done. I've tried to preach #5 for years but have frequently run into pushback from system administrators, vendors, and developers, all of whom can't be bothered to know what resources on a system a given function actually needs.

Anonymous said...

GREAT Article, although I would remove the past tense (remove the "d") as each is an ongoing process. Also I would change "Claimed" to "Culture". The task is not only to capture the stakeholders and assign responsibility; it is to educate the user. Too frequently in computing environments users want to blame IT or Information Assurance rather than look at the risks of implementing systems without controls. Users aren't dumb; they are just ignorant of the issues involved. If you educate them and keep them situationally aware (involved), the culture changes and improves the security posture of your environment. As Culture you might even bump it to #3, as it allows the user to buy into the system.

Göran Sandahl said...

I'd also add Measured as a last point. Given that a company and its network fulfill DNA 2.0, and as such have details of incidents and events, all the assets, and all vulnerabilities - are they more secure today than they were yesterday, last week, last month, or last year? Are more or fewer incidents escalated and manually verified? Have more or fewer incidents led to host-based actions such as forensics or re-imaging? Are more or fewer indications being processed? What is the ratio between indications and actual incidents? Has the ability to detect, prevent, and respond to simulated incidents, aka penetration tests, increased or decreased?

Secondly, put this in relationship to environmental changes such as the increase/decrease in exchange of traffic and an increase/decrease in the number of assets currently maintained and defended. If possible, compare incidents per megabyte of exchanged traffic or incidents per asset, and other metrics, with peers.
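
As a rough illustration of the normalized metrics described above, here is a minimal sketch that computes incidents per asset, incidents per megabyte of traffic, and the ratio of indications to incidents for one reporting period. The `PeriodStats` record, its field names, and the `normalized_metrics` function are hypothetical. How indications and incidents are counted is left to whoever collects the numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PeriodStats:
    """Hypothetical per-period counts; field names are illustrative only."""
    indications: int   # alerts or events that were examined
    incidents: int     # indications confirmed as actual incidents
    assets: int        # assets maintained and defended in the period
    megabytes: float   # traffic exchanged in the period, in megabytes


def _ratio(numerator: float, denominator: float) -> Optional[float]:
    # Return None rather than dividing by zero when a period has no data.
    return numerator / denominator if denominator else None


def normalized_metrics(p: PeriodStats) -> dict:
    """Normalize incident counts so periods of different size can be compared."""
    return {
        "incidents_per_asset": _ratio(p.incidents, p.assets),
        "incidents_per_megabyte": _ratio(p.incidents, p.megabytes),
        "indications_per_incident": _ratio(p.indications, p.incidents),
    }


if __name__ == "__main__":
    last_month = PeriodStats(indications=200, incidents=10, assets=500, megabytes=1_000_000.0)
    for name, value in normalized_metrics(last_month).items():
        print(name, value)
```

Computing the same dictionary for two periods, or for a peer's published numbers, gives the before-and-after comparison the comment asks for, with the caveat that both sides need to define "incident" the same way.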

Ultimately, by following DNA 2.0, have we improved our ability in terms of detecting, preventing and responding to attacks? How does this compare to our peers, and have we improved compared to them?

What do you think?

Cheers
Göran Sandahl

Richard Bejtlich said...

Dre,

I'll assume you were smoking something when you wrote that post and ignore your attitude. Are you for real?

Anonymous talking about past tense:

I wrote "A Defensible Network Architecture is an information architecture that is:"

All of these are happening now, not once in the past.

Göran,

"Measured" is a great idea. I guess that's step 8?

Michael Markulec said...

Awesome stuff, I've been beating the same drum for the last several years with our clients. It's great to see a framework that is both simple and complete.

I would offer a plug for a defense-in-depth bullet. As more and more devices and control systems are added to corporate and government networks, there is a growing need to identify high-value assets and provide them with additional layers of protection. Add to that the need to provide both command and control as well as auditing of all network devices and traffic, and you have a compelling argument for comprehensive defense in depth.