Monday, November 26, 2007

Controls Are Not the Solution to Our Problem

If you recognize the inspiration for this post title and graphic, you'll understand my ultimate goal. If not, let me start by saying this post is an expansion of ideas presented in a previous post with the succinct and catchy title Control-Compliant vs Field-Assessed Security.

In brief, too many organizations, regulators, and government agencies waste precious time and resources devising and auditing "controls," regardless of the effect these controls have or do not have on security. They are far too input-centric; they should become more output-aware. They obsess over recording conditions they believe may be helpful while remaining ignorant of the "score of the game." They practice management by belief and disregard management by fact.

Let me provide a few examples from one of the canonical texts used by the control-compliant crowd: NIST Special Publication 800-53: Recommended Security Controls for Federal Information Systems (.pdf). The following is an example of a control, taken from page 140.

SI-3 MALICIOUS CODE PROTECTION


Control: The information system implements malicious code protection.

Supplemental Guidance: The organization employs malicious code protection mechanisms at critical information system entry and exit points (e.g., firewalls, electronic mail servers, web servers, proxy servers, remote-access servers) and at workstations, servers, or mobile computing devices on the network. The organization uses the malicious code protection mechanisms to detect and eradicate malicious code (e.g., viruses, worms, Trojan horses, spyware) transported: (i) by electronic mail, electronic mail attachments, Internet accesses, removable media (e.g., USB devices, diskettes or compact disks), or other common means; or (ii) by exploiting information system vulnerabilities. The organization updates malicious code protection mechanisms (including the latest virus definitions) whenever new releases are available in accordance with organizational configuration management policy and procedures. The organization considers using malicious code protection software products from multiple vendors (e.g., using one vendor for boundary devices and servers and another vendor for workstations). The organization also considers the receipt of false positives during malicious code detection and eradication and the resulting potential impact on the availability of the information system. NIST Special Publication 800-83 provides guidance on implementing malicious code protection.

Control Enhancements:
(1) The organization centrally manages malicious code protection mechanisms.
(2) The information system automatically updates malicious code protection mechanisms.


At first read one might reasonably respond by saying "What's wrong with that? This control advocates implementing anti-virus and related anti-malware software." Think more carefully about the issue, however, and several problems appear.

  • Adding anti-virus products can introduce vulnerabilities into systems that would not have been exposed had they not run anti-virus. Consider my post Example of Security Product Introducing Vulnerabilities if you need examples. In short, add anti-virus, be compromised.

  • Achieving compliance may cost more than potential damage. How many times have you heard a Unix administrator complain that he/she has to purchase an anti-virus product for his/her Unix server simply to be compliant with a control like this? The potential for a Unix server (not Mac OS X) to be damaged by a user opening an email through a client while logged on to the server (a very popular exploitation vector on a Windows XP box) is practically nil.

  • Does this actually work? This is the question that no one asks. Does it really matter if your system is running anti-virus software? Did you know that intruders (especially the high-end ones most likely to selectively, stealthily target the very .gov and .mil systems required to be compliant with this control) test their malware against a battery of anti-virus products to ensure their code wins? Are weekly updates superior to daily updates? Daily to hourly?


The purpose of this post is to tentatively propose an alternative approach. I called this "field-assessed" in contrast to "control-compliant." Some people prefer the term "results-based." Whatever you call it, the idea is to direct attention away from inputs and devote more energy to outputs. As for mandating inputs (like requiring every device to run anti-virus), I say that is a waste of time and resources.

I recommend taking measurements to determine your enterprise "score of the game," and using that information to decide what you need to do differently. I'm not suggesting abandoning efforts to prevent intrusions (i.e., "inputs"). Rather, don't think your security responsibilities end when the bottle is broken against the bow of the ship and it slides into the sea. You've got to keep watching to see if it sinks, if pirates attack, how the lifeboats handle rough seas, and so forth.

These are a few ideas.

  1. Standard client build client-side survival test. Create multiple sacrificial systems with your standard build. Deploy a client-side testing solution on them, like a honeyclient. (See The Sting for a recent story.) Vary your defensive posture. Measure how long it takes for your standard build to be compromised by in-the-wild Web sites, spam, and other communications with the outside world.

  2. Standard client build server-side survival test. Create multiple sacrificial systems with your standard build. Deploy them as a honeynet. Vary your defensive posture. Measure how long it takes for your standard build to be compromised by malicious external traffic from the outside world -- or better yet -- from your internal network.

  3. Standard client build client-side penetration test. Create multiple sacrificial systems with your standard build. Conduct my recommended penetration testing activities and time the results.

  4. Standard client build server-side penetration test. Repeat number 3 with a server-side flavor.

  5. Standard server build server-side penetration test. Repeat number 3 against your server build with a server-side flavor. I hope you don't have users operating servers as if they were clients (i.e., browsing the Web, reading email, and so forth.) If you do, repeat this step and do a client-side pen test too.

  6. Deploy low-interaction honeynets and sinkhole routers in your internal network. These low-interaction systems provide a means to get some indications of what might be happening inside your network. If you think deploying these on the external network might reveal indications of targeted attacks, try that. (I doubt it will be that useful due to the overall attack noise, but who knows?)

  7. Conduct automated, sampled client host integrity assessments. Select a statistically valid subset of your clients and check them using multiple automated tools (malware/rootkit/etc. checkers) for indications of compromise.

  8. Conduct automated, sampled server host integrity assessments. Self-explanatory.

  9. Conduct manual, sampled client host integrity assessments. These are deep-dives of individual systems. You can think of it as an incident response where you have not had indication of an incident yet. Remote IR tools can be helpful here. If you are really hard-core and you have the time, resources, and cooperation, do offline analysis of the hard drive.

  10. Conduct manual, sampled server host integrity assessments. Self-explanatory.

  11. Conduct automated, sampled network host activity assessments. I questioned adding this step here, since you should probably always be doing this. Sometimes it can be difficult to find the time to review the results, however automated the data collection. The idea is to let your NSM system see if any of the traffic it sees is out of the ordinary based on algorithms you provide.

  12. Conduct manual, sampled network host activity assessments. This method is more likely to produce results. Here a skilled analyst performs deep individual analysis of traffic on a sample of machines (client and server, separately) to see if any indications of compromise appear.
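
For the sampled assessments in items 7 through 10, the "statistically valid subset" can be sized with the standard formula for estimating a proportion. The sketch below is my illustration, not part of the original recommendation; the population size, confidence level, and margin of error are assumptions you would replace with your own.

```python
import math

def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """How many hosts to assess to estimate a proportion (e.g., the
    fraction of compromised clients) within +/- margin at the given
    confidence level (z=1.96 for 95%), with a finite-population
    correction. p=0.5 is the conservative worst case."""
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)  # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)         # finite-population correction
    return math.ceil(n)

# e.g., for 5,000 clients, a +/-5% estimate at 95% confidence:
print(sample_size(5000))
```

Note how slowly the sample grows with the population: a 100,000-host enterprise needs only a few dozen more assessments than a 5,000-host one, which is what makes sampled deep-dives affordable.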


In all of these cases, trend your measurements over time to see whether altering an input yields an improvement. I know some of you might complain that you can't expect consistent output when the threat landscape is constantly changing. I really don't care, and neither does your CEO or manager!
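
To make the trending concrete, here is a minimal sketch; the survival times are invented for illustration, standing in for honeyclient results gathered before and after changing one input (say, tightening the standard build).

```python
from statistics import median

# Hypothetical hours-until-compromise from sacrificial honeyclient runs,
# recorded before and after altering a single defensive input.
baseline = [6.5, 9.0, 4.2, 11.3, 7.7]
after_change = [14.0, 22.5, 9.8, 31.0, 18.2]

# Median resists the occasional outlier run better than the mean does.
print(f"median survival before: {median(baseline):.1f} h")
print(f"median survival after:  {median(after_change):.1f} h")
```

If the median survival time moves the right way across several measurement cycles, you have a fact, not a belief, about whether the input mattered.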

I offer two recommendations:

  • Remember Andy Jaquith's criteria for good metrics, simplified here.


    1. Measure consistently.

    2. Make them cheap to measure. (Sorry Andy, my manual tests violate this!)

    3. Use compound metrics.

    4. Be actionable.


  • Don't slip into thinking of inputs. Don't measure how many hosts are running anti-virus. We want to measure outputs. We are not proposing new controls.


Controls are not the solution to our problem. Controls are the problem. They divert too much time, resources, and attention from endeavors which do make a difference. If the indications I am receiving from readers and friends are true, the ideas in this post are gaining traction. Do you have other ideas?

13 comments:

Anonymous said...

Hi Richard

Great post, but I would like to differ from the view that “controls are not the solution”. Controls are part of the solution, but not the only solution. I am of the opinion that field assessments, when performed accurately and in a timely manner, will to a great extent determine control effectiveness. The outputs you refer to, if I understand your argument, I also consider an indication of whether the implemented “controls” provide a positive or negative measure of security when an assessment is carried out, rather than relying on the fact that the controls are in place as required by numerous standards.

I agree with you that a “control-compliant” focus is management by belief, as there are no means to determine whether these so-called controls are effective.

WM

Anonymous said...

Presumably Richard is not saying to abandon controls, but rather to test the h-ll out of them to see if they are realistically working, not just check boxes and go home. Then one can implement some controls selectively, or devise new ones that seem to reduce compromises, and tell one's bosses that something real has been achieved, no mean feat in this dreary business.

Frank

Lance Spitzner said...

I believe ISO 27001 may have a better approach, as it offers greater flexibility and breadth. Some concerns with the FISMA/NIST approach:

- First, NIST does not use a risk approach but an IMPACT approach. You do not start with a risk assessment of your environment; you start with an IMPACT assessment (FIPS-199 and FIPS-200).

- Second, my concern is that the NIST approach is totally control-focused. There is little discussion about processes, checks, learning, etc. These may be listed as controls, but they appear more of an afterthought.

- Third, to be honest, I find the NIST documentation confusing. The new SP800-39 document helps to give a better understanding of the overview, but once again NIST takes a very specific, targeted view of controls; it appears not to focus on the big picture. It's almost as if NIST does not trust the security professional and wants to hold their hand.

- One thing I do truly appreciate about NIST (and I've seen this mentioned around the world) is the technical SP800 documents -- not so much the control approach but the details on specific controls (such as IDS, forensics, incident response, etc.).

lance

Prav said...

I’m actually very much for that control. In environments where the vast majority of other security controls are very lax, anti-virus poses more benefit for me than potential harm.
In a civilian agency I can point to the controls and use them as justification to have anti-virus on Windows systems and *nix file servers.
With more resources at my disposal I would be interested in following the procedure you describe.

jw said...

I think a hybrid approach is necessary. Controls are needed in order to "filter out the noise," but relying on controls to do your job is a total joke... Richard's AV example is a perfect one. Every day another user gets owned even though they update their AV signatures every day. Implementing a security control is the first step; actually monitoring and reacting to the control's output is another issue altogether (and there is a cost associated with it). Ideally, all of your controls and the effort of monitoring them will be cheaper than the cost of your potential damage. However, measuring potential damage is nearly impossible; how do you measure things such as reputation? So, my answer is a three-pronged attack.

1. Selecting controls that are effective and that make sense.
2. Monitoring, detecting, and reacting to controls.
3. Testing the effectiveness of the controls with periodic (and perhaps random) field assessments / "survival, penetration, integrity tests".

Steven Andrés said...

Excellent post, Richard. As a consultant, all too often I find myself on an engagement where the customer just cares about having me "check the box" on his scorecard and cares very little (or not at all) about overall system security. I'm not just speaking in hyperbole -- on my most recent engagement I actually had a customer say -- in response to removing unnecessary TCP services from their Windows boxen -- well, if it isn't talked about in "x" regulation, we don't want you to fix it. [sigh]

Francois said...

Very good post. Regulatory compliance (e.g. antivirus controls) does not necessarily equal good security. Steven - if your customers are required to exercise prudence and due care, it could be argued that they would not be compliant by only following the bare minimum "letter of the law."

marklar said...

Testing the control is the important bit. If it's doing its job of reducing a risk and you can quantify it, it's a good control; if it's there because of policy and provides no practical benefit, it's a bad control and should be removed. Why pay the cost of the control (licensing, administration, one more thing to go wrong in a crisis) if it provides little to no benefit?

Aservire said...

I think that ignoring controls in favor of testing outputs just shifts the "completeness" discussion from "how complete are my controls" to "how complete are my tests." If you are not as clever as every high-powered hacker out there, what makes you think you can challenge every possible vulnerability? If you throw something out as a honeypot, you must deal with the variable of who tried to hack it. If the most clever didn't hack it, or suspected it was what it truly was (a fake system), then your output check is not effective. What's more, you have no way of knowing this, but must simply conclude, if it is not compromised, that its security is good enough. The thing I like about FISMA (and other control systems like it) is that it gives you a place to start. True, it is foolish to stop and think you are done after a FISMA C&A, but if you don't know that already you have no business in IT security.

- Aservire

dre said...

I think controls are great, but I also think testing is better. Everyone can implement a little of both, but as you spend more on testing, you're likely to spend less (percentage-wise) on controls (and probably more on incident response).

Let's examine the problem with anti-virus signatures. Dan Geer recently gave a talk, "A Quant Looks at the Future, Extrapolation via Trend Analysis". On slide 103, he lists three "Losing Propositions": content inspection, statistical anomaly detection, and signature finding.

If you haven't seen the Blackhat 2006 work from Matasano on Do Enterprise management agents dream of electric sheep? or Richard's follow-up post, then I suggest starting there. AV is "not cost-free", as Richard states.

So when we implement controls, they need to be good controls. To move to safety in automobiles as an example of software security assurance: we need to provide controls on vehicles and highways. Drivers' licenses are a form of a safety control as well. There is no doubt that testing (checking the oil and tires of your vehicle before a long road-trip) is absolutely necessary for proper safety. However, the little things you learn in driver's ed are probably more viable day-to-day - things like wearing your seat belt, checking the rear-view, using your signals, etc.

Anti-virus signatures are the equivalent of throwing a bottle at the ship to see if it breaks (to borrow your ship-launching example). I wonder how many people think that anti-virus is needed under Vista or Linux.

In my opinion, anti-virus isn't needed for Vista or Linux/grsecurity because of other OS application security features that can be utilized. I have compared ASLR to airbags before, but indeed this is a good analogy. I would rather run Windows XPSP2 (with /NoExecute=OptIn on an XD-bit enabled Intel chip) and use Firefox with DieHard and NoScript/LocalRodeo protections (with Java off and Flash not even installed) than to rely on an AV to save me from browser-based malware.

Can I put these on a more mature checklist (if ASLR then NOT AV else AV)? I sure can. Do I need to test to know that this way of doing things is more secure? Yes, but this should be optional - not required. The point is that the checklist is required. The first step when auditing (or assessing) a system, application, or software architecture is to do a secure design inspection. The first step when auditing (or assessing) a network is to perform a secure architecture inspection.

Many penetration/security testing companies call these design review and architecture review. Do they have checklists for these? Of course. How about secure code review? Yes, even MITRE CWE is a checklist of sorts.

Your twelve-step program listed above is also an example of a checklist that can be easily turned into audit controls. I think the point you are trying to make is that most audit standards and certification criteria do not include these types of testing, with the obvious "output" reporting to measure/trend the results. However, I don't see why they can't include this.

I've been working on a certification framework for web applications. Many of the controls in this document are based on an assurance level that begins with secure design inspection and moves higher by adding automated penetration testing or automated code review, further higher by performing manual penetration tests, and finally the highest level is by performing manual code review. Some controls are not based on testing or code review, but may have certain conditions that require to be met (e.g. FIPS 140-2), while others such as security monitoring/logging can only be performed with design inspection or code review.

Some systems, such as Common Criteria or A1-level TCSEC (Orange Book), also begin with design inspection. However, in these cases, formal system top-level specifications are written using formal logic along with a formal security policy model. In the case of A1 systems, this formal specification is also verified using formal methods. No testing of the system has been done -- yet the Orange Book offers a higher level of assurance.

The Orange Book also specifies functionality (think back to ASLR), such as protections against object reuse, covert channels, and access control bypass. The concepts of a secure kernel, trusted path, and TCB are presented as "functionality" also required.

Going back to automobiles, the same is often true. We have functionality (seat-belts, airbags, harnesses, helmets) - some of which are optional; some required. We also have assurance (NHTSA's NCAP program, which provides a five-star safety rating system based on the assumed injuries of crash test dummies during impact).

Functionality can be day-to-day verified with checklists (for the masses). AV has been a classic example of something provided on a checklist to a new user when they purchase a new computer. It's been the Occam's razor for application security, especially for a newbie system administrator or tech support professional. This will change in time, as anyone with advanced knowledge of security probably already knows. Most security professionals today are focused on data leakage prevention (DLP), which turns a focus away from systems/networks as assets to instead see data as assets.

Assurance is something that has to be built into software. Formal specification and verification work well to this end, but many developers are stuck with what their framework provides them. This is where penetration testing became an important focus area in the information security management chain. However, I dislike penetration testing and feel that it should be replaced by developer-testing early in the life cycle, which would improve assurance even more so. While I prefer formal methods/logic, there are certainly ways of increasing assurance to software (or data) that rely on both informal checklists and informal testing.

Visibility, honeypots, honeynets, sinkholes, rootkit/malware checkers, and incident response tools are also levels of functionality, not assurance. These are functionality just like firewalls and Enterprise AV agents. They are used by information security management to defend assets such as systems, networks, and data. Combined with measurements and trending (as your twelve steps explain), they can provide a high level of assurance to the process, but not the technology. Bruce Potter has a "Pyramid of IT Security Needs" on slide 7 in his 2005 LayerOne presentation, Security in the Development Process, which is a take on Maslow's hierarchy of needs (which reminds me of Mike Murray's presentation on Building a Sustainable Security Career). The hierarchy puts honeypots at the top, with IDS below, and other common IT security functionality even lower.

I'm not sure that I agree with Bruce Potter that Honeypots and IDS are more complicated and cost more than firewalls, anti-virus, and patch management. There have been times when running Argus on a FreeBSD machine attached to a SPAN port connected on the inside of a firewall has provided me with several IP/MAC addresses of machines that were clearly running worms or viruses of some kind. All of this data made available with very little effort using visibility tools, and probably more accurate than an Enterprise AV system. However, this doesn't provide assurance or strong security.

I almost see this blog entry's focus on AV as a "straw man argument" against controls. Why can't controls take into account both functionality and assurance? Why can't controls provide output from both security testing and IT security architecture functionality measurements?

My current idea for providing software security assurance for the masses is to implement a five-star rating system that focuses on the percentage of CWEs covered in code by software security testing. The testing must be implemented by at least three external companies that perform extensive review based on industry-standard criteria. The assurance criteria will need to review critical components by using code coverage and cyclomatic complexity metrics (e.g. Crap4J) and perform manual code review along with all types of CVE-Compatible automated tools/services (e.g. binary analysis, fault-injection, fuzz testing, and static code analysis). I see correlations between the FSN, NCAP, and the ESRB and this sort of process.

I don't want to go as far as you did and say that all of the following is useless: ISM functionality (e.g. firewalls/UTM, extrusion detection/prevention, visibility, NSM, IPS devices, honeynets, etc), OS exploitation countermeasure functionality (i.e. auto-update, ASLR, software firewalls), and third-party OS/data exploitation countermeasures (AV agents, software firewalls, DLP agents, patch-management agents, HIPS, NAC "endpoint" agents, et al). I usually don't have a lot of nice things to say about the above, but the truth is that they might be half of the answer to the problem, especially if properly designed and implemented (and shown by using measurements to improve "your situation").

The real solution to our problem is to acquire software security assurance metrics (which could be implemented by using controls), such as the one I proposed. Both functionality and assurance are needed, but assurance is often/always left out of the picture. You may think that using a seat-belt (i.e. AV agent) may save your life, but the reality is that car manufacturers and traffic highway administrators/implementors (as well as paved streets, road-signs, traffic lights, et al) have done quite a lot to improve your safe driving experience.

Anonymous said...

I see too many people here defending the controls, and I understand that; controls are important. But really the problem is not the controls, it is the surrounding environment of "the controls are everything," and the belief that if the controls are good, more is better. And even more, the problem is the changing of the controls from a checklist that helps you understand where you are (within the context of other metrics) to an environment where the controls are the scorecard. As a scorecard, controls stink. Worse is the attitude that if controls are good, more controls are better, thus SP 800-26 to SP 800-53 to SP 800-53A. In each case the controls are mo-better. I look at the SSPs in my workplace, and the old ones under SP 800-26 are generally OK. The new improved ones, using a new improved automated system tracking SP 800-53 compliance, are not worth the paper they are printed on (fortunately they aren't printed but kept electronically, or it really would be fraud, waste, and abuse). But they are bigger, and better, so much better that they have transcended the requirement to improve security. They are in and of themselves a good thing. And because of them, we are secure. Remember the Department of Veterans Affairs had a satisfactory computer security program in 2005. They fared less well in 2006, but you can't figure out why from the FISMA report. Every response is "Yes" or "Almost Always". It really looks like they are doing OK, except for the minor problem of millions of social security numbers lost. Obviously we need a new control, SSN-01: Agency assigns personnel to protect social security numbers and scapegoats to fire when said social security numbers are lost. Control Enhancement 1: Agency scapegoat is in grade of GS-15 or above but not the Chief Information Officer (High systems only).

rybolov said...

Hi All

I'm literally months later in posting this.

A catalog of controls is a summary of all the laws and directives that you have, along with a typically genericized set of best practices.

The catalog of controls (and compliance in general) is not designed to be a catch-all, so people need to quit treating it like it is. At BEST, it's a 75% solution: much better than 0%, but nowhere near where it needs to be.