Controls Are Not the Solution to Our Problem
If you recognize the inspiration for this post title and graphic, you'll understand my ultimate goal. If not, let me start by saying this post is an expansion of ideas presented in a previous post with the succinct and catchy title Control-Compliant vs Field-Assessed Security.
In brief, too many organizations, regulators, and government agencies waste precious time and resources devising and auditing "controls," regardless of the effect these controls have or do not have on security. They are far too input-centric; they should become more output-aware. They obsess over recording conditions they believe may be helpful while remaining ignorant of the "score of the game." They practice management by belief and disregard management by fact.
Let me provide a few examples from one of the canonical texts used by the control-compliant crowd: NIST Special Publication 800-53: Recommended Security Controls for Federal Information Systems (.pdf). The following is an example of a control, taken from page 140.
SI-3 MALICIOUS CODE PROTECTION
Control: The information system implements malicious code protection.
Supplemental Guidance: The organization employs malicious code protection mechanisms at critical information system entry and exit points (e.g., firewalls, electronic mail servers, web servers, proxy servers, remote-access servers) and at workstations, servers, or mobile computing devices on the network. The organization uses the malicious code protection mechanisms to detect and eradicate malicious code (e.g., viruses, worms, Trojan horses, spyware) transported: (i) by electronic mail, electronic mail attachments, Internet accesses, removable media (e.g., USB devices, diskettes or compact disks), or other common means; or (ii) by exploiting information system vulnerabilities. The organization updates malicious code protection mechanisms (including the latest virus definitions) whenever new releases are available in accordance with organizational configuration management policy and procedures. The organization considers using malicious code protection software products from multiple vendors (e.g., using one vendor for boundary devices and servers and another vendor for workstations). The organization also considers the receipt of false positives during malicious code detection and eradication and the resulting potential impact on the availability of the information system. NIST Special Publication 800-83 provides guidance on implementing malicious code protection.
Control Enhancements:
(1) The organization centrally manages malicious code protection mechanisms.
(2) The information system automatically updates malicious code protection mechanisms.
At first read, one might reasonably respond, "What's wrong with that? This control advocates implementing anti-virus and related anti-malware software." Think more carefully about the issue, however, and several problems appear.
- Adding anti-virus products can introduce vulnerabilities that the systems would not otherwise expose. See my post Example of Security Product Introducing Vulnerabilities if you need examples. In short: add anti-virus, be compromised.
- Achieving compliance may cost more than potential damage. How many times have you heard a Unix administrator complain that he/she has to purchase an anti-virus product for his/her Unix server simply to be compliant with a control like this? The potential for a Unix server (not Mac OS X) to be damaged by a user opening an email through a client while logged on to the server (a very popular exploitation vector on a Windows XP box) is practically nil.
- Does this actually work? This is the question that no one asks. Does it really matter if your system is running anti-virus software? Did you know that intruders (especially the high-end ones most likely to selectively, stealthily target the very .gov and .mil systems required to be compliant with this control) test their malware against a battery of anti-virus products to ensure their code wins? Are weekly updates superior to daily updates? Daily to hourly? (See the detection-rate sketch after this list.)
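To turn "does this actually work?" into something you can measure, here is a minimal sketch, assuming a hypothetical corpus of known-malicious samples and a scan log listing the file paths your anti-virus product flagged. The file names and layout are my own inventions for illustration, not any vendor's format.

```python
from pathlib import Path

def detection_rate(scan_log: Path, corpus_dir: Path) -> float:
    """Fraction of corpus samples the anti-virus product flagged.

    Assumes scan_log holds one detected file path per line and corpus_dir
    holds the known-malicious samples used for the test.
    """
    detected = {line.strip() for line in scan_log.read_text().splitlines() if line.strip()}
    samples = [p for p in corpus_dir.iterdir() if p.is_file()]
    if not samples:
        raise ValueError("empty corpus")
    hits = sum(1 for p in samples if str(p) in detected)
    return hits / len(samples)

# Hypothetical comparison of update frequencies against the same corpus:
# rate_daily  = detection_rate(Path("scan_daily.log"), Path("corpus/"))
# rate_weekly = detection_rate(Path("scan_weekly.log"), Path("corpus/"))
```

Run the same corpus through different update schedules (or different vendors) and the output, not the control, tells you whether the input mattered.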
The purpose of this post is to tentatively propose an alternative approach. I called this "field-assessed" in contrast to "control-compliant." Some people prefer the term "results-based." Whatever you call it, the idea is to direct attention away from inputs and devote more energy to outputs. As for mandating inputs (like requiring every device to run anti-virus), I say that is a waste of time and resources.
I recommend taking measurements to determine your enterprise "score of the game," and use that information to decide what you need to do differently. I'm not suggesting abandoning efforts to prevent intrusions (i.e., "inputs.") Rather, don't think your security responsibilities end when the bottle is broken against the bow of the ship and it slides into the sea. You've got to keep watching to see if it sinks, if pirates attack, how the lifeboats handle rough seas, and so forth.
Here are a few ideas:
- Standard client build client-side survival test. Create multiple sacrificial systems with your standard build. Deploy a client-side testing solution on them, like a honeyclient. (See The Sting for a recent story.) Vary your defensive posture. Measure how long it takes for your standard build to be compromised by in-the-wild Web sites, spam, and other communications with the outside world.
- Standard client build server-side survival test. Create multiple sacrificial systems with your standard build. Deploy them as a honeynet. Vary your defensive posture. Measure how long it takes for your standard build to be compromised by malicious external traffic from the outside world -- or better yet -- from your internal network.
- Standard client build client-side penetration test. Create multiple sacrificial systems with your standard build. Conduct my recommended penetration testing activities and time the results.
- Standard client build server-side penetration test. Repeat the client-side penetration test with a server-side flavor.
- Standard server build server-side penetration test. Repeat the penetration test against your server build with a server-side flavor. I hope you don't have users operating servers as if they were clients (i.e., browsing the Web, reading email, and so forth). If you do, repeat this step and do a client-side pen test too.
- Deploy low-interaction honeynets and sinkhole routers in your internal network. These low-interaction systems provide a means to get some indication of what might be happening inside your network. If you think deploying these on the external network might reveal indications of targeted attacks, try that. (I doubt it will be that useful due to the overall attack noise, but who knows?)
- Conduct automated, sampled client host integrity assessments. Select a statistically valid subset of your clients and check them using multiple automated tools (malware/rootkit/etc. checkers) for indications of compromise. (See the sampling sketch after this list.)
- Conduct automated, sampled server host integrity assessments. Self-explanatory.
- Conduct manual, sampled client host integrity assessments. These are deep dives of individual systems. You can think of it as an incident response where you have not yet had an indication of an incident. Remote IR tools can be helpful here. If you are really hard-core and you have the time, resources, and cooperation, do offline analysis of the hard drive.
- Conduct manual, sampled server host integrity assessments. Self-explanatory.
- Conduct automated, sampled network host activity assessments. I questioned adding this step here, since you should probably always be doing this. Sometimes it can be difficult to find the time to review the results, however automated the data collection. The idea is to let your NSM system flag any traffic that is out of the ordinary, based on algorithms you provide.
- Conduct manual, sampled network host activity assessments. This method is more likely to produce results. Here a skilled analyst performs deep individual analysis of traffic on a sample of machines (client and server, separately) to see if any indications of compromise appear.
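For the sampled host integrity assessments, here is a minimal sketch of choosing a "statistically valid subset" of clients. It uses Cochran's sample-size formula with a finite-population correction; the confidence level and margin of error are placeholders you would pick yourself.

```python
import math
import random

def sample_size(population: int, z: float = 1.96, margin: float = 0.05, p: float = 0.5) -> int:
    """Cochran's formula with finite-population correction.

    z=1.96 corresponds to 95% confidence; p=0.5 is the most conservative
    assumption about the true compromise rate.
    """
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    return math.ceil(n0 / (1 + (n0 - 1) / population))

def pick_hosts(inventory: list[str], **kwargs) -> list[str]:
    """Randomly select the hosts to assess this cycle."""
    k = min(sample_size(len(inventory), **kwargs), len(inventory))
    return random.sample(inventory, k)

# Example with a hypothetical 5,000-client inventory:
# hosts = pick_hosts([f"client-{i}" for i in range(5000)])
# len(hosts)  # roughly 357 at 95% confidence and a 5% margin of error
```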
In all of these cases, trend your measurements over time to see whether the outputs improve when you alter an input. I know some of you might complain that you can't expect consistent output when the threat landscape is constantly changing. I really don't care, and neither does your CEO or manager!
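One minimal way to trend such an output, assuming you log one record per sacrificial system with its deployment date and hours until compromise (the CSV columns are invented for illustration):

```python
import csv
from collections import defaultdict
from statistics import median

def monthly_survival_trend(path: str) -> dict[str, float]:
    """Median hours-to-compromise per month from a CSV with columns
    'deployed' (YYYY-MM-DD) and 'hours_to_compromise'."""
    buckets = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            month = row["deployed"][:7]  # e.g. "2007-06"
            buckets[month].append(float(row["hours_to_compromise"]))
    return {month: median(values) for month, values in sorted(buckets.items())}

# If the median climbs after you change an input (say, a new standard build),
# you have evidence the change mattered; if it stays flat, you saved an argument.
# print(monthly_survival_trend("survival_tests.csv"))
```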
I offer two recommendations:
- Remember Andy Jaquith's criteria for good metrics, simplified here:
  - Measure consistently.
  - Make them cheap to measure. (Sorry Andy, my manual tests violate this!)
  - Use compound metrics.
  - Be actionable.
- Don't slip into thinking of inputs. Don't measure how many hosts are running anti-virus. We want to measure outputs. We are not proposing new controls.
Controls are not the solution to our problem. Controls are the problem. They divert too much time, resources, and attention from endeavors which do make a difference. If the indications I am receiving from readers and friends are true, the ideas in this post are gaining traction. Do you have other ideas?
Comments
Great post, but I would like to differ from the view that “controls are not the solution.” Controls are part of the solution, just not the only solution. I am of the opinion that field assessments, when performed accurately and in a timely manner, will to a great extent determine control effectiveness. The outputs you refer to, if I understand your argument, I also consider an indication of whether the implemented “controls” improve or degrade security when an assessment is carried out, rather than relying on the fact that the controls are in place as required by numerous standards.
I agree with you that a “control-compliant” focus is management by belief, as there are no means to determine whether these so-called controls are effective.
WM
Frank
- First, NIST does not use a risk approach but an IMPACT approach. You do not start with a risk assessment of your environment; you start with an IMPACT assessment (FIPS-199 and FIPS-200).
- Second, my concern with the NIST approach is that it is totally control-focused. There is little discussion of processes, checks, learning, etc. These may be listed as controls, but they appear to be more of an afterthought.
- Third, to be honest I find the NIST documentation confusing. The new SP800-39 document helps to give a better understanding of the overview, but once again NIST takes a very specific, targeted view of controls; it appears not to focus on the big picture. It's almost as if NIST does not trust the security professional and wants to hold their hand.
- One thing I do truly appreciate about NIST (and I've seen this mentioned around the world) is the technical SP800 documents, not so much for the control approach as for the details on specific controls (such as IDS, forensics, incident response, etc.).
lance
In a civilian agency I can point to the controls and use that as justification to have anti-virus on Windows systems and *nix file servers.
With more resources at my disposal I would be interested in following the procedure you describe.
1. Selecting controls that are effective and that make sense.
2. Monitoring, detecting, and reacting to controls.
3. Testing the effectiveness of the controls with periodic (and perhaps random) field assessments / "survival, penetration, integrity tests".
- Aservire
Let's examine the problem with anti-virus signatures. Dan Geer recently gave a talk, "A Quant Looks at the Future, Extrapolation via Trend Analysis". On slide 103, he lists three "Losing Propositions": content inspection, statistical anomaly detection, and signature finding.
If you haven't seen the Blackhat 2006 work from Matasano on Do Enterprise management agents dream of electric sheep? or Richard's follow up post, then I suggest starting there. AV is "not cost-free", as Richard states.
So when we implement controls, they need to be good controls. To use safety in automobiles as an analogy for software security assurance: we need to provide controls on vehicles and highways. Drivers' licenses are a form of a safety control as well. There is no doubt that testing (checking the oil and tires of your vehicle before a long road trip) is absolutely necessary for proper safety. However, the little things you learn in driver's ed are probably more viable day-to-day: things like wearing your seat belt, checking the rear-view, using your signals, etc.
Anti-virus signatures are the equivalent of throwing a bottle at the ship to see if it breaks (to quote your vehicle safety example). I wonder how many people think that anti-virus is needed under Vista or Linux?
In my opinion, anti-virus isn't needed for Vista or Linux/grsecurity because of other OS/application security features that can be utilized. I have compared ASLR to airbags before, but indeed this is a good analogy. I would rather run Windows XP SP2 (with /NoExecute=OptIn on an XD-bit enabled Intel chip) and use Firefox with DieHard and NoScript/LocalRodeo protections (with Java off and Flash not even installed) than rely on an AV to save me from browser-based malware.
Can I put these on a more mature checklist (if ASLR then NOT AV else AV)? I sure can. Do I need to test to know that this way of doing things is more secure? Yes, but this should be optional - not required. The point is that the checklist is required. The first step when auditing (or assessing) a system, application, or software architecture is to do a secure design inspection. The first step when auditing (or assessing) a network is to perform a secure architecture inspection.
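As a toy sketch of the conditional checklist the commenter describes (the host attributes and the rule itself are illustrative assumptions, not a recommendation):

```python
def required_countermeasures(host: dict) -> list[str]:
    """Toy rule: if the platform provides ASLR and NX/DEP, skip the AV agent;
    otherwise require it. Purely illustrative."""
    needed = []
    if not (host.get("aslr") and host.get("nx")):
        needed.append("anti-virus agent")
    if host.get("browses_web"):
        needed.append("script-blocking browser extension")
    return needed

# required_countermeasures({"aslr": True, "nx": True, "browses_web": True})
# -> ['script-blocking browser extension']
```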
Many penetration/security testing companies call these design review and architecture review. Do they have checklists for these? Of course. How about secure code review? Yes, even MITRE CWE is a checklist of sorts.
Your twelve-step program listed above is also an example of a checklist that can be easily turned into audit controls. I think the point you are trying to make is that most audit standards and certification criteria do not include these types of testing with the obvious "output" reporting to measure/trend the results. However, I don't see why they can't include this.
I've been working on a certification framework for web applications. Many of the controls in this document are based on an assurance level that begins with secure design inspection and moves higher by adding automated penetration testing or automated code review, higher still by performing manual penetration tests, and finally reaches the highest level with manual code review. Some controls are not based on testing or code review, but may have certain conditions that must be met (e.g. FIPS 140-2), while others, such as security monitoring/logging, can only be assessed with design inspection or code review.
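A rough sketch of that ladder expressed as data; the level names paraphrase the comment and do not come from any published framework:

```python
from enum import IntEnum

class AssuranceLevel(IntEnum):
    """Ordered as described: each level adds activities to the ones below it."""
    DESIGN_INSPECTION = 1    # secure design inspection only
    AUTOMATED_TESTING = 2    # plus automated penetration testing or automated code review
    MANUAL_PENTEST = 3       # plus manual penetration testing
    MANUAL_CODE_REVIEW = 4   # plus manual code review

def meets(achieved: AssuranceLevel, required: AssuranceLevel) -> bool:
    """True if the achieved level satisfies the required level."""
    return achieved >= required

# meets(AssuranceLevel.MANUAL_PENTEST, AssuranceLevel.AUTOMATED_TESTING)  # True
```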
Some systems, such as Common Criteria or A1-level TCSEC (Orange Book), also begin with design inspection. However, in these cases, formal system top-level specifications are written using formal logic along with a formal security policy model. In the case of A1 systems, this formal specification is also verified using formal methods. No testing of the system has been done, yet the Orange Book offers a higher level of assurance.
The Orange Book also specifies functionality (think back to ASLR), such as protections against object reuse, covert channels, and access control bypass. The concepts of a secure kernel, trusted path, and TCB are presented as "functionality" also required.
Going back to automobiles, the same is often true. We have functionality (seat-belts, airbags, harnesses, helmets) - some of which are optional; some required. We also have assurance (NHTSA's NCAP program, which provides a five-star safety rating system based on the assumed injuries of crash test dummies during impact).
Functionality can be day-to-day verified with checklists (for the masses). AV has been a classic example of something provided on a checklist to a new user when they purchase a new computer. It's been the Occam's razor for application security, especially for a newbie system administrator or tech support professional. This will change in time, as anyone with advanced knowledge of security probably already knows. Most security professionals today are focused on data leakage prevention (DLP), which turns a focus away from systems/networks as assets to instead see data as assets.
Assurance is something that has to be built into software. Formal specification and verification work well to this end, but many developers are stuck with what their framework provides them. This is where penetration testing became an important focus area in the information security management chain. However, I dislike penetration testing and feel that it should be replaced by developer testing early in the life cycle, which would improve assurance even more so. While I prefer formal methods/logic, there are certainly ways of increasing assurance for software (or data) that rely on both informal checklists and informal testing.
Visibility, honeypots, honeynets, sinkholes, rootkit/malware checkers, and incident response tools are also levels of functionality, not assurance. These are functionality just like firewalls and Enterprise AV agents. They are used by information security management to defend assets such as systems, networks, and data. Combined with measurements and trending (as your twelve steps explain), they can provide a high level of assurance to the process, but not the technology. Bruce Potter has a "Pyramid of IT Security Needs" on slide 7 of his 2005 LayerOne presentation, Security in the Development Process, which is a take on Maslow's hierarchy of needs (which reminds me of Mike Murray's presentation on Building a Sustainable Security Career). The hierarchy puts honeypots at the top, with IDS below, and other common IT security functionality even lower.
I'm not sure that I agree with Bruce Potter that honeypots and IDS are more complicated and cost more than firewalls, anti-virus, and patch management. There have been times when running Argus on a FreeBSD machine attached to a SPAN port connected on the inside of a firewall has provided me with several IP/MAC addresses of machines that were clearly running worms or viruses of some kind. All of this data was made available with very little effort using visibility tools, and was probably more accurate than an Enterprise AV system. However, this doesn't provide assurance or strong security.
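I do not have the commenter's Argus setup, but a minimal sketch of the same idea, assuming flow records have already been exported to CSV with src, dst, and dport columns (the column names and the fan-out threshold are assumptions of mine):

```python
import csv
from collections import defaultdict

def noisy_internal_hosts(flow_csv: str, fanout_threshold: int = 200) -> dict[str, int]:
    """Source addresses contacting an unusually large number of distinct
    destination/port pairs, a crude worm or scanning indicator."""
    fanout = defaultdict(set)
    with open(flow_csv, newline="") as f:
        for row in csv.DictReader(f):
            fanout[row["src"]].add((row["dst"], row["dport"]))
    return {src: len(peers) for src, peers in fanout.items() if len(peers) >= fanout_threshold}

# print(noisy_internal_hosts("flows.csv"))
```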
I almost see this blog entry's focus on AV as a "straw man" argument against controls. Why can't controls take into account both functionality and assurance? Why can't controls provide output from both security testing and IT security architecture functionality measurements?
My current idea for providing software security assurance for the masses is to implement a five-star rating system that focuses on the percentage of CWE's covered in code by software security testing. The testing must be implemented by at least three external companies that perform extensive review based on industry-standard criteria. The assurance criteria will need to review critical components by using code coverage and cyclomatic complexity metrics (e.g. Crap4J) and perform manual code review along with all types of CVE-Compatible automated tools/services (e.g. binary analysis, fault-injection, fuzz testing, and static code analysis). I see correlations between the FSN, NCAP, and the ESRB and this sort of process.
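As a toy illustration of that five-star idea (the coverage thresholds below are mine, chosen only to make the example concrete):

```python
def star_rating(cwes_tested: int, cwes_applicable: int) -> int:
    """Map the percentage of applicable CWEs covered by testing to a 1-5 star score."""
    if cwes_applicable <= 0:
        raise ValueError("no applicable CWEs")
    pct = 100.0 * cwes_tested / cwes_applicable
    for cutoff, stars in [(90, 5), (75, 4), (50, 3), (25, 2)]:
        if pct >= cutoff:
            return stars
    return 1

# star_rating(42, 60)  # 70% coverage -> 3 stars under these example thresholds
```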
I don't want to go as far as you did and say that all of the following is useless: ISM functionality (e.g., firewalls/UTM, extrusion detection/prevention, visibility, NSM, IPS devices, honeynets, etc.), OS exploitation countermeasure functionality (i.e., auto-update, ASLR, software firewalls), and third-party OS/data exploitation countermeasures (AV agents, software firewalls, DLP agents, patch-management agents, HIPS, NAC "endpoint" agents, et al). I usually don't have a lot of nice things to say about the above, but the truth is that they might be half of the answer to the problem, especially if properly designed and implemented (and shown, by using measurements, to improve "your situation").
The real solution to our problem is to acquire software security assurance metrics (which could be implemented by using controls), such as the one I proposed. Both functionality and assurance are needed, but assurance is often/always left out of the picture. You may think that using a seat-belt (i.e. AV agent) may save your life, but the reality is that car manufacturers and traffic highway administrators/implementors (as well as paved streets, road-signs, traffic lights, et al) have done quite a lot to improve your safe driving experience.
I'm literally months later in posting this.
A catalog of controls is a summary of all the laws and directives that you have, along with a typically genericized set of best practices.
The catalog of controls (and compliance in general) is not designed to be a catch-all, so people need to quit treating it like it is. At BEST, it's a 75% solution: much better than 0%, but nowhere near where it needs to be.