"Protect the Data" Idiot!
The 28 September 2009 issue of InformationWeek cited a comment posted to one of their forums. I'd like to cite an excerpt from that comment.
[W]e tend to forget the data is the most critical asset. yet we spend inordinate time and resources trying to protect the infrastructure, the perimeter... the servers etc. I believe and [sic] information-centric security approach of protecting the data itself is the only logical approach to keep it secure at rest, in motion and in use. (emphasis added)
I hear this "protect the data" argument all the time. I think it is one of the most misinformed comments that one can make. I think of Chris Farley smacking his head saying "IDIOT!" when I hear "protect the data."
"Oh right, that's what we should have been doing for the last 10, 20, 30 years -- protect the data! I feel so stupid to have not done that! IDIOT!"
"Protect the data" represents a nearly fatal understanding of information security. I'm tired of hearing it, so I'm going to dismantle the idea in this post.
Now that I've surely offended someone, here are my thoughts.
Someone show me "data." What is "data" anyway? Let's assume it takes electronic form, which is the focus of digital security measures. This is the first critical point:
Digital data does not exist independently of a container.
Think of the many containers which hold data. Imagine looking at a simple text file retrieved from a network share via NFS and viewed with a text editor.
- Data exists as an image rendered on a screen attached to the NFS client.
- Data exists as a temporary file on the hard drive of the NFS client, and as a file on the hard drive of the NFS server.
- Data exists in memory on the NFS client, and in memory on the NFS server.
- The NFS client and server are computers sitting in facilities.
- Network infrastructure carries data between the NFS client and server.
- Data exists as network traffic exchanged between the NFS client and server.
- If the user prints the file, it is now contained on paper (in addition to involving a printer with its own memory, hard drive, etc.)
- The electromagnetic spectrum is a container for data as it is transmitted by the screen, carried by network cables and/or wireless media, and so on.
That's eight unique categories of data containers. Some smart blog reader can probably contribute two others to round out the list at ten!
So where exactly do we "protect the data"? "In motion/transit, and at rest" are the popular answers. Good luck with that. Seriously. This leads to my second critical point:
If an authorized user can access data, so can an unauthorized user.
Think about it. Any possible countermeasure you can imagine can be defeated by a sufficiently motivated and resourced adversary. One example: "Solution:" Encrypt everything! Attack: Great, wait until an authorized user views a sensitive document, and then screen-scrape every page using the malware installed last week.
If you doubt me, consider the "final solution" that defeats any security mechanism:
Become an authorized user, e.g., plant a mole/spy/agent. If you think you can limit what he or she can remove from a "secure" site, plant an agent with a photographic memory. This is an extreme example but the point is that there is no "IDIOT" solution out there.
I can make rational arguments for a variety of security approaches, from defending the network, to defending the platform, to defending the operating system, to defending the application, and so on. At the end of the day, don't think that wrapping a document in some kind of rights management system or crypto is where "security" should be heading. I don't disagree that adding another level of protection can be helpful, but it's not like intruders are going to react by saying "Shucks, we're beat! Time to find another job."
Intruders who encounter so-called "protect the data" approaches are going to break them like every other countermeasure deployed so far. It's just a question of how expensive it is for the intruder to do so. Attackers balance effort against "return" like any other rational actor, and they will likely find cheap ways to evade "protect the data" approaches.
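The cost argument above can be sketched as a toy model. The container names and cost figures here are invented purely for illustration and are not measurements. The only point is that a rational attacker takes the cheapest route, so raising the cost of one or two containers may not change which route that is.

```python
# Toy model: an attacker picks the cheapest route to the same data.
# All container names and cost figures are hypothetical.

def cheapest_path(costs):
    """Return the (container, cost) pair with the lowest attack cost."""
    return min(costs.items(), key=lambda item: item[1])

attack_costs = {
    "file on server disk": 40,
    "network traffic": 25,
    "rendered screen on client": 10,
    "printed copy": 60,
}

before = cheapest_path(attack_costs)

# "Protect the data": encrypt at rest and in transit,
# which raises the cost of attacking those two containers.
hardened = dict(attack_costs)
hardened["file on server disk"] = 400
hardened["network traffic"] = 250

after = cheapest_path(hardened)

print("cheapest before:", before)
print("cheapest after: ", after)
```

In this made-up scenario the cheapest route is the rendered screen both before and after the encryption is added, which is the screen-scraping case described earlier.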
Only when relying on human agents is the cheapest way to steal data, or when it's cheaper to research and develop one's own data, will digital security be able to declare "victory." I don't see that happening soon; no one in history has ever found a way to defeat crime, espionage, or any of the true names for the so-called "information security" challenges we face.
Comments
All of these examples just highlight how difficult it is to "protect the data". The bad guys go for weakest links as you've stated above. They are now targeting government and business bank accounts because those deposits aren't as protected as retail bank accounts are since banks aren't liable for the losses like they are for credit cards and retail accounts. If the banks' security is too tough, they target the customer systems or the transaction firm's systems. This is why "data loss prevention" is almost an oxymoron. You can do everything right, and still lose your data because a firm or person you entrusted it to didn't do their security right.
This interdependence due to our network infrastructure is all around us.
It's almost the same situation as the banking system. An individual bank may have been sound, but each bank is dependent upon its counterparties to pay back their loans. If a competitor fails like Lehman, the system starts failing when AIG can't make good on its counterparty insurance, so a cascade of failure ripples through the system. The same is true for data security. The data is only as secure as the weakest link in the chain, be that chain the company infrastructure, a company employee, or another business that data is entrusted with.
Four other containers:
9. Backup media sent to an offsite storage facility.
10. Transferring the data from the client to a free email account.
11. Transferring the data from the free email account to an unsecured home system.
12. Transferring the data to a corporate email, phone, or fax and sending that data to a vendor or business partner. If it goes by phone or fax, you aren't likely monitoring that traffic by law.
Seat belts, air bags, anti-lock brakes, traction control, crumple zones, guard rails, and the road surface are all designed to minimize risks to the valuable contents of the cars. None of it is designed to protect itself.
I think this is another false argument. Please explain to me how deploying a firewall means "protecting the infrastructure itself." Someone deploys a firewall to protect the firewall?
Regardless, what is the container of data? Well, it largely includes custom applications. These non-commoditized dynamic layers are a world apart from network security. We need to start working with developers to encourage them to build defensive controls within the app itself. And that crowd, the developer crowd, traditionally is not part of the infosec department's realm of responsibility. And this needs to change if we defenders want to actually win.
So, I tend to disagree with the author's article, since it implies that netsec and WAF-like thinking will win the game.
But if your web application coders are not doing input validation, output encoding, query parameterization, HTTPS, good authentication and solid activity-based access control, CSRF tokens, security-centric logging within the app, etc., you might be making data theft and manipulation a bit too easy for them. And even if your container is fully patched, you're screwed.
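One item on that list, query parameterization, is easy to show in a few lines. This is a minimal sketch using Python's standard sqlite3 module; the table, rows, and input string are made up for illustration and do not come from any real application.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

# Attacker-controlled input that closes the quote and adds an always-true clause.
malicious = "nobody' OR '1'='1"

# Unsafe: the input is concatenated into the SQL text and changes the query.
unsafe_sql = "SELECT secret FROM users WHERE name = '" + malicious + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: the input is passed as a bound parameter and treated only as a value.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (malicious,)
).fetchall()

print("concatenated query returned", len(leaked), "rows")
print("parameterized query returned", len(safe), "rows")
```

The concatenated query hands back every row in the table, while the parameterized one matches no user at all. A fully patched server underneath would not change that result.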
" All I'm saying is that fancy network gear, infrastructure and "container" protection is useless in the face of XSS, SQli, CSRF, and other OWASP Top Ten risks..."
I don't think you are understanding 'Container' the way Richard is using it--
"Container" protection for the OWASP Top Ten would be such things as you are saying-- (input validation, HTTPS, etc.)
-Josh
I wrote in my 2003 blog post The Dynamic Duo Discuss Digital Risk that
Security is the process of maintaining an acceptable level of perceived risk.
I repeated it in my Tao book published in 2004. So, we can agree on that as well!
The word "perceived" throws me off a little, but I think I'll agree to.. agree with you! :)
Compelling blog post, it definitely got my attention.
Looking forward to your next book.
- Jim
The end game in most scenarios is data. "They" want your data. So I think the statement that "protecting data is what matters" is actually accurate. However, most people that repeat that phrase don't actually understand why or look at the problem too narrowly.
The problem is you believe it's a one-or-the-other approach, when it's actually a tree or hierarchy. Protecting your data is the overarching goal, and the subcomponents of achieving that goal include:
Infrastructure Security
Application Security
Data Encryption
Transport Encryption
Access Controls
NSM
etc etc etc
Each one of these contributes to protecting your data. One is not better than the other. The NSM-centric approach is not better than the data-centric approach. It's been proved time and time again: even if you excel in one area of security, it's always the weakest link that gets ya. The best security is a comprehensive approach. However, each organization should consider its unique threats and focus on the security controls that reduce its risk the most. It's not a one-size-fits-all thing.
For _far_ too long, much of the discussion around InfoSec risk management has been dominated by the NSM crowd. Information Centricity is no cure-all, but if traditional container protections really worked, why are breach rates high and rising?
@Richard Bejtlich
I welcome this debate. Glad you started it. Are you aware of the numerous anecdotes where information centric approaches have made the crucial difference? You can't possibly be claiming that NSM alone will do the job, just like we are not claiming that ICRM alone will do it.
Having said that, there is *ample* evidence from hundreds of enterprises where ICRM (Information Centric Risk Management) is a necessary component of the defense.
The key claim of the ICRM crowd (me included): "Effective infosec risk management is not possible if you don't know where the concentrations of your most sensitive data reside or flow."
The landscape of possible exposure risks is so large, prioritization is mandatory. If you are prioritizing with ignorance of location and flow of sensitive data, how can you possibly be doing a good job of defense?
Most NSM practitioners have a very remote understanding of these crucial facts of the enterprises they attempt to defend. Closing that ignorance gap is a more reasonable interpretation of what is meant by "Protect the Data".
Kevin Rowney
Founder, Vontu (now part of Symantec)
Your post attacks the ICRM point-of-view and explicitly promotes classic countermeasures like defending the network, the platform, the o.s., and the apps.
The same thing I said about NSM practitioners (they have a limited understanding of where the most crucial data is and where it flows) is equally applicable to practitioners of these other countermeasures.
So ok we agree: you are not saying NSM is "the answer". Where we differ: your dismissal of the ICRM point-of-view.
Security practitioners need to get a grip of this fact: many enterprises now use information centric techniques as a primary means of prioritization of the management of risk.
Dismissing "Protect the Data" as mere ignorant sloganeering isn't doing justice to facts on the ground. Many teams with good network/o.s./app security protections are missing cases of breach on a regular basis.
Cases of breach that would've been nailed by Information Centric approaches.
Kevin Rowney
http://www.greebo.net/2009/10/12/protect-the-data-idiot-redux/
Chris Farley interviewing Paul McCartney:
Chris Farley: Um, hi. Welcome to The Chris Farley Show. I'm.. Chris Farley.. and, my guest tonight is.. one of the.. greatest musicians.. uh, rock musicians. I guess, songwriter, ever. [ Smacks himself ] GOD! That sounds stupid! God, I'm an idiot!
So I wrote:
I think of Chris Farley smacking his head saying "IDIOT!" when I hear "protect the data."
"Oh right, that's what we should have been doing for the last 10, 20, 30 years -- protect the data! I feel so stupid to have not done that! IDIOT!"
In other words, supposedly "protect the data" is the silver bullet we should have been pursuing for the last several decades, and only now we're realizing it! Cue Farley: "Idiot!"
Myth #2 -- The standard model of perimeter security protects the enterprise...
When we look over the large number of data breach events that our customers prevented using DLP and when we see publicly reported breach events that clearly would have been stopped by our software had we been there - it's hard not to conclude that a central failing of the standard model is that it does not protect the data itself. The standard model does a fine job of protecting the containers of confidential data (firewalls protect the LAN, endpoint protection protects the hosts, access control protects the apps and files) but the data itself is left completely unprotected by these countermeasures.
Please tell me what "the data itself" is. This might help focus the discussion. Thank you.
You're 100% right - I've never seen SNL or the great Chris Farley performing that skit. Even though I was in the USA, and SNL has been broadcast here in Australia on cable for a few years.
However, it still comes across with a real lack of irony and a very strong case FOR protecting containers.
For the record, I'm all for protecting containers, but only those containers that I've identified as holding or passing suitably classified data that we care about.
At the end of the day, if someone is willing to leak information, we have an HR problem, not a technical problem. I think too many security professionals forget that very simple weak link - humans.
Looking forward to more of your posts - always thought provoking.
thanks,
Andrew
I also firmly believe in what you said at the end, and I think too many people take stances/approaches that disregard this simple truth:
"...no one in history has ever found a way to defeat crime, espionage, or any of the true names for the so-called "information security" challenges we face."