"Protect the Data" Idiot!

The 28 September 2009 issue of InformationWeek cited a comment posted to one of their forums. I'd like to cite an excerpt from that comment.

[W]e tend to forget the data is the most critical asset. yet we spend inordinate time and resources trying to protect the infrastructure, the perimeter... the servers etc. I believe and [sic] information-centric security approach of protecting the data itself is the only logical approach to keep it secure at rest, in motion and in use. (emphasis added)

I hear this "protect the data" argument all the time. I think it is one of the most misinformed comments that one can make. I think of Chris Farley smacking his head saying "IDIOT!" when I hear "protect the data."

"Oh right, that's what we should have been doing for the last 10, 20, 30 years -- protect the data! I feel so stupid to have not done that! IDIOT!"

"Protect the data" represents a nearly fatal misunderstanding of information security. I'm tired of hearing it, so I'm going to dismantle the idea in this post.

Now that I've surely offended someone, here are my thoughts.

Someone show me "data." What is "data" anyway? Let's assume it takes electronic form, which is the focus of digital security measures. This is the first critical point:

Digital data does not exist independently of a container.

Think of the many containers which hold data. Imagine looking at a simple text file retrieved from a network share via NFS and viewed with a text editor.

  1. Data exists as an image rendered on a screen attached to the NFS client.

  2. Data exists as a temporary file on the hard drive of the NFS client, and as a file on the hard drive of the NFS server.

  3. Data exists in memory on the NFS client, and in memory on the NFS server.

  4. The NFS client and server are computers sitting in facilities.

  5. Network infrastructure carries data between the NFS client and server.

  6. Data exists as network traffic exchanged between the NFS client and server.

  7. If the user prints the file, it is now contained on paper (in addition to involving a printer with its own memory, hard drive, etc.)

  8. The electromagnetic spectrum is a container for data as it is transmitted by the screen, carried by network cables and/or wireless media, and so on.

That's eight unique categories of data containers. Some smart blog reader can probably contribute two others to round out the list at ten!

So where exactly do we "protect the data"? "In motion/transit, and at rest" are the popular answers. Good luck with that. Seriously. This leads to my second critical point:

If an authorized user can access data, so can an unauthorized user.

Think about it. Any possible countermeasure you can imagine can be defeated by a sufficiently motivated and resourced adversary. One example: the "solution" is to encrypt everything! The attack: wait until an authorized user views a sensitive document, and then screen-scrape every page using the malware installed last week.

If you doubt me, consider the "final solution" that defeats any security mechanism:

Become an authorized user, e.g., plant a mole/spy/agent. If you think you can limit what he or she can remove from a "secure" site, plant an agent with a photographic memory. This is an extreme example but the point is that there is no "IDIOT" solution out there.

I can make rational arguments for a variety of security approaches, from defending the network, to defending the platform, to defending the operating system, to defending the application, and so on. At the end of the day, don't think that wrapping a document in some kind of rights management system or crypto is where "security" should be heading. I don't disagree that adding another level of protection can be helpful, but it's not like intruders are going to react by saying "Shucks, we're beat! Time to find another job."

Intruders who encounter so-called "protect the data" approaches are going to break them like every other countermeasure deployed so far. It's just a question of how expensive it is for the intruder to do so. Attackers balance effort against "return" like any other rational actor, and they will likely find cheap ways to evade "protect the data" approaches.
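The cost argument above can be sketched as a toy model. This is only an illustration: the container names come from the list earlier in the post, while the cost figures, the `penalty` value, and the function names are invented assumptions, not measurements of any real attack.

```python
# Hypothetical attacker cost per data container. The numbers are invented
# purely for illustration; only their relative order matters.
BASELINE_COST = {
    "screen": 40,
    "client disk": 10,
    "server disk": 20,
    "memory": 50,
    "network traffic": 15,
    "paper printout": 60,
}

# Assumption: encryption at rest and in transit raises the cost of these
# containers only. It does nothing for data an authorized user is viewing.
ENCRYPTION_COVERS = {"client disk", "server disk", "network traffic"}


def costs_with_encryption(baseline, covered, penalty=1000):
    """Return a new cost table with `penalty` added to each covered container."""
    return {
        container: cost + (penalty if container in covered else 0)
        for container, cost in baseline.items()
    }


def cheapest_path(costs):
    """Return the (container, cost) pair a cost-minimizing attacker would pick."""
    return min(costs.items(), key=lambda item: item[1])


before = cheapest_path(BASELINE_COST)
after = cheapest_path(costs_with_encryption(BASELINE_COST, ENCRYPTION_COVERS))
print("without encryption:", before)
print("with encryption:   ", after)
```

In this toy model, encrypting the disks and the network traffic does not stop the intruder. It only moves the cheapest target to the screen, which is the screen-scraping attack described earlier.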

Only when relying on human agents is the cheapest way to steal data, or when it's cheaper to research and develop one's own data, will digital security be able to declare "victory." I don't see that happening soon; no one in history has ever found a way to defeat crime, espionage, or any of the true names for the so-called "information security" challenges we face.


jbmoore said…

All of these examples just highlight how difficult it is to "protect the data". The bad guys go for the weakest links, as you've stated above. They are now targeting government and business bank accounts because those deposits aren't as protected as retail bank accounts are, since banks aren't liable for the losses like they are for credit cards and retail accounts. If the banks' security is too tough, they target the customer systems or the transaction firm's systems. This is why "data loss prevention" is almost an oxymoron. You can do everything right, and still lose your data because a firm or person you entrusted it to didn't do their security right.

This interdependence due to our network infrastructure is all around us.
It's almost the same situation as the banking system. An individual bank may have been sound, but each bank is dependent upon its counterparties to pay back their loans. If a competitor fails like Lehman, the system starts failing when AIG can't make good on its counterparty insurance, so a cascade of failure ripples through the system. The same is true for data security. The data is only as secure as the weakest link in the chain, be that chain the company infrastructure, a company employee, or another business that data is entrusted with.

Four other containers:
9. Backup media sent to an offsite storage facility.
10. Transferring the data from the client to a free email account.
11. Transferring the data from the free email account to an unsecured home system.
12. Transferring the data to a corporate email, phone, or fax and sending that data to a vendor or business partner. If it goes by phone or fax, you aren't likely monitoring that traffic by law.
Ben said…
I'm rather disappointed by the short-sightedness represented here. Yes, you still need to secure the containers, but a data-centric approach has evolved for a couple of key reasons, not the least of which is that organizations barely know what data they have, let alone where it is, or how it traverses their networks. You cannot focus on securing containers until you answer these basic questions about what data is most important and how it flows.

Beyond this, I think you also largely ignore the move to cloud computing. Whether it's SaaS, PaaS, or IaaS, as a consumer of the service you're forced to look at data protection rather than container protection, because that's what you can control.

More importantly, I think you have turned the argument around. You're saying "focus on the container that secures the data" and the rest of us are saying "focus on the data that needs to be secured so that you can then make the best decision possible for how to secure it." My phrasing sucks, but they're essentially inverse statements; start with the data, work to the container vs start with the container, work to the data.

Ben, I think you're missing my main point, maybe because I didn't say it outright: There is no such thing as "data." We can only think in terms of containers because without them, there is no "data." "Start with the data" means nothing, because data doesn't exist independently of a container. Cloud computing doesn't change anything. You're just defining another container. What are you supposed to control? It's whatever container holds the data.
James said…
I agree with Richard on this. After thinking about it for a bit, it seems to me that data is an abstract concept that can take on many forms. You cannot interact with data unless you interact with its container; you cannot touch or hold data unless you interact with its container. Data can only be in two states, in transit or at rest, both of which must take place within a container. Focusing on protecting data is a little like treating the symptoms but not fixing the underlying issue that is causing the problem in the first place. For example, I have a hole in the bottom of my glass of water, but I’m focusing on the fact that I am losing water and trying to maintain an even level, instead of plugging the hole in the glass.
Khürt said…
I think what Ben is saying is that the approach taken to security so far has been to protect the infrastructure as though that was the valuable bit. If you take the approach that my "magic soda formula" is valuable then the infrastructure will be built to protect that. The approach I see today is more network crap on top of more/better access and authorization crap. Stop calling it information security if all the focus is on the infrastructure.

Seat belts, air bags, anti-lock brakes, traction control, crumple zones, guard rails, and the road surface are all designed to minimize risks to the valuable contents of the cars. None of it is designed to protect itself.

I think this is another false argument. Please explain to me how deploying a firewall means "protecting the infrastructure itself." Someone deploys a firewall to protect the firewall?
Jim Manico said…
Traditional container/infrastructure based infosec has failed miserably as the attackers continue to target applications, the weakest link.

Regardless, what is the container of data? Well, it largely includes custom applications. These non-commoditized dynamic layers are a world apart from network security. We need to start working with developers to encourage them to build defensive controls within the app itself. And that crowd, the developer crowd, is traditionally not a part of the infosec department's realm of responsibility. And this needs to change if we defenders want to actually win.

So, I tend to disagree with the author's article since it implies that netsec and WAF like thinking will win the game.
Jim, I am not implying that "netsec and WAF like thinking will win the game." I love hearing from the appsec people on this subject. Please tell me what you are going to do differently to "protect the data" that is not being done already? Whatever it is, either myself or another reader will suggest multiple ways to defeat it. At its best, appsec makes it more difficult for intruders to accomplish their goal, but a sufficiently motivated and resourced intruder will still win.
Jim Manico said…
There is no such thing as security - only reducing risk to an acceptable level. All I'm saying is that fancy network gear, infrastructure and "container" protection is useless in the face of XSS, SQLi, CSRF, and other OWASP Top Ten risks. You may just need to code your custom applications differently. And sadly, you are 100% correct. The attacker, especially a well-resourced and motivated attacker, will always win in today's world.

But if your web application coders are not doing input validation, output encoding, query parameterization, HTTPS, good authentication and solid activity-based access control, CSRF tokens, security-centric logging within the app, etc., you might be making data theft and manipulation a bit too easy for them. And even if your container is fully patched, you're screwed.
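A minimal sketch of two of the controls listed above, query parameterization and output encoding, using only Python's standard library (`sqlite3` and `html`). The `users` table and the helper names `make_demo_db`, `find_user`, and `render_greeting` are hypothetical and exist only for this example. It shows how these controls raise the attacker's cost for one class of attack; it does not claim they stop a determined intruder.

```python
import html
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user(conn, name):
    """Query parameterization: `name` is bound as a value, never spliced into the SQL text."""
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


def render_greeting(name):
    """Output encoding: escape untrusted text before placing it in HTML."""
    return "<p>Hello, " + html.escape(name) + "</p>"


conn = make_demo_db()
print(find_user(conn, "alice"))            # a normal lookup
print(find_user(conn, "' OR '1'='1"))      # injection string is treated as a literal name
print(render_greeting("<script>alert(1)</script>"))
```

With the bound parameter, the classic `' OR '1'='1` string matches no rows because it is compared as a name rather than parsed as SQL, and the escaped greeting renders the script tag as inert text.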
Josh Brower said…

"All I'm saying is that fancy network gear, infrastructure and "container" protection is useless in the face of XSS, SQLi, CSRF, and other OWASP Top Ten risks..."

I don't think you are understanding "container" the way Richard is using it.
"Container" protection for the OWASP Top Ten would be the very things you are listing (input validation, HTTPS, etc.).

Hi again Jim,

I wrote in my 2003 blog post The Dynamic Duo Discuss Digital Risk that

Security is the process of maintaining an acceptable level of perceived risk.

I repeated it in my Tao book published in 2004. So, we can agree on that as well!
Jim Manico said…

The word "perceived" throws me off a little, but I think I'll agree to.. agree with you! :)

Compelling blog post, it definitely got my attention.

Looking forward to your next book.

- Jim
CG said…
I think there are a lot of valid points being made here, but I think the ideas are being framed wrong.

The end game in most scenarios is data. "They" want your data. So I think the statement that "protecting data is what matters" is actually accurate. However, most people that repeat that phrase don't actually understand why or look at the problem too narrowly.

The problem is you believe it's a one-or-the-other approach, when it's actually a tree or hierarchy. Protecting your data is the overarching goal, and the subcomponents of achieving that goal include:

Infrastructure Security
Application Security
Data Encryption
Transport Encryption
Access Controls
etc etc etc

Each one of these contributes to protecting your data. One is not better than the other. The NSM-centric approach is not better than the data-centric approach. It's been proved time and time again: even if you excel in one area of security, it's always the weakest link that gets ya. The best security is a comprehensive approach. However, each organization should consider their unique threats and focus on security controls that reduce their risk the most. It's not a one-size-fits-all thing.
kevin_rowney@symantec.com said…
@CG Amen! This debate about "Information Centricity" vs. NSM has been repeatedly set up as a straw man. No one argues that leaving the perimeter unguarded is a good idea, and too much of the negativity around Information Centricity uses that straw man thinking.

For _far_ too long, much of the discussion around InfoSec risk management has been dominated by the NSM crowd. Information Centricity is no cure-all, but if traditional container protections really worked, why are breach rates high and rising?

@Richard Bejtlich
I welcome this debate. Glad you started it. Are you aware of the numerous anecdotes where information centric approaches have made the crucial difference? You can't possibly be claiming that NSM alone will do the job, just like we are not claiming that ICRM alone will do it.

Having said that, there is *ample* evidence from hundreds of enterprises where ICRM (Information Centric Risk Management) is a necessary component of the defense.

The key claim of the ICRM crowd (me included): "Effective infosec risk management is not possible if you don't know where the concentrations of your most sensitive data reside or flow."

The landscape of possible exposure risks is so large, prioritization is mandatory. If you are prioritizing with ignorance of location and flow of sensitive data, how can you possibly be doing a good job of defense?

Most NSM practitioners have a very remote understanding of these crucial facts of the enterprises they attempt to defend. Closing that ignorance gap is a more reasonable interpretation of what is meant by "Protect the Data".

Kevin Rowney
Founder, Vontu (now part of Symantec)
Wait a second, since when did I say in these posts that NSM is the answer?
kevin_rowney@symantec.com said…
@taosecurity RT "when did I say in these posts that NSM is the answer?"

Your post attacks the ICRM point-of-view and explicitly promotes classic countermeasures like defending the network, the platform, the o.s., and the apps.

The same thing I said about NSM practitioners (they have a limited understanding of where the most crucial data is and where it flows) is equally applicable to practitioners of these other countermeasures.

So ok we agree: you are not saying NSM is "the answer". Where we differ: your dismissal of the ICRM point-of-view.

Security practitioners need to get a grip on this fact: many enterprises now use information-centric techniques as a primary means of prioritizing the management of risk.

Dismissing "Protect the Data" as mere ignorant sloganeering isn't doing justice to facts on the ground. Many teams with good network/o.s./app security protections are missing cases of breach on a regular basis.

Cases of breach that would've been nailed by Information Centric approaches.

Kevin Rowney
Jim Manico said…
Andrew van der Stock from the glorious world of OWASP and AppSec posted a reply to this blog post titled "Protect the Data, Idiot! Redux." It's worth a read.

I have plenty more to say about this, but let's address the "Idiot!" part of this post. I have a feeling people like Andrew van der Stock haven't seen "The Chris Farley Show" skits on SNL in the early 1990s. For example:

Chris Farley interviewing Paul McCartney:

Chris Farley: Um, hi. Welcome to The Chris Farley Show. I'm.. Chris Farley.. and, my guest tonight is.. one of the.. greatest musicians.. uh, rock musicians. I guess, songwriter, ever. [ Smacks himself ] GOD! That sounds stupid! God, I'm an idiot!

So I wrote:

I think of Chris Farley smacking his head saying "IDIOT!" when I hear "protect the data."

"Oh right, that's what we should have been doing for the last 10, 20, 30 years -- protect the data! I feel so stupid to have not done that! IDIOT!"

In other words, supposedly "protect the data" is the silver bullet we should have been pursuing for the last several decades, and only now we're realizing it! Cue Farley: "Idiot!"
Kevin, you said in your post Six Myths of Information Security (cont'd):

Myth #2 -- The standard model of perimeter security protects the enterprise...

When we look over the large number of data breach events that our customers prevented using DLP and when we see publicly reported breach events that clearly would have been stopped by our software had we been there - it's hard not to conclude that a central failing of the standard model is that it does not protect the data itself. The standard model does a fine job of protecting the containers of confidential data (firewalls protect the LAN, endpoint protection protects the hosts, access control protects the apps and files) but the data itself is left completely unprotected by these countermeasures.

Please tell me what "the data itself" is. This might help focus the discussion. Thank you.
John said…
One more container, human memory. Read about Soviet/Russian defector Mitrokhin and our own Jonathan Pollard. They mostly took what they knew, talked about and saw that day and wrote it down when they went home that evening. Richard touches on that when he talks about unauthorized/authorized users, but if we're making lists of data containers, we need to include that.

You're 100% right - I've never seen SNL or the great Chris Farley performing that skit. Even though I was in the USA, and SNL has been broadcast here in Australia on cable for a few years.

However, it still comes across with a real lack of irony and a very strong case FOR protecting containers.

For the record, I'm all for protecting containers, but only those containers that I've identified as holding or passing suitably classified data that we care about.

At the end of the day, if someone is willing to leak information, we have an HR problem, not a technical problem. I think too many security professionals forget that very simple weak link - humans.

Looking forward to more of your posts - always thought provoking.

Unknown said…
I'm a little late to the party, but as usual I agree with your practicality, Richard. It's not so much that protecting data is worthless, it just can't be something we put all our eggs into or naively think we can solve that problem absolutely.

I also firmly believe in what you said at the end, and I think too many people take stances/approaches that disregard this simple truth:

"...no one in history has ever found a way to defeat crime, espionage, or any of the true names for the so-called "information security" challenges we face."
