Security in the Real World

I received the following from a student in one of my classes. He is asking for help dealing with security issues. He is trying to perform what he calls an "IDS/IPS policy review," which is a tuning exercise. I will offer some comments inline and some final thoughts at the end.

If you recall, I was in one of your NSO classes last year. At the end of the day the only place I am able to use everything I learned is at home.

This is an example of a security person knowing what should be done but unable to execute in the real world. This is a lesson for all prevention fanboys. You know the type -- they think 100% prevention is possible. In the real world, "business realities" usually interfere.

As you are aware, as with other corporate environments, our company goes by Gartner ratings on products and ends up buying a technology where you don't get any kind of data, just an alert name. So that is a pain in itself.

Here we see a security analyst who has been exposed to my Network Security Monitoring ideas (alert, full content, session, and statistical data), but is now stuck in an alert-centric world. I have turned down jobs that asked me to leave my NSM sources behind and supervise alert ticketing systems. No thanks!

There is this issue that I have been running into, and I thought maybe you can help me, if you are free. I work for a pretty large organization with 42,000 users in 15 states... In a dynamic, ever-changing environment, what is the best route for a policy review?

We have two different IDS technologies across the company. The IDS/IPS policy review is really about turning off the signatures we don't need to know about and cutting down on alerts, so that alert monitoring becomes easier. Since we use ArcSight for correlation, it's easier to look for our interesting traffic.

Wait a minute -- I thought ArcSight and other SIM/SEM/SIEM/voodoo was supposed to solve this problem? Why disable anything if ArcSight is supposed to deal with it?

Right now we are dealing with such a high volume of alerts that there is no way we are going to be able to catch anything. In other, smaller environments, I have been able to easily determine what servers we have and haven't, and turn on only the signatures that are needed. For example, if we are not running FrontPage, we can turn off all FrontPage alerts. In our environment it is difficult to determine that, and oftentimes we have no idea what changes have taken place.

This is an example of a separation between the security team and the infrastructure team. This poor analyst is trying to defend something his group doesn't understand.

Therefore, our policy review needs to cover all the missing pieces as well. By that I mean we have to take into consideration the lack of cooperation across the board from other teams when we disable or enable alerts.

Here we see the effects of lack of cooperation between security and infrastructure groups. When I show people how I build my own systems to collect NSM data, some say "Why bother? Just check NetFlow from your routers, or the firewall logs, or the router logs, etc..." In many places the infrastructure team won't let the security team configure or access that data.

(1) Going by the firewall rules - Feedback from my team is that we won't know about firewall rule changes, if any change were to occur, hence we can't do it. Another objection is that they trust the IDS technology too much: "You are recommending turning off a high-level alert. There is a reason the vendor rates it high."

It seems the infrastructure team trusts the security vendor more than their own security group.

(2) Going by applications that are running in the environment - This has proven to be even more difficult, since there is no record of what has been installed and what has not.

Again, you can't monitor or even secure what you don't understand.

(3) Third approach, for the external IPS - Choose the signatures that haven't triggered in the past three months, review them, and put them in block mode. If something were to be opened by the firewall team, they will identify it as being blocked by something; it will be brought to our attention, and we can unblock it after a discussion.

This is an interesting idea. Run the IDS in monitoring mode. Anything that doesn't trigger gets set to block mode when the IDS becomes an IPS! If anyone complains, then react.

None of this has been approved thus far. As you know, I used to work for [a mutual friend], and we had small customers like small banks and so forth. With them, the policy review was much easier.

Smaller sites are less complex, and therefore more understandable.

Let's pick one example. I recommended at one point that we disable all MS SQL signatures on our external IDSs, since the firewall blocks that traffic anyway, so that we don't have to see thousands and thousands of scans against port 1434 all the time. Even that didn't get approved. The feedback was that the IDS blocking the alerts can take some load off the firewall.

So, this is a complicated topic. I appreciate my former student permitting me to post this anonymously. Here's what I recommend.

  1. Perform your own asset inventory. You first have to figure out what you're defending. There are different ways to do this. You could analyze a week's worth of session data to see what's active. You could look for the results of intruder scans or other recon to find live hosts and services. You could conduct your own scan. You could run something like PADS for a week. In the end, create a database or spreadsheet showing all your assets, their services, applications, and OS if possible.

  2. Determine asset security policy and baselines. What should the assets you've inventoried be doing? Are they servers only accepting requests from clients? Do the assets act as clients and servers? Only clients? What is the appropriate usage for these systems based on policy?

  3. Create policy-based detection mechanisms. Once you know what you're protecting and how they behave, devise ways to detect deviations from these norms. Maybe the best ways involve some of your existing IDS/IPS mechanisms -- maybe not.

  4. Tune stock IDS/IPS alerts. I described how I tune Snort for Sys Admin magazine. I often deploy a nearly full rule set for a short time (one or two days) and then pare down the obviously unhelpful alerts. Different strategies apply. You can completely disable alerts. You can threshold alerts to reduce their frequency. You can (with systems like Sguil) let alerts fire but send them only to the database. I use all three methods.

  5. Exercise your options. Once you have a system you think is appropriate, get a third party to test your setup. Let them deploy a dummy target and see if you detect them abusing it. Try client-side and server-side exploitation scenarios. Did you prevent, detect, and/or respond to the attacks? Do you have the data you need to make decisions? Tweak your approach and consider augmenting the data you collect with third party tools if necessary.
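For step 4, Snort's threshold.conf supports both of the first two tuning options directly. The gen_id/sig_id values below are placeholders, not recommendations:

```
# Disable an alert entirely:
suppress gen_id 1, sig_id 2924

# Or let it fire at most once per source IP per minute:
threshold gen_id 1, sig_id 1851, type limit, track by_src, count 1, seconds 60
```

The third option, logging to the database without presenting the alert to the console, is handled on the Sguil side rather than in Snort.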
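To make step 1 concrete, here is a minimal sketch of passive inventory-building from exported session data. The CSV column names and the internal 10.0.0.0/8 range are assumptions; adjust them to whatever your session source (SANCP, Argus, NetFlow export, etc.) actually emits.

```python
# Hypothetical sketch: derive a server/service inventory from a week of
# session (flow) records exported as CSV. Column names and the internal
# 10.0.0.0/8 assumption are placeholders -- adapt them to your own data.
import csv
import ipaddress
from collections import defaultdict

INTERNAL = ipaddress.ip_network("10.0.0.0/8")   # assumed internal range

def build_inventory(flow_csv_path):
    """Map each internal responder IP to the set of ports it answered on."""
    inventory = defaultdict(set)
    with open(flow_csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # Treat the responder side of each session as a candidate service.
            if ipaddress.ip_address(row["dst_ip"]) in INTERNAL:
                inventory[row["dst_ip"]].add(int(row["dst_port"]))
    return inventory
```

Dumping the result into a spreadsheet, one row per IP/port pair, gives the asset database a starting point you can then annotate with application and OS details.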
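Step 3 can start as simply as comparing observed sessions against the per-asset baseline from step 2. A toy sketch, with an invented baseline format:

```python
# Hypothetical sketch: flag deviations from a per-asset service baseline.
# The baseline dict and flow-tuple format are illustrative assumptions,
# not a real policy representation.

def find_deviations(baseline, observed_flows):
    """Return flows whose destination service is not in the baseline.

    baseline:       {server_ip: {allowed_port, ...}}
    observed_flows: iterable of (src_ip, dst_ip, dst_port) tuples
    """
    deviations = []
    for src, dst, port in observed_flows:
        # Unknown hosts have an empty allowed set, so all their traffic flags.
        if port not in baseline.get(dst, set()):
            deviations.append((src, dst, port))
    return deviations
```

Any flow that falls outside the baseline is worth an analyst's attention even if no signature fired on it.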

I hope this is helpful. Do you have any suggestions?


godfadda said…
Our purchase of ArcSight was more for log aggregation than purely for correlation. I was more than happy using Sguil for launching investigations against our Snort alerts. Rather than deal with two monitoring consoles, I created a simple Perl script to launch off of events/alerts. It uses ssh/scp to talk to a sensor, grab the full content data, and launch Wireshark.
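A minimal sketch of that idea (rendered here in Python rather than the original Perl; the host name, paths, and capture layout are all assumptions) might look like:

```python
# Hypothetical sketch: given an event's sensor and capture location, pull
# the matching full-content pcap off the sensor with scp and open it in
# Wireshark. Host names and paths are placeholders, not real values.
import subprocess

def fetch_and_view(sensor_host, remote_pcap, local_copy="/tmp/alert.pcap",
                   run=True):
    """Copy a capture file from a sensor, then launch Wireshark on it."""
    scp_cmd = ["scp", f"{sensor_host}:{remote_pcap}", local_copy]
    view_cmd = ["wireshark", local_copy]
    if run:
        subprocess.run(scp_cmd, check=True)   # pull the pcap locally
        subprocess.Popen(view_cmd)            # hand it to the analyst
    return scp_cmd, view_cmd
```

A real version would derive `remote_pcap` from the event's sensor name and timestamp instead of taking it as an argument.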

Point being, if you're stuck using a different "SIM/SEM/Log Aggregator", extend it and employ NSM concepts.
Anonymous said…
So what exactly can "NSM" provide over a good SEIM tool? SEIM/SEM/SIM are tools like squil or any other analytical tool and are as good as the data available to the overall system and the users wielding the tool. Products like Intellitactics or ArcSight have correlation engines + external analysis tools and can easily be integrated with Squil (if necessary) - actually in most cases you can reduce the need for deep packet analysis or at least you can focus the efforts of that analysis. Session information is a great piece of data - that is also available in a SEIM/SIM (as long as the event source has the information). I'm not sure I understand the hesitation about using these tools - it seems to be a purely commercial versus open source debate.

If your using your SEIM for log aggregation only - you bought the wrong product. You bought a John Deere Tractor to mow a potted plant. Tools are Tools - use your head and they'll serve you well. If you don't understand the value they bring - ask someone who does.

First, if you're calling it "squil" instead of Sguil I doubt you have any clue what Sguil is.

Second, Sguil isn't something you "integrate" with a product like Intellitactics or ArcSight.

Third, it's not about commercial vs open source at all.

Fourth, if you don't see the need for "deep packet analysis" that indicates to me you aren't responsible for this sort of work and don't have to solve the sorts of problems I describe.

Fifth, I don't need to ask someone about SIM/SEM/SIEM/whatever value. I actually paid my own fee to attend the SANS Log Management Summit to try to fill any gaps in my understanding of these products. I came away knowing there is already a backlash against this product type, and people are wondering why they spent high six-figure amounts while their security woes have not diminished.
godfadda said…
The reason I wrote the script was to add "deep packet analysis" to any event within the SIM not just IDS alerts.

What NSM gave us before we even heard of SEIMS was a methodology for solving the issues I was facing when working with IDS alerts. I simply carried that way of doing things into the SEIM world, adding "full content" for the icing on the cake.

And we aren't just using the SEIM for log aggregation; we just hadn't been able to correlate different log types as easily before the SEIM.

If someone is using Sguil and wants to correlate logs as well, I would recommend using the free version of Splunk. Splunk accepts permalinks into searches (passing an IP pair with a time range, launched off a Sguil event) for display in its own interface, as well as SOAP requests for integration into the Sguil interface. There is definitely value in correlating host, firewall, router, VPN, and other logs to assist in investigations.
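The permalink idea can be sketched as below. The base URL and query-parameter names are assumptions about the Splunk search interface, so check your version's actual permalink format:

```python
# Hypothetical sketch: build a Splunk search permalink from a Sguil
# event's IP pair and time window. The URL path and parameter names
# ("q", "earliest", "latest") are assumptions -- verify them against
# your Splunk deployment.
from urllib.parse import urlencode

def splunk_permalink(base_url, src_ip, dst_ip, earliest, latest):
    """Return a URL that drops the analyst into a pre-filled Splunk search."""
    query = f"search {src_ip} {dst_ip}"
    params = urlencode({"q": query, "earliest": earliest, "latest": latest})
    return f"{base_url}/en-US/app/search/search?{params}"
```

Wired into a Sguil event menu, a link like this lets an analyst pivot from an alert straight into the correlated log view.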

I think you realize my last comment wasn't directed to you, but I wanted to make sure.

I like what you've done. It's a great example of using ArcSight as a place to begin an investigation and then following through with other sorts of analysis.

I plan to write a blog post later about what SIM/SEM/SIEM *is* good for.
Anonymous said…
Thanks for the input on this matter. I really like the passive monitoring idea for asset inventory, for two reasons. One, it won't cause any damage to any of the devices due to its passive nature, unlike some active scanners run in aggressive mode; and two, it is easier to get approved by management (which has been most of my headache).

As you mentioned in (4), tuning was also part of the policy review exercise. This is the only part that was readily approved, and it is currently being done.

As for ArcSight, we are feeding in several logs for correlation and have created rules looking for any interesting traffic. The standard rules that ship out of the box are so generic that turning them on has caused more alerts for us to deal with. Hence, interesting traffic is defined solely based on what we know, and there is a lot we don't know, and it keeps changing.

I had to turn to the policy review option in order to make real-time monitoring effective alongside the ArcSight rule notifications, since creating rules for every scenario isn't possible without knowledge of the assets. Currently, real-time monitoring is absolutely pointless with this volume of alerts.

Oh well, I am going to see if I can get PADS installation approved first and start developing from that.

Thanks so much for your help.

Anonymous said…
I recently implemented one of the largest SIM/SIEM deployments in the world, collecting more than 100,000 events per second from thousands of devices. The knowledge required to successfully implement a SIM/SIEM is founded on good network security monitoring practices. That means: (1) gathering all the data I can get my hands on (IDS, firewall, proxy, router/switch, and system logs, Argus/NetFlow data, etc.); (2) understanding everything I am collecting; (3) knowing the resources to go to when I have questions; (4) having an accurate inventory with device risk levels; and (5) knowing your environment.
Knowing your environment is extremely important; however, in an organization of 180,000+ users you can't know it all, so knowing who to go to becomes a necessity. Given all of that, your entire SOC staff cannot have that level of knowledge. At this point, it becomes a staff that reacts to alerts generated by a tool. Hopefully you have some excellent engineers creating the alerts who know most of the items listed above. When we created alerts, we documented exactly what each alert means and what you should do when you see it, incorporating all of our knowledge into the alert notes. This allows you to pass along some of the knowledge of the so-called experts to the operators. Even so, the operators still have to have the skills to understand the detailed data, such as reading packets, firewall logs, proxy logs, system logs, etc. In the end, most alerts require further investigation, and for that investigation to be accurate, the operators need the skills of the so-called experts to some extent, but not necessarily the same knowledge. In the end, a SIM is another weapon in your arsenal. It's a shame that many orgs just implement them and expect them to do all the work… but then again, I know of many organizations that do the same with IDS/IPS systems. The tool doesn't replace your knowledge and skill…
Anonymous said…

Did you use a commercial product or homegrown? If commercial, who did you go with?
Venkata Achanta said…
I totally agree with Bob; tools like SIM/SIEM are useless unless you build processes around them to make intelligent use of the information provided.

It's not a failure of the technology itself but a lack of processes and procedures around the tool set.

It's sad to see people blaming technology for their own lack of understanding.
