Thursday, May 07, 2009

Logs from the Cloud

I received an email with the following notice today:

Amazon CloudFront Adds Access Logging Capability:

AWS today released access logs for Amazon CloudFront. Access logs are activity records that show you details about every request delivered through Amazon CloudFront. They contain a comprehensive set of information about requests for your content, including the object requested, the date and time of the request, the edge location serving the request, the client IP address, the referrer and the user agent. It’s easy to get started using access logs: you just specify the name of the Amazon S3 bucket you want to use to store the logs when you configure your Amazon CloudFront distribution. There are no fees for using the access logs, beyond normal Amazon S3 charges to write, store and retrieve the logs.



The Amazon Elastic MapReduce team has also built a sample application, CloudFront LogAnalyzer, that will analyze your Amazon CloudFront access logs. This tool lets you use the power of Amazon Elastic MapReduce to quickly turn Access Logs into the answers to the most commonly asked questions about your business. Additionally, several partners have also built solutions that help you analyze these access logs; you can find more information about these in the AWS Solutions Catalog.


Looking at the Developer Guide entry for Access Logs, we see the following sorts of data will be recorded:


The log files use the W3C extended log file format
(for more information, go to http://www.w3.org/TR/WD-logfile.html).

The files contain information for each record in the following order:

Date of the request (in UTC)
Time (when the server finished processing the request; in UTC)
Edge location that served the request
(a variable-length string with a minimum of 3 characters)
Bytes served
Client IP address (no hostname lookups occur)
HTTP access method
DNS name (either the CloudFront distribution name or your CNAME,
whichever the end user specified in the request)
URI stem (e.g., /images/daily-ad.jpg)
HTTP status code (e.g., 200)
Referrer
User agent

An entry might look like this:

#Version: 1.0
#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status
cs(Referer) cs(User-Agent)

02/01/2009 01:13:11 FRA2 182 10.10.10.10 GET d2819bc28.cloudfront.net /view/my/file.html 200
www.displaymyfiles.com Mozilla/4.0%20(compatible;%20MSIE%205.0b1;%20Mac_PowerPC)

02/01/2009 01:13:12 LAX1 2390282 12.12.12.12 GET www.singalong.com /soundtrack/happy.mp3 304
www.unknownsingers.com Mozilla/4.0%20(compatible;%20MSIE%207.0;%20Windows%20NT%205.1)

I think this is a good start, but I'll leave it to Cloudsecurity.org for expert commentary!


Richard Bejtlich is teaching new classes in Las Vegas in 2009. Regular Las Vegas registration ends 1 July.

2 comments:

Craig Balding said...

Hi Richard

Thanks for the mention.

Honestly, I think you have the meat of it covered :-).

The key point for peeps to bear in mind is that this is just for Amazons Content Distribution Network service (aka CloudFront) - not AWS API usage. So this is just like Akamai giving you logs for content you've pushed to them for edge access. I guess its worth noting that the existing Amazon S3 storage service already has optional logging, disabled by default).

CloudFront logging improves on the previous situation where only usage statistics were available, hence this feature mostly helps Amazon clients that rebill CDN usage to their own customers.

From a security perspective I'm struggling to see much benefit (the corner case I can think of is if sensitive data was accidently leaked by a customer to CloudFront and they needed to know which IP's downloaded it).

If someone can see more security benefits, do share.

Cheers

Craig

andy said...

I always enjoy learning how other people employ Amazon S3 and CloudFront. I am wondering if you can check out my very own tool CloudBerry Explorer that helps to manage S3 on Windows . It is a freeware. We have recently added basic support for CloudFront log to make it easier to turn on logging and analyze it.