Thursday, August 03, 2006

Forensically Sound Evidence

Mike Murr pointed me to his blog post Forensically Sound Duplicate. He suggests replacing this definition of a forensically sound duplicate with the one that follows.

"A 'forensically-sound' duplicate of a drive is, first and foremost, one created by a method which does not, in any way, alter any data on the drive being duplicated. Second, a forensically-sound duplicate must contain a copy of every bit, byte and sector of the source drive, including unallocated 'empty' space and slack space, precisely as such data appears on the source drive relative to the other data on the drive. Finally, a forensically-sound duplicate will not contain any data (except known filler characters) other than which was copied from the source drive."

This is Mike's replacement:

"A forensically sound duplicate is a complete and accurate representation of the source evidence. A forensically sound duplicate is obtained in a manner that may inherently (due to the acquisition tools, techniques, and process) alter the source evidence, but does not explicitly alter the source evidence. If data not directly contained in the source evidence is included in the duplicate, then the introduced data must be distinguishable from the representation of the source evidence. The use of the term complete refers to the components of the source evidence that are both relevant, and reasonably believed to be relevant."

I agree with the statement "A forensically sound duplicate is a complete and accurate representation of the source evidence." That is broad and still accurate enough to refer to hard drives, memory, or network traffic. I'm not comfortable with the alteration portion of the suggested definition.

24 comments:

Spence said...

Network forensic imaging and analysis applications (read EnCase Enterprise) modify the drive in order to acquire it. So, while information was introduced to the drive, thereby modifying it, that information is known and would not materially alter the evidence contained on the drive.

Obviously policies could be introduced into an environment that would allow for the EEE server to be resident on the machine prior to the forensic process, however, there are times when a surgical strike needs to be done on a non-compliant PC.

Would you say that network acquisitions are not an acceptable method of performing a forensic process if said network acquisitions modify the data on the drive through logon, logs, or introduction of software to initiate imaging, if those are known quantities?

Richard Bejtlich said...

I see no problem with network acquisitions. Maybe Mike should say "A forensically sound duplicate is obtained in a manner that does not materially alter the source evidence." ?

Michael Murr said...

It's funny that you mention "does not materially alter the source evidence", that was a word-for-word definition that came up during email conversations ;)

I was shying away from "materially alter" since loading up an imaging tool into RAM could overwrite remnants of a deleted process in memory, which could have evidentiary value. Also, logging in to gather disk images can materially alter the source disk (e.g. overwrite unallocated space which could also have evidentiary value). I don't know how often this type of scenario would happen, but it's there. That's why the "A forensically sound duplicate is obtained in a manner that may inherently (due to the acquisition tools, techniques, and process) alter the source evidence" part of the definition got introduced. I skipped "materially alter" and expanded it into data alteration inherent to the acquisition, and data alteration that is explicit to the tool.

To me the wording "data alteration that is explicit to the tool", is a little awkward. I'm still looking for a better way of stating this without making it into another paragraph in and of itself :)

Richard Bejtlich said...

Hi Mike,

I see what you mean now. How about

"A forensically sound duplicate is obtained in a manner that does not materially alter the source evidence, except to the minimum extent necessary to obtain the evidence." ?

Keydet89 said...

Richard,

I can see the original point you made, and I like the changes you proposed. I don't want to kill a good idea by adding too many words, but at some point, should there be mention of something along the lines of obtaining the evidence via an applicable/justified/documented means?

What I'm trying to get at is the need to have documentation with regards to not only the means used to obtain the evidence, but also the justification for using that means.

Richard Bejtlich said...

Harlan -- good idea. Any suggested wording?

Keydet89 said...

Let's see...building on your mods...

"A forensically sound duplicate is obtained in a manner that does not materially alter the source evidence, except to the minimum extent necessary to obtain the evidence. The manner used to obtain the evidence must be documented, and should be justified to the extent applicable."

Note, given that I am also a veteran, I am not simply peppering the verbiage with "must" and "should"...I've carefully considered this and I believe that both terms have their rightful place.

Michael Murr said...

The part about "does not materially alter the source evidence, except to the minimum extent necessary to obtain the evidence" can be a little hard to define. What if dd makes a bigger footprint (when running in memory) than dcfldd (I don't know if this is true, just making up an example). So, images obtained from a live system using dd are not forensically sound since dcfldd alters the source evidence less? Craig and I went back and forth about this a lot, and it led to the realization of a couple of things:

1) The altering inherent in acquisition is different than explicit steps taken to alter the evidence. This is a similar vein to the Observer effect/Heisenberg Uncertainty Principle/etc.

2) There comes a point where the circumstances also come into play. Not having a "forensically sound duplicate" because your imaging program takes up 3 bytes more than the next imaging program (exaggerated for emphasis) is perhaps a little unreal. Forensic examiners/analysts/scientists/etc. aren't perfect, we work with what we have.

2.1) To combine and extend the previous two thoughts, realize that the acquisition (including what evidence was duplicated and how it was duplicated) feeds the analysis component. There are two categories of analysis: analysis that uses deductive logic and analysis that uses inductive logic (there is analysis that uses both, but the "combined" analysis can be broken down into its deductive and inductive components). With deductive reasoning, the conclusions are contained (sometimes implicitly) in the premises. With inductive reasoning the conclusions go beyond what is present (even implicitly) in the premises. You can only really make deductive conclusions based on what evidence was obtained. Inductive conclusions are easier to attack because they are based on evidence you don't have (perhaps couldn't gather due to the acquisition process).

From this point of view, "altering" the source evidence (whether it be inherent or explicit) doesn't really invalidate the deductive conclusions (because they are based on what you were able to acquire). Altering of the source evidence could lead to questions about the reliability of the conclusions (both deductive and inductive) because another examiner can't recreate the image (which equates to reducing the stability of the premises, a.k.a. the duplicate being analyzed). By focusing the definition on the acquisition only, we're assuming that the examiner/analyst/scientist/etc. performing the analysis understands the difference between what can be deduced and what can be inferred inductively. So, even if your imaging program uses "3 bytes more than my imaging program", you may not be able to deduce as much, but the deductions you make are still valid. What you can infer by inductive reasoning, however, may be less. (Probably not with just 3 bytes difference, but you get the idea.)

3) The focus of a forensically sound duplicate (at least in my original blog entry) is on the acquisition. Acquiring a "forensically sound duplicate" of some source evidence (or component of source evidence) is normally part of a larger process. I agree that there must be documentation as to how the evidence was duplicated, but I don't think it belongs in the definition of a "forensically sound duplicate". Instead it lies in the larger process (a correct, reliable, and repeatable examination process which has acquisition as a component). I like Harlan's wording, I think he did a good job of placing the "must" and "should" :)

Keydet89 said...

What if dd makes a bigger footprint...

Mike, I think that this is a good argument for tool testing and verification...something we're supposed to do anyway. Under the circumstances, and given the wording I added last night, if the means is documented and justified, then it should suffice.

For example, take your difference between dd and dcfldd. Any differences between the tools should be thoroughly documented. Given a live acquisition of a Windows XP system, you're going to have entries for each created in the Prefetch directory, as well as beneath the UserAssist key for the account that was active during the acquisition. These changes can/should/must be documented, and the responder must know the differences when deciding which tool to use.

Given 2.1, when testifying in court, shouldn't the analyst only give the facts?

WRT 3), I agree. The documentation for the process doesn't need to lie with the definition...however, it is necessary.

Harlan

Tim said...

I had a couple of thoughts on this thread; one caveat first, though, I don't have much experience in forensics or incident response, so this is pretty much an outsider's PoV.

As far as what constitutes the minimum alterations necessary to obtain a forensically sound duplicate of evidence, I don't think it's going to be very useful to try and come up with categorical rules such as "'foo' must be used instead of 'bar' because 'blah'". In some cases there may be tools which should absolutely never be used, but in most cases it's going to depend on the circumstances. There was a good example earlier of taking an image from a live system, where any tool already in place is likely to be preferable to one that's not, simply because then you don't have to install a new program. As I understand it, that's why Harlan is focusing on justification and documentation over defining exactly what is an acceptable amount of change; you want to give analysts the flexibility to use their judgement to make decisions based on the circumstances, but you also want to be able to verify that the resulting evidence is trustworthy. Not only does that sort of documentation give you a paper trail, but it also influences the analyst's decisions - if you know you will have to justify your reasoning later, you're apt to be more careful about justifying it to yourself up front, which, intuitively at least, should produce more sound reasoning and better decisions.

Also, I have to disagree with the suggestion that analysts testifying in court should give only the facts. As I understand it, the purpose of using an expert witness in a case is to provide interpretation for facts that the jury cannot be expected to interpret themselves. My impression is that any other type of forensic evidence (ballistics, DNA evidence, autopsy results, etc.) is presented by an expert witness who also gives their interpretation of what that evidence means in the context of the case; I think that's just as appropriate for digital forensic evidence.

I think that in other types of forensics, there is a great deal of latitude given to trained and approved investigators to exercise their judgement in acquiring and handling evidence, but there are also guidelines and procedures in place to try to ensure that people are accountable for that judgement. My impression is that that's probably a good direction for digital forensics to take as well. The difficulty is in getting away from trying to take an algorithmic approach to the problem, which seems to be the natural response for computer science people (myself included).

-Tim

Keydet89 said...

Tim,

With regards to the analyst or an expert witness testifying, I can see your point with regards to interpretation. You're right, facts do need to be interpreted for the jury, and one would hope that the attorneys on each side would see that this is done appropriately.

With regards to documentation, yes, that's what I'm pushing for.

Harlan

Anonymous said...

This is Craig Ball. Sorry this may come up as anonymous, but I didn't want to open yet another account to leave a post. Many worthwhile thoughts here, if only to get us reflecting that forensically sound isn't necessarily something we can precisely define for every situation.

Even in the familiar instance of imaging a hard drive removed from a system (such that live acquisition isn't a factor), we understand that a couple of unreadable sectors may prevent the acquisition of those sectors or even of larger contiguous blocks, depending upon the tool employed. I doubt any experienced examiner would contend that an acquisition isn't forensically sound because there were a few bad sectors (assuming, of course, that the tools and methodology employed were competent and the examiner got the contents of as many sectors as was technically feasible under the circumstances). We typically wouldn't send a drive in for clean room extraction or scanning tunneling electron microscopy for a bad sector or two, unless (possibly) later analysis showed that those sectors just happened to fall within the case making/breaking evidence. Life is too short. Cost is too high, and the realities are that reliable data can be gleaned from the remaining 234 million-odd sectors on that 120GB drive!
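
The bad-sector case above can be sketched in a few lines. This is only an illustration under stated assumptions, not the behavior of any particular imaging tool: `read_sector` is a hypothetical callable standing in for a device read, and the 512-byte sector size and zero filler are assumed. The idea is that unreadable sectors are filled with a known value and their positions recorded, so the introduced data stays distinguishable from what was actually copied.

```python
SECTOR_SIZE = 512  # assumed sector size in bytes


def image_with_bad_sector_map(read_sector, total_sectors, filler=b"\x00"):
    """Copy sectors one at a time; fill unreadable ones and record them.

    read_sector is a hypothetical callable: it takes a sector index and
    returns SECTOR_SIZE bytes, or raises OSError when the sector cannot
    be read.
    """
    image = bytearray()
    bad_sectors = []
    for index in range(total_sectors):
        try:
            data = read_sector(index)
        except OSError:
            # Known filler keeps later offsets aligned with the source.
            data = filler * SECTOR_SIZE
            bad_sectors.append(index)
        image.extend(data)
    return bytes(image), bad_sectors


# Sector count behind the "234 million-odd" figure for a 120 GB drive,
# taking GB as 10**9 bytes and assuming 512-byte sectors.
sectors_on_120gb = 120 * 10**9 // SECTOR_SIZE
```

The returned list of bad sector indexes is the kind of record that would go into the acquisition documentation alongside the image itself.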

If we are compelled to alter the data using best available methods, I think it's important that we be able to quantify the extent of the footprints we've left at the scene. Just as at a physical crime scene, sometimes you have to stomp around. But just as at a physical crime scene, you choose your steps with care, monitor and record the route you take, and (as much as possible) make sure you're not stomping on critical evidence. For acquisition of an offline drive, we can demand exacting standards, in part because it's not that difficult for a properly trained examiner to meet those standards. When you're talking about data acquisition from an old fax machine or a failing drive, you do the very best you can and stand ready to fully explain and justify your actions. Most importantly of all, you've got to know the limits of the data you've acquired; that is, where and how is it rendered unreliable due to changes in acquisition or other factors (i.e., the media may have supplied a byte of data for the time value but the OS doesn't use that byte, so the time displayed is fanciful).

What I want to avoid is diluting the standard for competent action for those media and circumstances where we can and should demand a rigorous standard (e.g., offline hard drives). In our efforts to accommodate circumstances necessitating compromise (live acquisition of data in volatile memory), we shouldn't strive for a one-size-fits-all definition if that opens the door to sloppier forensics.

Michael Murr said...

So my response (4 comments back) was responding to both Harlan and Richard. I'll respond to Harlan's first since it is the shorter of the two :)

Harlan, in regards to documentation (it appears) we both agree that documentation is a "must", but that the documentation requirement doesn't belong in the definition of forensically sound duplicate.

In regards to Richard's suggested wording of "...except to the minimum extent necessary to obtain the evidence", there are two side effects (introduced by the wording) that I wanted to avoid. First it doesn't differentiate between data alteration that is inherent to the acquisition, and explicit data alteration. Second, the wording also sets a relatively high standard as to what tools can generate a "forensically sound duplicate".

If we go with the standard "except to the minimum extent necessary", then what tools actually meet this? Running dd under Windows Forensic Toolchest makes a larger footprint than just running dd by itself. So does this mean that Windows Forensic Toolchest cannot yield forensically sound duplicates as it doesn't alter the source evidence "to the minimum extent necessary"? Also, Craig brings in a good point (and this was contained in Harlan's post as well) that if you document what data was altered/lost/introduced (and can defend your actions) then you should be on solid ground, regardless of the "minimum extent necessary" wording.

The differentiation between the two types of data alteration (inherent and explicit) in my original wording was "A forensically sound duplicate is obtained in a manner that may inherently (due to the acquisition tools, techniques, and process) alter the source evidence, but does not explicitly alter the source evidence." I left this intentionally broad with the implicit understanding that there will be documentation to go with the duplicate and so as to not restrict the use of tools that collect valid evidence but don't have the smallest footprint.

I think an analogy might clarify things a bit. Consider a person that has been stabbed and is lying dead on the ground. Collecting a specimen of blood is analogous to creating an image of a hard disk (or RAM). In both cases the actual collection of the evidence needs to preserve the original evidence as much as is reasonable. In both cases, there also needs to be supporting documentation, but the documentation itself doesn't determine whether the collection contaminated/invalidated the evidence. In addition, documenting your path through the crime scene is part of a larger strategy for evidence collection.

Keydet89 said...

Mike and Craig,

Excellent points. I fully agree with Craig's point about not diluting "the standard for competent action"...the wording specifies and implies many important aspects of the community, such as "competent".

I also agree that there should not be an effort for a one-size-fits-all definition or standard, for the reason Craig mentioned.

Mike, I do think that the documentation requirement (a "must") does need to be associated with the definition in some direct manner, although I agree that it does not need to be in the actual definition.

Harlan

Anonymous said...

Mike,

> Collecting a specimen of blood is analogous to creating an image of a
> hard disk (or RAM). <

An interesting analogy which is made possible only because the subject is conveniently dead. We might imagine what would be the consequences if the subject were alive and we collected all the evidence. :-) Historically, digital evidence has most often been introduced under a document production theory; hence the need to show that a particular document is a true and accurate copy of the original. I doubt however that any serious forensic technician would consider a blood sample to be a copy of the original. A sample is not a copy, it is a *sample*. People who apply any of these definitions to the "live" response scenario inevitably end up focusing on a subset of the available evidence, namely, the evidence that they collected. If a computer is running when you arrive on the scene you *will* throw away at least some evidence irrespective of what course of action you follow. A better question would be how to collect reliable evidence since clearly forensic duplication is not always an option.

- Rossetoecioccolato.

Michael Murr said...

Oops, I hit the login and publish when I meant to hit preview... Here's the last post (which I deleted so I could fix some typos):

Here's another approach...

Let's break this down into two components, the collection (i.e. copying of bits by use of some software [or potentially hardware] tool) and the requirements for admissibility into court.

The only real requirements for the collection tool are:
1) The tool creates an accurate representation of the source.
2) Any introduced data is distinguishable from the representation of the source.

Tools can be tested for these types of requirements. Now I'm not advocating sloppy forensics, just focusing this section solely on copying bits from one place to another.
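
One simple way to test the first requirement is to hash the source and the duplicate and compare the results. The sketch below is a minimal illustration, assuming both can be opened as binary streams and that SHA-256 is an acceptable hash; the function names are made up for this example and do not come from any specific forensic tool. It does not address the second requirement, which depends on how a tool stores its own metadata.

```python
import hashlib
import io


def stream_digest(stream, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def duplicate_matches_source(source_stream, duplicate_stream):
    """Return True when both streams hash to the same value."""
    return stream_digest(source_stream) == stream_digest(duplicate_stream)


# Usage with in-memory stand-ins for a source device and its image.
source = io.BytesIO(b"\x00\x01\x02" * 1000)
duplicate = io.BytesIO(b"\x00\x01\x02" * 1000)
print(duplicate_matches_source(source, duplicate))
```

Reading in chunks keeps memory use flat regardless of image size, which matters once the streams are real disk images rather than in-memory buffers.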

Dealing with the requirements for admissibility into court is a different story. I don't think there are any hard-and-fast rules/laws that say that the original evidence cannot be altered as a result of the collection in order for the evidence to be admitted. There are a number of guidelines and best-practices related to digital evidence collection that are used to help with the issues surrounding evidence admissibility.

I'm a little leery of creating a definition for evidence collection that includes best practices and guidelines as requirements unless you say something like:
"A forensically sound duplicate is an accurate representation of the source where any introduced data is distinguishable from the representation of the source and follows these best practices and guidelines when applicable: ..." Making a guideline or best-practice mandatory in a definition doesn't make much sense since it's a guidline and/or best practice not a hard-and-fast rule. By wording a definition in this way (or other similar manner) we are keeping the actual collection distinct from the admissibility in court.

As far as to what guidelines and best practices to include, I think the one that has been a real stickler is the alteration of the source evidence. Perhaps rewording "minimum data alteration" into something more abstract along the lines of "evidence preservation"? For example: "The collection should take (all) reasonable steps to preserve the source." What is considered "reasonable" can vary due to the medium and circumstances.

Also, I'm curious as to what people would consider as the goal for a definition of "forensically sound duplicate"? Are we just trying to get something that is likely to be admissible in court? Are we striving for scientific rigor? Are we going for just a definition or an actual standard? The more I'm reading our posts, the more I'm starting to think we've all got different goals.

Michael Murr said...

Sigh... I could have sworn I hit the preview button again... :/

At any rate, (ignoring the obvious guidline instead of guideline typo) I did want to add something after the sentence "What is considered "reasonable" can vary due to the medium and circumstances." If we want to come up with a list of what is "reasonable" for various medium/circumstance combinations, would we really just in essence be applying other standards/guidelines/best practices for evidence preservation, or listing hard-and-fast technical requirements?

Keydet89 said...

Michael,

IMHO, we have been well on our way to a pretty solid definition. For practical purposes, splitting the collection from the requirement for court admissibility may be a good approach, but I don't see that there's any need to go back and rewrite the definition.

Perhaps removing "duplication" from the definition would be appropriate.

George/Rossetoecioccolato makes a good point...any time you're dealing with a live system, you're going to end up "throwing away at least some evidence". This is true to an extent, even if you do nothing...the Order of Volatility is a good guide post for this. Once you start interacting with the system, this becomes more pronounced. However, as we've discussed, implementation needs to be separated from the definition.

I would be very interested in any wording that Rossetoecioccolato might suggest.

Finally, admissibility in court should always be the goal of a definition or process such as what we've been discussing. We should start with the definition...the actual standard is going to be something else entirely, and I believe it may end up being more of a process and education than an actual hard and fast standard.

Harlan

Keydet89 said...

I'd like to make a suggestion with regards to this thread...is it possible to move this to another forum? I can easily see how the TaoSecurity blog is not necessarily going to scale easily.

In the meantime, I'm going to put some thought into coming up with some actual wording, reworking or simply redoing what we've got so far. After posting my previous response, I took a look at terms such as "evidence", "evidence dynamics", etc., and I'm beginning to wonder if "duplication" should even be part of the definition.

Michael Murr said...

I'd be quite interested in the removal of "duplication", going down this path can lead to some interesting alternatives.

As far as hosting this discussion, we can move to forensicfocus.com, or I can add another forum over at libforensics.org (I already have the software in place). Another possibility is to set up a (temporary?) mailing list.

Any preferences?

Michael Murr said...

[It's a bad night for me and the preview button]

There is also the possibility of making this a sourceforge project (using sourceforge to host the mailing lists, forums, etc.)

Craig said...

I applaud the point made about sampling versus replication. A bit of blood may be sufficiently complete for DNA testing, but you couldn't perform an autopsy. Most of my work is akin to autopsy. If the body is contaminated in collection, the autopsy is complicated by the presence or absence of potentially revealing evidence. Likewise, a single bit flip caused by the wrong cable may be enough to make a big block of intelligible data inaccessible to me. If I can't validate the streams by hashing, just how much difference is present and what are the boundaries of change? The definition we accept shouldn't open the door to new avenues of cross-examination that might be avoided by what Michael aptly terms "scientific rigor."
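
One way to put bounds on "how much difference is present" is to hash fixed-size blocks rather than the whole stream, so a mismatch can be narrowed to specific blocks. The sketch below is an illustration only: it assumes two same-length images are available to compare (for example, two acquisitions of the same media), and the 4096-byte block size and function names are arbitrary choices for this example.

```python
import hashlib


def block_digests(data, block_size=4096):
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def differing_blocks(reference, candidate, block_size=4096):
    """Return indexes of blocks whose digests differ between two images.

    Assumes both byte strings have the same length; a length mismatch is
    itself a finding, so it is raised as an error rather than guessed at.
    """
    if len(reference) != len(candidate):
        raise ValueError("images differ in length")
    ref = block_digests(reference, block_size)
    cand = block_digests(candidate, block_size)
    return [i for i, (a, b) in enumerate(zip(ref, cand)) if a != b]
```

A single flipped bit then shows up as one differing block index instead of an unexplained whole-image hash mismatch, which gives the examiner a concrete boundary to report.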

I suggest that we direct our efforts in support of "scientific rigor" rather than simply "admissibility." The standard for admissibility generally is low, and for expert opinions concerning scientific evidence, the touchstone is (essentially) "scientific rigor." The Daubert standard addresses matters such as peer review and acceptance within the discipline; consequently, if we focus sufficiently on the scientific rigor, admissibility will follow. It's not enough just to get the evidence in. We need standards tending to insulate the evidence from attack on process--standards grounded on the integrity of collection and replicability/verifiability.

I think we are on the right track separating the standard for collection from analysis and admissibility. Ideally, two able forensic examiners will look at the same data and reach the same conclusions (what we mean often when we say "the ones and zeroes don't lie"). But, now-and-then there's room for interpretation, and honest differences of opinion between reasonable people.

With collection, the process should be objective, and improvisation greeted with skepticism. I anticipate that the roles of data collector and examiner will soon diverge, especially in my area, civil litigation. I expect to see collection responsibilities devolve upon less expensive data harvest technicians, trained in collection but not analysis. We've seen this evolution in other forensic disciplines, where the person who collects the blood or trace isn't the analyst. As civil litigation demands broader preservation duties than those of review and production, it won't make economic sense for forensic examiners to be imaging drives or collecting other onsite data.

If I'm right, and the collector won't share the examiner's skill set, the definition needs to be rigorous to ensure integrity of the process. It must be as objective as possible, leaving little room for compromise of the evidence. Right now, our experience and credibility tend to carry the day in court. When we are relying upon collection by someone else, all we have is the objective integrity of the process and the credentials of the collector as, e.g., a Certified Collection Technician.

So the question is not what standard will we demand of ourselves, but what rigor will we demand of a stranger if our sworn testimony hinges on those efforts?

Keydet89 said...

Craig,

Good points, all.

Mike, Libforensics.org or Sourceforge...either one. You seem to have access to one, and are more familiar than I with the other.

Pick one.

Harlan