Saturday, September 01, 2007

Is Digital Security "Risk" a Knightian Uncertainty?

I've subscribed to the Economist for over ten years, and it's been worth every penny. Today I noticed the following in an article called The Long and Short of It:

The second paper suggests that traders face “Knightian uncertainty”, or risks that cannot be measured.

Hmm, what is this "Knightian uncertainty"? I found the following excerpt from Risk, Uncertainty and Expected Utility:

Much has been made of Frank H. Knight's (1921: p.20, Ch.7) famous distinction between "risk" and "uncertainty". In Knight's interpretation, "risk" refers to situations where the decision-maker can assign mathematical probabilities to the randomness which he is faced with. In contrast, Knight's "uncertainty" refers to situations when this randomness "cannot" be expressed in terms of specific mathematical probabilities. As John Maynard Keynes was later to express it:

"By `uncertain' knowledge, let me explain, I do not mean merely to distinguish what is known for certain from what is only probable. The game of roulette is not subject, in this sense, to uncertainty...The sense in which I am using the term is that in which the prospect of a European war is uncertain, or the price of copper and the rate of interest twenty years hence... About these matters there is no scientific basis on which to form any calculable probability whatever. We simply do not know." (J.M. Keynes, 1937)

Nonetheless, many economists dispute this distinction, arguing that Knightian risk and uncertainty are one and the same thing. For instance, they argue that in Knightian uncertainty, the problem is that the agent does not assign probabilities, and not that she actually cannot, i.e. that uncertainty is really an epistemological and not an ontological problem, a problem of "knowledge" of the relevant probabilities, not of their "existence".

Going in the other direction, some economists argue that there are actually no probabilities out there to be "known" because probabilities are really only "beliefs". In other words, probabilities are merely subjectively-assigned expressions of beliefs and have no necessary connection to the true randomness of the world (if it is random at all!).

Nonetheless, some economists, particularly Post Keynesians such as G.L.S. Shackle (1949, 1961, 1979) and Paul Davidson (1982, 1991) have argued that Knight's distinction is crucial. In particular, they argue that Knightian "uncertainty" may be the only relevant form of randomness for economics - especially when that is tied up with the issue of time and information.

In contrast, situations of Knightian "risk" are only possible in some very contrived and controlled scenarios when the alternatives are clear and experiments can conceivably be repeated -- such as in established gambling halls. Knightian risk, they argue, has no connection to the murkier randomness of the "real world" that economic decision-makers usually face: where the situation is usually a unique and unprecedented one and the alternatives are not really all known or understood. In these situations, mathematical probability assignments usually cannot be made. Thus, decision rules in the face of uncertainty ought to be considered different from conventional expected utility.
(emphasis added)
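
To make the distinction concrete, here is a minimal sketch. The roulette figures are the standard ones for a European wheel (37 pockets, a single-number bet pays 35 to 1). The second half is purely hypothetical: the actions, scenarios and payoffs are invented for illustration, and maximin is only one of several decision rules discussed for situations without probabilities, not necessarily the one the authors quoted above have in mind.

```python
from fractions import Fraction

# Knightian "risk": a 1-unit bet on a single number in European roulette.
# The probabilities are known exactly, so an expected value can be computed.
p_win = Fraction(1, 37)
expected_value = p_win * 35 + (1 - p_win) * (-1)
print("roulette expected value per unit staked:", expected_value)

# Knightian "uncertainty": outcomes can be imagined, but no probabilities
# can be assigned to them. Payoffs below are hypothetical.
payoffs = {
    "expand": {"boom": 50, "stagnation": 5, "war": -80},
    "hold": {"boom": 10, "stagnation": 5, "war": -10},
}

# Maximin rule: choose the action whose worst-case payoff is least bad.
maximin_choice = max(payoffs, key=lambda action: min(payoffs[action].values()))
print("maximin choice:", maximin_choice)
```

The first calculation needs probabilities; the second deliberately uses none, which is why the two kinds of decision rule can disagree.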

The Wikipedia entry on Uncertainty is also interesting.

That is really fascinating. It sounds like one school of thought believes the real world may be too complex to model. It also sounds like stepping foot into the world of appreciating uncertainty is a huge undertaking, given the amount of prior research.

4 comments:

Alex Hutton said...

When I ran into Knight's work some time ago, at first it, too, seemed to point me into a downward spiral of "we can't know what we don't know".

Note that the key distinction between the two schools of thought centers around this statement you quote:

some economists argue that there are actually no probabilities out there to be "known" because probabilities are really only "beliefs". In other words, probabilities are merely subjectively-assigned expressions of beliefs

This, interestingly, is the opposite of the position taken by modern Bayesian probability theorists, who state that uncertainty and degrees of belief can be measured or expressed as probabilities.

And perhaps this is where you and I miscommunicate. FAIR does not say Risk=X where X is any precise amount. FAIR (and other Bayesian risk expressions) say that Risk is a probability - that is, a statement of belief. So when a business person asks "how much risk is there in non-encrypted email" we can give some answer that is, at least, defensible.

Note also the disagreement between the Frequentist (or objectivist) and the Bayesian (subjectivist, who actually consider themselves more objective than the frequentist) concerning uncertainty.

Consider the following thought experiment.

A Bayesian and an Objectivist are asked to analyze the results of an experiment. The experimentalist needs to know the value of a particular parameter from an experimental measurement.

Someone does the experiment, collects the data, and sends one copy to a Bayesian and another to an Objectivist (Frequentist).

The Bayesian is told the true - but unknown - value can't be below 100 and can't be above 6,000, because the experimental apparatus cannot record data outside these values. The Bayesian assigns a simple prior distribution (probability) for the parameter value. They decide to use an uninformative, flat prior distribution with a lower bound of 100 and an upper bound of 6,000. This prior includes all the information they know about the parameter.

So where's the subjectivity? There is none. The Bayesian is forced to decide and quantify what they believe the estimate will be (between 100 and 6,000).

The Bayesian estimate for the parameter is 5,800 with an uncertainty (one standard deviation in the posterior) of 5,200. The PDF tells us the probability for the estimate is very low (because the uncertainty is very high). The Bayesian simply says we know very little about the parameter because the parameter's true - but unknown - value is as likely (within one standard deviation) to be 600 as it is to be 11,000. Such a result could be explained by data with an extremely low information content. The Bayesian concludes data with improved information content must be collected.

The objective, frequency-of-occurrence analysis says the parameter value is 6,200 +/- 800. The Objectivist was also told the limits of the apparatus. The Objectivist must conclude that a parameter value of 6,200 is physically impossible. A rational Objectivist would then subjectively decide (based on their prior knowledge) that since the experimentalist told them this value is impossible, the data must be bad and the experiment should be redone. While this conclusion is rational, it is subjective. The Objectivist is not qualified to test, calibrate or evaluate the performance of the apparatus. The Objectivist does not know if they received the right data (they do not know the facts about the chain of custody). Their knowledge of the apparatus and the data is hearsay and would not be allowed in an objective court of law. The objective result is rejected based on an ad hoc (subjective) assumption about the measurement (it doesn't make sense, so it must be wrong).

There is nothing more for the Objectivist to do. However, the Bayesian has an option. Their result could also be explained if data quality actually is acceptable, but the prior distribution was misassigned. They decide to change the prior distribution assignment. The prior bounds are widened to run from 0 to 8,000. The Bayesian has quantitatively included additional objective information gleaned from the first analysis. This is not a subjective decision. It is based on logic. Specifically, the first result could be due to low quality data, or it could be due to a misassigned prior.

The Bayesian analysis with the new prior distribution gives an estimate of 6,200 with an uncertainty of 1,600. The Bayesian concludes that the assignment of the prior bounds in the initial analysis is flawed. This is not a subjective conclusion.
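
The mechanics of the two analyses can be sketched in a few lines. This is only an illustration under loudly stated assumptions: the five measurements, the Gaussian noise model and the noise level `sigma` are all made up, and the sketch does not try to reproduce the numbers quoted in this comment. It shows how a flat prior on [100, 6,000] pins the estimate against the upper bound, while the wider prior on [0, 8,000] lets the data speak.

```python
import math

def flat_prior_posterior(data, sigma, lower, upper, n_grid=20001):
    """Posterior mean and standard deviation on a grid, for a flat prior on
    [lower, upper] and a Gaussian likelihood with known noise level sigma."""
    n = len(data)
    xbar = sum(data) / n
    step = (upper - lower) / (n_grid - 1)
    theta = [lower + i * step for i in range(n_grid)]
    # With a flat prior, the posterior is proportional to the likelihood
    # inside the bounds and zero outside them.
    weights = [math.exp(-0.5 * n * (xbar - t) ** 2 / sigma ** 2) for t in theta]
    total = sum(weights)
    mean = sum(w * t for w, t in zip(weights, theta)) / total
    var = sum(w * (t - mean) ** 2 for w, t in zip(weights, theta)) / total
    return mean, math.sqrt(var)

# Hypothetical measurements (sample mean 6,200) and assumed noise level.
data = [6900.0, 5400.0, 6600.0, 5800.0, 6300.0]
sigma = 800.0

narrow_mean, narrow_sd = flat_prior_posterior(data, sigma, 100, 6000)
wide_mean, wide_sd = flat_prior_posterior(data, sigma, 0, 8000)
print("prior [100, 6000]:", round(narrow_mean), "+/-", round(narrow_sd))
print("prior [0, 8000]:  ", round(wide_mean), "+/-", round(wide_sd))
```

Under the narrow prior the estimate can never exceed 6,000, whatever the data say, which is the symptom that prompts the Bayesian to revisit the bounds.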

When the experimentalist learns the results, they decide to double check everything. To their embarrassment, they discover data from another apparatus that can measure values from 0 to 10,000 was mistakenly sent to both parties.

The Objectivist rejected their result based on an ad hoc decision about the experiment. The Bayesian quantitatively refined their prior information based on the results from their initial analysis. Both parties computed a reasonable parameter estimate. Both parties knew something was wrong based on what they knew. However the Objectivist could only say "This parameter estimate doesn't make sense. It can't be right. I'm concerned."

The Bayesian can say something objective. Once the probability density distributions from both analyses are normalized, the Bayesian can compute and quantitatively compare the probabilities of the two parameter estimates. In this hypothetical example, the probability of the prior bounds being right in the second analysis would be many orders of magnitude greater than that of the initial prior bounds. The Bayesian rejected their initial analysis based on a quantitative comparison of two computed probabilities.
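
One common way to carry out such a comparison is to compute the evidence (marginal likelihood) of the data under each prior and take the ratio. The sketch below assumes a Gaussian likelihood summarized by a sample mean `xbar` and a standard error `se`; both values, and the function names, are hypothetical. How large the ratio comes out depends entirely on how sharply the data exclude the narrower range, so the "many orders of magnitude" above only appears when the standard error is small.

```python
import math

def normal_cdf(z):
    # Standard normal CDF written with erfc, which stays accurate in the tails.
    return 0.5 * math.erfc(-z / math.sqrt(2))

def flat_prior_evidence(xbar, se, lower, upper):
    """Evidence for a flat prior on [lower, upper], up to a constant that is
    the same for every prior: the Gaussian likelihood mass that falls inside
    the bounds, divided by the width of the prior."""
    mass = normal_cdf((upper - xbar) / se) - normal_cdf((lower - xbar) / se)
    return mass / (upper - lower)

def evidence_ratio(xbar, se, narrow, wide):
    # Ratio above 1 means the data favor the wide prior bounds.
    return flat_prior_evidence(xbar, se, *wide) / flat_prior_evidence(xbar, se, *narrow)

narrow, wide = (100, 6000), (0, 8000)
# Hypothetical summaries: the same sample mean with a loose and a tight standard error.
loose_ratio = evidence_ratio(6200, 360, narrow, wide)
tight_ratio = evidence_ratio(6200, 40, narrow, wide)
print("ratio with se=360:", loose_ratio)
print("ratio with se=40: ", tight_ratio)
```

With the loose standard error the wide prior is only mildly favored; with the tight one the narrow prior is all but ruled out.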

The advantage of FAIR is that it quantitatively incorporates what is not known. What is not known must be assigned. One person may subjectively decide that the frequency of "low and slow" attacks that are difficult to detect in a useful timeframe (or the number of 0-days, etc.) is much higher than anyone imagines it to be. Another person may subjectively decide that if attacks invisible to IDS detection cannot be measured, then they are irrelevant. Both subjective assumptions can be quantitatively evaluated using FAIR. What other method can quantitatively and objectively compare risk based on two very different subjective points of view?

Kees Leune said...

I will start out by saying that I am not (very) familiar with FAIR, and that I have generally developed a profound dislike of statistics; even though I do acknowledge the added value that it might have, in certain circumstances.

@Alex: you write:

Both parties knew something was wrong based on what they knew. However the Objectivist could only say "This parameter estimate doesn't make sense. It can't be right. I'm concerned."


In my opinion, that is precisely the correct response. Coming from a relational databases background, I hold the notion of Integrity very high. Data should be complete and correct.

The use of statistical probability in the example almost seems to be a way to defend the use of data that is known to be incorrect. That can surely never be a proper use of statistics? Assigning a high level of uncertainty to data that is known to be incorrect, seems to be wrong. It is not uncertain that it is correct; it is quite certain to be incorrect.

The other observation you make:

A rational objectivist would then subjectively decide (based on their prior knowledge) that since the experimentalist told them this value is impossible, the data must be bad and the experiment should be redone. While this conclusion is rational, it is subjective. The Objectivist is not qualified to test, calibrate or evaluate the performance of the apparatus.

It does not matter. The objectivist knows that something is wrong because she is getting readings that cannot be read. Whether those readings are the result of a bad measurement, faulty equipment, or errors in communications does not matter. Ignoring the fact that the information is wrong, for whatever reason, is incorrect and in some cases may even be dangerous.

But the most important consideration is: for what reason is the contested reading used? For some information, it is not so relevant how accurate it is, as long as a decent trend can be deduced. For others, information that is 99% certain to be correct might be too uncertain.

As you might deduce from my position, I usually consider myself to be an experimentalist/objectivist.

Measuring is knowing, provided you know what you measure. (sounds a lot better in Dutch; it rhymes:)

Alex said...

"Measuring is knowing, provided you know what you measure. (sounds a lot better in Dutch; it rhymes:)"

And that's the point of the above exercise. If you cannot/will not account for the unknowns, you cannot/ will not be measuring. You are sweeping your assumptions under the carpet - rather than accounting for them.

It is an American example, I apologize, but the Objectivist states that "Barry Bonds is the all-time home run champion with 758 home runs" - there's nothing more objective than the number of balls flying over a fence. The Subjectivist, however, states that Barry is an admitted steroid user, and that Hank Aaron played in an era where the hitters faced distinct disadvantages (the height of the mound, racial inequality, etc). He then attempts to account for those variables before making a statement (and would probably seek to normalize Sadaharu Oh's 868 home runs in Japan). These are all important pieces of prior information, but their value cannot be as empirically satisfactory as "ball flies over fence".

Gunnar said...

Frank Knight has a famous rebuttal to Kelvin

“If you can’t measure, measure anyway.”

The risk and uncertainty distinction is important, and it makes me think of the value investor Mohnish Pabrai, whose investing philosophy is characterized as low risk, high uncertainty - "heads I win, tails I don't lose much"; and this might be something we can strive for in infosec. It is next to impossible to get low uncertainty in most enterprise systems.

"Dhandho. In Sanskrit, it literally means “Endeavours that create wealth”.

When applying Dhandho to investment, Pabrai added a few more principles. One is to invest in companies with high uncertainty. If there are many possible outcomes for a company in trouble, its share price will suffer – even if in fact most of the outcomes are positive. Markets often in practice treat high uncertainty as a high risk of a big loss. The two are not the same."

Also, you can read Risk, Uncertainty, and Profit online:

http://www.google.com/search?client=safari&rls=en&q=Risk,+Uncertainty+and+Profit+by+Frank+Knight&ie=UTF-8&oe=UTF-8