Friday, August 31, 2007

Economist on Models

I intended to stay quiet on risk models for a while, but I read the following Economist articles and wanted to note them here for future reference.

From "Statistics and climatology: Gambling on tomorrow":

Climate models have lots of parameters that are represented by numbers... The particular range of values chosen for a parameter is an example of a Bayesian prior assumption, since it is derived from actual experience of how the climate behaves — and may thus be modified in the light of experience. But the way you pick the individual values to plug into the model can cause trouble...

Climate models have hundreds of parameters that might somehow be related in this sort of way. To be sure you are seeing valid results rather than artefacts of the models, you need to take account of all the ways that can happen.

That logistical nightmare is only now being addressed, and its practical consequences have yet to be worked out. But because of their philosophical training in the rigours of Pascal's method, the Bayesian bolt-on does not come easily to scientists. As the old saw has it, garbage in, garbage out. The difficulty comes when you do not know what garbage looks like.
(emphasis added)

This is only part of the article's argument. Note that the piece includes the phrase "derived from actual experience of how the climate behaves." The "Bayesian prior assumption[s]" so often cited elsewhere do not appear to me to be derived from experience at all, if you take experience to mean something grounded in recent historical evidence rather than opinion. This willingness to rely on arbitrary inputs only aggravates the problems cited by the Economist.

The accompanying piece "Modelling the climate: Tomorrow and tomorrow" further explains some of the problems I've described previously.

While some argue about the finer philosophical points of how to improve models of the climate (see article), others just get on with it. Doug Smith and his colleagues at the Hadley Centre, in Exeter, England, are in the second camp and they seem to have stumbled on to what seems, in retrospect, a surprisingly obvious way of doing so. This is to start the model from observed reality.

Until now, when climate modellers began to run one of their models on a computer, they would “seed” it by feeding in a plausible, but invented, set of values for its parameters. Which sets of invented parameter-values to use is a matter of debate. But Dr Smith thought it might not be a bad idea to start, for a change, with sets that had really happened...

[T]he use of such real starting data made a huge improvement to the accuracy of the results. It reproduced what had happened over the courses of the decades in question as much as 50% more accurately than the results of runs based on arbitrary starting conditions.

Hindcasting, as this technique is known, is a recognised way of testing models. The proof of the pudding, though, is in the forecasting, so Dr Smith plugged in the data from two ten-day periods in 2005 (one in March and one in June), pressed the start button and crossed his fingers.
(emphasis added)

This is exactly what I've been saying. Use "observed reality" and not "plausible, but invented" opinions or "arbitrary starting conditions" and I'll be happy to see what the model produces.
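
To make the hindcasting idea concrete, here is a minimal sketch in Python. It is a toy, not the Hadley Centre's method: the one-variable relaxation model, its coefficients, the "observations" (synthetic values generated with a fixed random seed), and every function name are assumptions invented for illustration. It only shows the shape of the test: run the same model over a past period from an observed starting value and from an invented one, then compare each run against what was actually recorded.

```python
import math
import random


def step(x, a=0.9, b=1.0):
    """Advance a toy relaxation model one step (hypothetical, not a climate model)."""
    return a * x + b


def run_model(x0, n_steps):
    """Run the toy model from x0 and return n_steps values, starting with x0."""
    values = [x0]
    for _ in range(n_steps - 1):
        values.append(step(values[-1]))
    return values


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length series."""
    total = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(total / len(observed))


def make_synthetic_observations(n_steps, seed=0):
    """Stand-in for a historical record: model output plus small noise.

    This is synthetic data, used only because no real record is available here.
    """
    rng = random.Random(seed)
    truth = run_model(2.0, n_steps)
    return [x + rng.gauss(0.0, 0.1) for x in truth]


observed = make_synthetic_observations(20)

# Hindcast 1: seed the model with the first observed value.
hindcast_observed_start = run_model(observed[0], len(observed))

# Hindcast 2: seed the model with a plausible but invented value.
hindcast_invented_start = run_model(8.0, len(observed))

rmse_observed_start = rmse(hindcast_observed_start, observed)
rmse_invented_start = rmse(hindcast_invented_start, observed)

print("RMSE, observed start:", round(rmse_observed_start, 3))
print("RMSE, invented start:", round(rmse_invented_start, 3))
```

In this toy the run seeded from the observed value tracks the record more closely, because the invented starting point needs many steps to decay toward the same trajectory. A real model is far less forgiving, which is why the article treats forecasting, not hindcasting, as the real proof.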


jbmoore said...

Follow the above link, read the essay and decide for yourself.

If you plug arbitrary values into an equation, doesn't that make you a mathematician?