Progress in science is sometimes made by great discoveries. But science also advances when we learn that something we believed to be true isn't. When solving a jigsaw puzzle, progress can be stymied by a wrong piece wedged into a key place.
In the scientific and political debate over global warming, the latest wrong piece may be the hockey stick, the famous plot (shown below), published by University of Massachusetts geoscientist Michael Mann and colleagues. This plot purports to show that we are now experiencing the warmest climate in a millennium, and that the earth, after remaining cool for centuries during the medieval era, suddenly began to heat up about 100 years ago, just at the time that the burning of coal and oil led to an increase in atmospheric levels of carbon dioxide.
I talked about this at length in my December 2003 column. Unfortunately, discussion of this plot has been so polluted by political and activist frenzy that it is hard to dig through it to reach the science. My earlier column was largely a plea to let science proceed unmolested. But the very importance of the issue has made careful science difficult to pursue.
But now a shock: Canadian scientists Stephen McIntyre and Ross McKitrick have uncovered a fundamental mathematical flaw in the computer program that was used to produce the hockey stick. In his original publications of the stick, Mann purported to use a standard method known as principal component analysis, or PCA, to find the dominant features in a set of more than 70 different climate records.
But it wasn't so. McIntyre and McKitrick obtained part of the program that Mann used, and they found serious problems. Not only does the program not do conventional PCA, but it handles data normalization in a way that can only be described as mistaken.
Now comes the real shocker. This improper normalization procedure tends to emphasize any data that do have the hockey stick shape, and to suppress all data that do not. To demonstrate this effect, McIntyre and McKitrick created some meaningless test data that had, on average, no trends. This method of generating random data is called Monte Carlo analysis, after the famous casino, and it is widely used in statistical analysis to test procedures. When McIntyre and McKitrick fed these random data into the Mann procedure, out popped a hockey stick shape!
That discovery hit me like a bombshell, and I suspect it is having the same effect on many others. Suddenly the hockey stick, the poster-child of the global warming community, turns out to be an artifact of poor mathematics. How could it happen? What is going on? Let me digress into a short technical discussion of how this incredible error took place.
In PCA and similar techniques, each of the (in this case, typically 70) different data sets has its average subtracted (so that it has a mean of zero) and is then multiplied by a number that makes its average variation around that mean equal to one; in technical jargon, we say that each data set is normalized to zero mean and unit variance. In standard PCA, each data set is normalized over its complete data period; for the key climate data sets that Mann used to create his hockey stick graph, this was the interval 1400-1980. But the computer program Mann used did not do that. Instead, it forced each data set to have zero mean over the period 1902-1980, and to match the historical records for this interval. This is the period when the historical temperature is well known, so the procedure does guarantee the most accurate temperature scale. But it completely screws up PCA. PCA gives the most weight to the data sets with the highest variance, and the Mann normalization procedure tends to give very high variance to any data set with a hockey stick shape. (Such data sets have zero mean only over the 1902-1980 period, not over the longer 1400-1980 period.)
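The effect described above can be sketched in a few lines of Python. This is a toy illustration, not Mann's actual program: the two synthetic "proxy" series, the size of the trend, and the helper names are all made up for the example; only the 1400-1980 record length and the 1902-1980 calibration window come from the text. The point is that normalizing over the short window leaves a hockey-stick series with a large leftover offset over 1400-1901, and PCA (computed via an SVD of the data matrix) sees that offset as variance.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1400, 1981)   # 1400-1980 inclusive
cal = years >= 1902             # the 1902-1980 calibration window

# Two synthetic proxy series: one trendless, one with a hockey-stick rise
# confined to the calibration era.
flat = rng.normal(0.0, 1.0, years.size)
stick = rng.normal(0.0, 1.0, years.size)
stick[cal] += np.linspace(0.0, 4.0, cal.sum())

def short_center(x):
    # Normalize to zero mean and unit variance using ONLY the calibration
    # window, as the text describes -- not the full 1400-1980 period.
    return (x - x[cal].mean()) / x[cal].std()

def power(x):
    # SVD-based PCA ranks series by mean square, so a large leftover offset
    # outside the calibration window counts as variance.
    return np.mean(short_center(x) ** 2)

print(power(flat))    # trendless series: stays near 1
print(power(stick))   # hockey-stick series: inflated well above 1
```

Under full-period normalization both series would have mean square exactly 1; under short-centering the hockey stick's shaft sits far from zero, roughly doubling its apparent variance, so it dominates the leading principal component.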
The net result: the principal component will have a hockey stick shape even if most of the data do not.
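The Monte Carlo experiment the column describes can be sketched as follows. This is an illustration, not McIntyre and McKitrick's actual code: the AR(1) noise model, the number of trials, and the "stick index" score are choices made for the example; the 70 series, the 1400-1980 span, and the 1902-1980 window come from the text. Trendless persistent noise is fed through both the short-centered procedure and conventional full-period centering, and the first principal component is scored by how far its calibration-era "blade" departs from the rest of the series.

```python
import numpy as np

n_years, n_series = 581, 70        # years 1400-1980, ~70 proxy records
cal = np.arange(n_years) >= 502    # last 79 years stand in for 1902-1980

def red_noise(rng, phi=0.9):
    # Persistent (AR(1)) noise with no underlying trend -- "meaningless"
    # data in the sense of the text.
    x = np.zeros((n_years, n_series))
    e = rng.normal(0.0, 1.0, (n_years, n_series))
    for t in range(1, n_years):
        x[t] = phi * x[t - 1] + e[t]
    return x

def pc1(X):
    # First principal component of the data matrix, via SVD.
    u, s, _ = np.linalg.svd(X, full_matrices=False)
    return u[:, 0]

def stick_index(pc):
    # How far the calibration-era "blade" departs from the series mean,
    # in units of the series' spread (a made-up hockey-stick score).
    return abs(pc[cal].mean() - pc.mean()) / pc.std()

short_hsi, full_hsi = [], []
for seed in range(20):
    data = red_noise(np.random.default_rng(seed))
    short_hsi.append(stick_index(pc1(data - data[cal].mean(axis=0))))
    full_hsi.append(stick_index(pc1(data - data.mean(axis=0))))

# Averaged over trials, short-centering pulls a hockey-stick shape out of
# trendless noise far more strongly than conventional centering does.
print(np.mean(short_hsi) > np.mean(full_hsi))
```

The mechanism is the one the column explains: after short-centering, each random series keeps an offset over the pre-calibration years, and the SVD lines those offsets up into a leading component with a flat shaft and a bent blade, even though no individual series has any trend built in.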