One of my many criticisms of climate scientists is their use of adjustments to temperature data to supposedly correct for errors in the measurements, corrections which in my opinion are probably not needed for errors that are not real. These corrections come in two main types: homogenization and breakpoint adjustments.
In the case of homogenization, records from neighbouring stations (and the definition of what constitutes a neighbouring station can be somewhat variable) are used to create an average expected temperature for that location, with differences in latitude and elevation compensated for during the process. This homogenization is used to infill missing monthly data points in each record. But it is also used to define the monthly reference temperatures (MRTs) that then define the monthly anomalies.
Breakpoint adjustments, or changepoint adjustments (see this PDF from NOAA) as they are alternatively called, are supposedly used to correct for false trends in the data. This generally means adjusting the slope of all sets of station data so that they look more or less the same, and more importantly have the same general trends as those quoted by the IPCC. So a station like Jakarta Observatorium in Indonesia (Berkeley Earth ID:155660), which actually has a very large warming trend of 1.84 °C per century, and has had since 1870 (and therefore has a total warming since 1870 of over 2.6 °C), gets adjusted down so that its trend is only 0.95 °C per century. This is because its warming is too high to fit with the IPCC narrative of only 1.0 °C of warming in the Southern Hemisphere since 1900.
On the other hand, Dubbo (Darling Street) in New South Wales (Berkeley Earth ID:152082), which also has temperature data dating from about 1870, but instead has a negative warming (or cooling) trend of -0.32 °C per century, gets adjusted up so that its trend becomes +0.56 °C per century, and thus closer to the accepted "real" value of 1.0 °C per century.
If this all sounds a bit fishy, then welcome to the wonderful and wacky world of climate science, where nothing is quite as it seems. Central to all these data corrections is the assumption that most of the underlying data is reliable, but more importantly, that it is possible to distinguish the bad data from the good data. The question is, is any of this true? Is most of the data good? Can we really detect the small amount of bad data? And can the good data actually be so unreliable, or subject to so many unknown hidden variables, that it looks like bad data? One way to test this is by comparing data from stations that are very close neighbours.
As I pointed out in Post 41, the Netherlands has a number of stations that are located very close to a neighbouring station. In fact I have identified nine pairs of stations in the Netherlands where both stations have over 480 months of data, where there is significant temporal overlap of their data (i.e. they have a lot of months where both stations have active data), and where their spatial separation is less than 1.6 km (or about one mile for those dinosaurs from the USA who can't do metric). This allows direct comparisons of data to be made for stations that are, or should be, virtually identical. It is worth noting here that for this purpose the Netherlands has another unique advantage: it is very flat. That means that we do not need to worry about temperature differences occurring between stations due to differences in altitude.
In order to test the reliability of these temperature records I will apply three tests to their data. The first will look at the difference in the mean temperature of each set of station data in the pair. Ideally this should be zero, but there may be a systematic offset between stations due to local geography that could be significant. Such a difference would not necessarily raise question-marks over the validity of the data.
The second test will look at the difference in monthly temperatures between the two stations over time. The issue here is how much randomness there is in the temperature difference, and how significant it is. This will be measured by calculating the standard deviation of the temperature difference. Again, I would expect to see a low value here, with noise levels in this data being at least a factor of √30 less than the accuracy of the daily mean temperature of each station (which I would estimate conservatively at 1 °C). Overall, this suggests that the standard deviation of this dataset should be less than 0.2 °C, and probably less than 0.1 °C.
Finally, I will look at the trend of the difference in temperature over time. If this is significantly large and comparable to the trends seen in the anomaly data for either station, that would indicate significant reliability problems with this type of data.
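For readers who want to try this kind of comparison themselves, the three tests can be sketched in Python roughly as follows. This is only an illustration and not the code used to produce the figures in this post: the function name compare_station_pair, its arguments and the synthetic test data are all assumptions, and the trend error is taken from the usual least-squares formula for the error in a gradient.

```python
import numpy as np

def compare_station_pair(years, temps_a, temps_b):
    """Return (mean difference, std of difference, trend, trend error).

    years   : time of each monthly reading in decimal years
    temps_a : monthly mean temperatures of the first station (°C)
    temps_b : monthly mean temperatures of the second station (°C)
    NaN marks a missing month; only months present in both records are used.
    The trend and its error are returned in °C per century.
    """
    years = np.asarray(years, dtype=float)
    diff = np.asarray(temps_a, dtype=float) - np.asarray(temps_b, dtype=float)
    ok = ~np.isnan(diff)
    t, d = years[ok], diff[ok]

    mean_diff = d.mean()          # test 1: systematic offset
    std_diff = d.std(ddof=1)      # test 2: scatter of the difference

    # Test 3: least-squares trend of the difference (time is centred
    # before fitting to keep the fit numerically well behaved).
    tc = t - t.mean()
    slope, intercept = np.polyfit(tc, d, 1)
    residuals = d - (slope * tc + intercept)
    slope_err = residuals.std(ddof=2) / (tc.std(ddof=0) * np.sqrt(tc.size))

    return mean_diff, std_diff, 100.0 * slope, 100.0 * slope_err

# Synthetic example (made-up data): 480 months, a fixed 0.10 °C offset
# and a relative drift of 0.2 °C per century between the two stations.
t = 1950 + np.arange(480) / 12.0
seasonal = 10.0 + 7.0 * np.sin(2 * np.pi * t)
station_b = seasonal
station_a = seasonal + 0.10 + 0.002 * (t - t.mean())
station_a[0] = np.nan             # one missing month in station A

mean_diff, std_diff, trend, trend_err = compare_station_pair(t, station_a, station_b)
print(f"mean {mean_diff:.2f} °C, std {std_diff:.3f} °C, "
      f"trend {trend:.2f} ± {trend_err:.2f} °C per century")
```

With real station records the two temperature arrays would first need to be aligned on a common set of months, which is why missing months are handled with NaN here.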
The results of these three tests are summarized below for each of the nine pairs of stations.
Case 1: Soesterberg
Fig 43.1: The difference in monthly mean temperatures for two stations at Soesterberg. The mean of the monthly differences is 0.17 °C, the standard deviation of the differences is 0.27 °C, and the trend in the differences is -0.29 ± 0.10 °C per century.
The two stations at Soesterberg are BE-92835 (trend of +2.55 °C per century) and BE-139138 (trend of +2.29 °C per century). According to Berkeley Earth they are 1.06 km apart.
Case 2: Schiphol
Fig 43.2: The difference in monthly mean temperatures for two stations at Schiphol. The mean of the monthly differences is 0.09 °C, the standard deviation of the differences is 0.17 °C, and the trend in the differences is 0.017 ± 0.056 °C per century.
The two stations at Schiphol are BE-18517 (trend of +2.53 °C per century) and BE-157005 (trend of +2.12 °C per century). According to Berkeley Earth they are 1.2 km apart.
Case 3: Valkenberg
Fig 43.3: The difference in monthly mean temperatures for two stations at Valkenberg. The mean of the monthly differences is 0.18 °C, the standard deviation of the differences is 0.20 °C, and the trend in the differences is -0.07 ± 0.09 °C per century.
The two stations at Valkenberg are BE-174609 (trend of +2.29 °C per century) and BE-157004 (trend of +1.65 °C per century). According to Berkeley Earth they are 0.25 km apart.
Case 4: Eindhoven
Fig 43.4: The difference in monthly mean temperatures for two stations at Eindhoven. The mean of the monthly differences is 0.10 °C, the standard deviation of the differences is 0.20 °C, and the trend in the differences is 0.20 ± 0.06 °C per century.
The two stations at Eindhoven are BE-18478 (trend of +2.31 °C per century) and BE-156991 (trend of +2.06 °C per century). According to Berkeley Earth they are 1.42 km apart.
Case 5: Volkel
Fig 43.5: The difference in monthly mean temperatures for two stations at Volkel. The mean of the monthly differences is 0.10 °C, the standard deviation of the differences is 0.23 °C, and the trend in the differences is 0.20 ± 0.07 °C per century.
The two stations at Volkel are BE-92832 (trend of +2.31 °C per century) and BE-156995 (trend of +2.10 °C per century). According to Berkeley Earth they are 0.81 km apart.
Case 6: Gilze Rijen
Fig 43.6: The difference in monthly mean temperatures for two stations at Gilze Rijen. The mean of the monthly differences is 0.11 °C, the standard deviation of the differences is 0.30 °C, and the trend in the differences is -0.01 ± 0.09 °C per century.
The two stations at Gilze Rijen are BE-18485 (trend of +2.41 °C per century) and BE-156994 (trend of +1.93 °C per century). According to Berkeley Earth they are 0.16 km apart.
Case 7: Deelen
Fig 43.7: The difference in monthly mean temperatures for two stations at Deelen. The mean of the monthly differences is 0.11 °C, the standard deviation of the differences is 0.25 °C, and the trend in the differences is -0.13 ± 0.09 °C per century.
The two stations at Deelen are BE-18506 (trend of +2.50 °C per century) and BE-157001 (trend of +1.78 °C per century). According to Berkeley Earth they are 1.62 km apart.
Case 8: Rotterdam
Fig 43.8: The difference in monthly mean temperatures for two stations at Rotterdam. The mean of the monthly differences is 0.21 °C, the standard deviation of the differences is 0.21 °C, and the trend in the differences is -0.26 ± 0.14 °C per century.
The two stations at Rotterdam are BE-18497 (trend of +2.17 °C per century) and BE-18496 (trend of +1.80 °C per century). According to Berkeley Earth they are 0.89 km apart.
Case 9: Hoek Van Holland
Fig 43.9: The difference in monthly mean temperatures for two stations at Hoek Van Holland. The mean of the monthly differences is 0.07 °C, the standard deviation of the differences is 0.29 °C, and the trend in the differences is 0.50 ± 0.18 °C per century.
The two stations at Hoek Van Holland are BE-156999 (trend of +1.95 °C per century) and BE-18500 (trend of +1.62 °C per century). According to Berkeley Earth they are 0.87 km apart.
Summary
The three measures I have used to assess the reliability of the temperature records are the difference in the mean temperatures of various pairs of stations, the standard deviation of that difference in monthly temperatures between the two stations, and the magnitude of the trend difference in monthly temperatures. It is important to point out that the data used in the analysis shown in the figures above was the raw monthly temperature data, and not the monthly anomaly data. Overall, the results can be summarized as follows.
1) The difference in mean temperatures
The data shown above for nine pairs of stations indicates that in each case the mean temperature of the two stations can differ by up to 0.2 °C. In fact the mean difference is about 0.13 °C. The question we then need to answer is, is this difference in line with expectations based on known measurement accuracies for the actual data? Or is it determined by other factors such as random variations in the local climate or systematic differences due to differing local environments?
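As a quick check of the arithmetic, the mean differences quoted in the nine figure captions above can be averaged directly. The short Python sketch below simply restates those caption values; the variable names are chosen purely for illustration.

```python
# Mean monthly temperature difference (°C) for each station pair,
# copied from the captions of Fig 43.1 to Fig 43.9.
mean_diffs = {
    "Soesterberg": 0.17,
    "Schiphol": 0.09,
    "Valkenberg": 0.18,
    "Eindhoven": 0.10,
    "Volkel": 0.10,
    "Gilze Rijen": 0.11,
    "Deelen": 0.11,
    "Rotterdam": 0.21,
    "Hoek Van Holland": 0.07,
}

average_diff = sum(mean_diffs.values()) / len(mean_diffs)
largest_diff = max(mean_diffs.values())

print(f"average offset: {average_diff:.2f} °C")   # 0.13
print(f"largest offset: {largest_diff:.2f} °C")   # 0.21
```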
The expected error in the difference in mean temperatures comes from two main sources. One arises from the error in calculating the mean temperature of each station, while the second comes from the expected temperature difference due to their spatial separation.
In order to estimate the first error we start with the original measurement error in the mean daily temperature. This should be less than 1 °C. Then, as each station has over 480 months of data, and each month is itself the average of approximately 30 daily readings, the total number of daily readings being averaged for each station will be N ≥ 30 × 480. This implies that N ≥ 14400. Now statistical theory states that the error in measuring the mean temperature of a particular station over N readings should be a factor of √N less than the error in a single daily mean temperature measurement. So, this component of the error should be less than 1/120 of 1 °C, in other words less than about 0.008 °C. Combining the error from the second station will increase this error by a factor of √2 to give 0.012 °C.
The second error component can be estimated by looking at how the global mean temperature changes with latitude. At the equator mean temperatures are about 25 °C, while around the Arctic Circle they drop to near zero. This implies that mean temperatures drop by about 1 °C for every 300 km of latitude. As the two stations in each station pair are never more than about 1.5 km apart, this implies a maximum difference in temperature due to location of about 0.005 °C.
Combining the two errors above (by summing their squares) gives a combined maximum expected error of 0.013 °C. This is an order of magnitude less than what we observe. This suggests the difference in the mean temperatures is too high to be solely due to measurement uncertainties, even if we allow for differences in local geographical location. It seems likely that local environment differences are the dominant factor here, but these will probably be in the form of fixed temperature offsets that should not impact significantly on the anomaly data over time. If they do, then there will be evidence for this in the form of excessive differences in the trends.
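The error budget above can be reproduced with a few lines of Python. This is just a worked version of the numbers in the text, and it assumes, as the √N rule does, that the errors in the individual daily readings are independent. The variable names are illustrative only.

```python
import math

daily_error = 1.0          # °C, upper limit on the error of one daily mean
n_daily = 30 * 480         # about 30 daily readings in each of 480 months

# Error in the long-term mean of one station, then of the difference
# between two stations (two independent errors add in quadrature).
station_error = daily_error / math.sqrt(n_daily)
pair_error = math.sqrt(2) * station_error

# Expected offset from the north-south temperature gradient.
gradient = 1.0 / 300.0     # °C per km of latitude
separation = 1.5           # km, approximate maximum station separation
location_error = gradient * separation

# Combine the two components by summing their squares.
combined_error = math.hypot(pair_error, location_error)

print(f"single station: {station_error:.4f} °C")
print(f"station pair:   {pair_error:.4f} °C")
print(f"location:       {location_error:.4f} °C")
print(f"combined:       {combined_error:.3f} °C")   # 0.013
```

Set against the observed average offset of about 0.13 °C, the combined figure is roughly ten times smaller, which is the order-of-magnitude gap described above.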
2) The standard deviation
The mean standard deviation of the monthly temperature differences for the nine pairs of stations shown in the figures above is 0.24 °C. While this is much less than the standard deviation of the monthly anomalies of individual stations (typically about 1 °C), it is still significant.
At the start of this post I suggested that 0.2 °C should be a more likely upper limit for the standard deviation, based on the measurement accuracy of the daily mean temperatures, and the number of daily readings that combine to form the monthly mean temperature. This will be heavily dependent on the accuracy of the mean daily temperature, though.
If the daily mean temperature measurements have an error or uncertainty of 1 °C, then combining 30 of them into a monthly mean will decrease the error or uncertainty for the monthly mean by a factor of √30. However, then comparing the monthly means of two different stations will increase the error in the temperature difference by √2, so overall, the error in the difference in monthly temperatures should be a factor of √15 less than the error in a single mean daily temperature. This is approximately what we see.
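The √15 factor can be checked with a small simulation. The sketch below assumes that the daily errors are independent and normally distributed with a standard deviation of 1 °C, which is a simplification made purely for illustration; real daily errors at neighbouring stations need not behave like this. The seed and the number of simulated months are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)    # fixed seed so the run is repeatable

daily_error = 1.0     # °C, assumed error of a single daily mean
days = 30             # daily readings per monthly mean
months = 20_000       # number of simulated months (arbitrary)

# Monthly mean error for each station: the average of 30 daily errors.
station_a = rng.normal(0.0, daily_error, size=(months, days)).mean(axis=1)
station_b = rng.normal(0.0, daily_error, size=(months, days)).mean(axis=1)

# Scatter of the difference between the two monthly means.
simulated = (station_a - station_b).std(ddof=1)
expected = daily_error / np.sqrt(15)

print(f"simulated: {simulated:.3f} °C")
print(f"expected:  {expected:.3f} °C")
```

The expected value of 1/√15 is about 0.26 °C, which is the figure to set against the mean standard deviation of 0.24 °C found for the nine station pairs.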
3) The long term trend of the temperature difference
Of the three test results, this is probably the most surprising. While one might expect adjacent stations to experience a relative offset in their local temperatures, or differences due to statistical fluctuations over time, generally one would expect their temperature trends to follow each other. Yet the data shown above suggests otherwise.
Overall, the various station pairs exhibited a wide range of trends for their difference in monthly temperatures over time, as illustrated in the figures above. The mean trend seen for the first five pairs of stations (ignoring sign) is approximately 0.15 °C per century. This seems much higher than I would intuitively expect, but is it?
The difference in the trends is likely to be related to the uncertainty in the trends for the anomalies of each station dataset. These depend on the standard deviation of the residuals and vary inversely with the length of the dataset. For any best fit or trend line the error in the gradient can be estimated by dividing the standard deviation of the residuals by the product of the standard deviation of the x-values and the square root of the number of x-values.
In this case the residual is effectively the difference in monthly mean temperatures between stations, and the x-values are the time axis in the graphs above. The standard deviation of the x-values is roughly 12 years and there are roughly 400 points, while the standard deviation of the residuals is effectively 0.24 °C. This suggests that the trend seen in the temperature difference data is likely to be in the range ±0.001 °C per year, or ±0.1 °C per century. Again this is roughly what we see, although the actual trends in the graphs shown above are about double this value, so maybe there is some additional (but relatively small) influence here due to differences in the local environment for the two stations over time.
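The estimate above, and the comparison with the trends quoted in the figure captions, can be written out as a short Python sketch. The function name slope_standard_error is hypothetical, the inputs are the rounded values used in the text, and the list of trends simply restates the nine captions.

```python
import math

def slope_standard_error(residual_sd, x_sd, n_points):
    """Error in the gradient of a best-fit line:
    sd of residuals / (sd of x-values * sqrt(number of points))."""
    return residual_sd / (x_sd * math.sqrt(n_points))

# Rounded inputs from the text: 0.24 °C scatter, 12 years, 400 points.
error_per_year = slope_standard_error(0.24, 12.0, 400)
error_per_century = 100.0 * error_per_year

# Trends in the temperature difference (°C per century), Fig 43.1 to 43.9.
trends = [-0.29, 0.017, -0.07, 0.20, 0.20, -0.01, -0.13, -0.26, 0.50]
mean_abs_first_five = sum(abs(t) for t in trends[:5]) / 5
mean_abs_all_nine = sum(abs(t) for t in trends) / len(trends)

print(f"expected trend error: {error_per_century:.2f} °C per century")
print(f"mean |trend|, first five pairs: {mean_abs_first_five:.2f}")
print(f"mean |trend|, all nine pairs:   {mean_abs_all_nine:.2f}")
```

On these numbers the typical size of the observed trends is somewhat less than twice the expected error of ±0.1 °C per century, whether the first five pairs or all nine are used.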
Conclusions
The analysis above indicates that even weather stations that are located close together can yield significantly different results from each other for their temperature trends, mean temperatures and temperature distributions over time, just through the presence of known measurement errors. These differences between nearby stations are much greater than I expected to see before I performed this analysis, but are generally consistent with the measurement data and known sources of error. What it does indicate, though, is that even the best data is not that accurate, reproducible or reliable. Given the lack of long term temperature data for many parts of the world, this raises questions over the accuracy of any climate analysis that relies on this imperfect data.