Fig. 57.1: The number of weather stations with temperature data in the Northern Hemisphere since 1700 according to Berkeley Earth.
There are four major problems with global temperature data.
1) It is not spread evenly
Only about 10% of all available data covers the Southern Hemisphere (compare Fig. 57.2 below with Fig. 57.1 above), while in the Northern Hemisphere over half the data comes from the USA alone (as shown in Fig. 57.3). In addition, there is no reliable temperature data covering the oceans from before 1998, when the Argo programme for a global array of 3000 autonomous profiling floats was proposed. The Argo programme has since been used to measure ocean temperature and salinity on a continuous basis down to depths of 1000 m across most of the oceans between the polar regions, but that means we only have reliable ocean data for roughly the last 20 years.
The result is that only land-based data is available before 1998, and this tends to cluster around urban areas. To deal with this clustering, climate scientists resort to techniques such as gridding, weighting and homogenization.
Gridding involves creating a virtual grid of points across the Earth's surface, usually spaced 1° of longitude or latitude apart; the resolution is limited by two factors: computing power and data coverage. As there are unlikely to be any weather stations exactly at these grid points, except by coincidence, virtual station records are created at these points by averaging the temperatures from the nearest real stations. This averaging is not an equal one. Instead, each station is usually weighted according to its distance from the grid point (although even stations 1000 km away can be included) and its correlation with the mean of all the other datasets. This process of weighting based on correlation is often called homogenization.
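The basic mechanics can be sketched in a few lines of Python. This is a deliberately simplified illustration rather than the actual Berkeley Earth algorithm: the station data structure, the inverse-distance weighting formula and the 1000 km cutoff are all assumptions made for the purpose of the example.

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in km."""
    r = 6371.0
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dlat, dlon = p2 - p1, np.radians(lon2 - lon1)
    a = np.sin(dlat / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlon / 2) ** 2
    return 2 * r * np.arcsin(np.sqrt(a))

def grid_point_series(grid_lat, grid_lon, stations, max_km=1000.0):
    """Create a virtual station record at a grid point as a weighted
    average of nearby real stations. Each station is weighted by its
    inverse distance and by its correlation with the mean of all
    stations (the 'homogenization' step described above).
    `stations` is a list of dicts: {"lat", "lon", "series"}."""
    series = np.array([s["series"] for s in stations])
    mean_series = series.mean(axis=0)
    records, weights = [], []
    for s in stations:
        d = haversine_km(grid_lat, grid_lon, s["lat"], s["lon"])
        if d > max_km:
            continue  # too far away to contribute
        corr = np.corrcoef(s["series"], mean_series)[0, 1]
        records.append(s["series"])
        weights.append(max(corr, 0.0) / max(d, 1.0))
    return np.average(np.array(records), axis=0, weights=np.array(weights))
```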
Fig. 57.2: The number of weather stations with temperature data in the Southern Hemisphere since 1750 according to Berkeley Earth.
2) It does not go back far enough in time
As I have shown previously, the earliest temperature records are from Germany (see Post 49) and the Netherlands (see Post 41) and go back to the early 18th century. However, there is no Southern Hemisphere temperature data before 1830, and only two datasets in the USA from before 1810. The principal reason is that the amount of available data is positively correlated with economic development: as more countries have industrialized, the number of weather stations has increased. Unfortunately, climate change involves measuring the change in temperature since a previous epoch or reference period (over, say, 100 or 200 years), and in those times the availability of data was much, much worse. So improving the quality of current data cannot improve the quality of the measured temperature change; this will always be constrained by how much data we had in the distant past.
Fig. 57.3: The number of weather stations with temperature data in the United States since 1700 according to Berkeley Earth.
3) The data is often subject to measurement errors
Over time weather stations are often moved, instruments are upgraded, and the local environment changes as well. The conventional wisdom is that all these changes have profound impacts on the temperature records and need to be compensated for. This is the rationale behind data adjustments. The problem is that none of this is really justified, as I will demonstrate in this post.
If there are problems with the temperature data at different times and locations, these issues should be randomly distributed. That means any adjustments to correct these errors should be randomly distributed as well. This in turn means that averaging a sufficiently large number of stations for a regional or global trend should result in the cancellation of both the errors and the adjustments. As I have shown in many previous posts here, this does not happen. In fact, in many cases the adjustments can add (or subtract) as much warming (or cooling) to the mean trend as is present in the original data, or even more, particularly in the Southern Hemisphere. For examples see my posts for Texas, Indonesia, PNG, the South Pacific (East and West), NSW, Victoria, South Australia, Northern Territory and New Zealand among others.
One contentious issue is the problem of station moves or changes to the local environment. The conventional wisdom is that both will strongly affect the temperature record. Frankly, I disagree. In my view, those who say they will are failing to understand what is being measured. Consider one example: what would happen if a weather station were moved from open ground to a spot under a large tree? Does the increased shade reduce the temperature? The answer is no, because the thermometer is already in the shade inside its Stevenson screen. Moreover, the thermometer is measuring air temperature, not the temperature of the ground, and the air is continuously circulating, so the air under the tree is at virtually the same temperature as the air above open ground. The one change that does affect temperature is altitude: air (almost) always gets colder as you gain height.
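As a rough illustration of the altitude effect, the standard environmental lapse rate of about 6.5 °C per kilometre can be applied; the station elevations below are invented for the example.

```python
LAPSE_RATE_C_PER_KM = 6.5  # standard environmental lapse rate (approximate)

def altitude_adjustment(old_elevation_m, new_elevation_m):
    """Expected change in recorded temperature when a station is moved
    to a different elevation (negative means it will read colder)."""
    return -LAPSE_RATE_C_PER_KM * (new_elevation_m - old_elevation_m) / 1000.0

# A station moved 200 m uphill should read about 1.3 degC colder:
print(altitude_adjustment(300.0, 500.0))  # -1.3
```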
4) There just isn't enough data
There are currently about 40,000 weather stations across the globe. This sounds like a lot, but it is only about one for every 13,000 square kilometres of the Earth's surface. That means that, on average, these stations are over 110 km apart, or more than 1° of longitude or latitude. Even today that is probably the bare minimum required to measure a global temperature. Unfortunately, in previous times the availability of data was much, much worse.
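The arithmetic behind these figures is easy to verify:

```python
import math

EARTH_SURFACE_KM2 = 5.1e8  # total surface area of the Earth
N_STATIONS = 40_000

area_per_station = EARTH_SURFACE_KM2 / N_STATIONS  # ~12,750 km2 per station
mean_spacing_km = math.sqrt(area_per_station)      # ~113 km between stations
print(round(area_per_station), round(mean_spacing_km))
# One degree of latitude spans about 111 km, so the average spacing is
# slightly more than 1 degree.
```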
Of course, there are now alternatives. One is to use satellites, but again this only provides data back to about 1980. Another problem with satellites is that their orbits generally do not cover the polar regions. And finally, they can only see what is emitted at the top of the atmosphere (TOA). So they can measure temperatures at the TOA, but measuring surface temperatures can be problematic, as the infra-red radiation emitted by the surface is largely absorbed by carbon dioxide and water vapour in the lower atmosphere.
Over the course of the last eleven months I have posted 56 articles to this blog. Over half of these have analysed the surface temperature trends in various countries, states and regions. In virtually every case, the trend I have determined by averaging station anomalies has differed from the conventional, widely publicized versions. These differences are largely due to homogenization and data adjustments.
Homogenization
There are two potential issues with homogenization. Firstly, there are more urban stations than rural ones, because stations tend to be located near where people live. Secondly, urban stations tend to be closer together, so they are more likely to be strongly correlated. As homogenization uses correlation to weight the influence of each station's data in the mean temperature for the local region, the influence of urban stations will be stronger.
Both potential issues therefore favour urban stations over rural ones. Yet it is the urban stations that are more likely to be biased, due to the urban heat island (UHI) effect. The result is that this bias is often transmitted to the less contaminated rural stations, thereby biasing the whole regional trend upwards. This is why I do not use homogenization in my analysis. The other problematic intervention is data adjustment.
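For contrast, the unweighted anomaly averaging used throughout these posts can be sketched as follows. The array layout is an assumption made for the example; anomalies are taken relative to each station's own 1981-2010 monthly means, as in the figure captions below, and every station counts equally.

```python
import numpy as np

def station_anomalies(monthly, months, ref=(1981, 2010)):
    """monthly: (n_stations, n_months) array of raw temperatures, with NaN
    marking missing data; months: list of (year, month) for each column.
    Returns anomalies relative to each station's own reference-period
    monthly means."""
    anoms = np.full(monthly.shape, np.nan)
    for m in range(1, 13):
        cols = [i for i, (y, mo) in enumerate(months) if mo == m]
        ref_cols = [i for i in cols if ref[0] <= months[i][0] <= ref[1]]
        baseline = np.nanmean(monthly[:, ref_cols], axis=1, keepdims=True)
        anoms[:, cols] = monthly[:, cols] - baseline
    return anoms

def regional_mean(anoms):
    """Equal-weight average across stations, ignoring missing months."""
    return np.nanmean(anoms, axis=0)
```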
Data adjustments
The rationale for data adjustments is that they are needed to compensate for measurement errors that may arise from changes of station site, instrument or method. The justification for using them is that climate scientists believe they can identify the weak points in the data. Some might call that hubris. The alternative viewpoint is that these adjustments are unnecessary, and that averaging a sufficiently large sample will erase the errors automatically via regression to the mean. I will now demonstrate that with real data.
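Before turning to the real data, the statistical intuition is easy to check with synthetic numbers: give every station the same underlying trend plus its own random errors, and the error in the regional mean shrinks roughly as 1/√N.

```python
import numpy as np

rng = np.random.default_rng(0)
true_trend = np.linspace(0.0, 1.0, 200)  # a notional 1 degC warming

for n_stations in (3, 20, 100):
    noise = rng.normal(0.0, 0.5, size=(n_stations, true_trend.size))
    regional = (true_trend + noise).mean(axis=0)  # equal-weight average
    rms_error = np.sqrt(np.mean((regional - true_trend) ** 2))
    print(n_stations, round(rms_error, 3))  # falls roughly as 1/sqrt(N)
```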
Fig. 57.4: The 5-year average temperature trends for Austria, Hungary and Czechoslovakia together with best fit lines for the interval 1791-1980 (m is the gradient in °C per century). The Austria and Czechoslovakia data are offset by +2°C and -2°C respectively to aid clarity.
In three recent posts I calculated and examined the temperature trends for Czechoslovakia (Post 53), Hungary (Post 54) and Austria (Post 55). The five-year moving averages of the temperature trends in these three countries are shown in Fig. 57.4 above. What is immediately apparent is the high degree of similarity that these trends display, particularly after 1940. This is indicated by the red and black arrows, which mark the positions of coincident peaks and troughs respectively in the three datasets.
It turns out that all three datasets are also very similar to that of Germany (see Post 49). This is shown in Fig. 57.5 below. This is not surprising as the four countries are all close neighbours. What is surprising is that there are not greater differences between the four datasets, particularly given the number of adjustments that Berkeley Earth felt needed to be made to the individual station records for these countries when undertaking their analysis.
Fig. 57.5: The 5-year average temperature trends for Austria, Hungary and Czechoslovakia compared to that of Germany.
To understand the potential impact of these adjustments, consider this. The temperature trend for Austria in Fig. 55.1 of Post 55 was determined by averaging up to 26 individual temperature records. Yet the total number of adjustments made to those records by Berkeley Earth in the interval 1940-2013 was more than 90. That is more than three adjustments per temperature record, or at least one for every 21 years of data. Yet if the adjustments are ignored and the data for each country is simply averaged, the results for Austria, Czechoslovakia, Germany and Hungary are virtually identical. This leads to the following conclusions and implications.
Conclusions
1) The data in Fig. 57.5 indicates that the temperature trends for Austria, Czechoslovakia, Germany and Hungary are virtually identical after 1940. The probability that this is due to random chance is minimal. It therefore implies that the true temperature trends for these countries from 1940 onwards are indeed virtually identical. This is not a total surprise, as they are all close neighbours.
2) As the individual temperature anomaly time series used to generate these trends are not identical, and all are likely to have data irregularities from time to time, it follows that those irregularities are highly likely to be random in both their size and their distribution across the various time series. This means that when the series are averaged to create the regional trend, their irregularities will partially cancel. If the number of sites is large enough, the cancellation will be almost total. This is what is seen in Fig. 57.5, and it is why all the trends shown are virtually identical post-1940.
Implications
1) If the temperature trends for Austria, Czechoslovakia, Germany and Hungary are virtually identical after 1940, as conclusion #1 suggests, then it is reasonable to suppose that they should be virtually identical before 1940 as well. But they aren't, as the data in Fig. 57.5 illustrates. This is because the trends in each case are based on the average of too few individual anomaly time-series for the irregularities from each station time-series to be fully cancelled by those from the remainder. Before 1940 there are only sixteen valid temperature records in Austria, three in Hungary and three in Czechoslovakia. Germany, on the other hand, has about thirty.
2) However, if it is true that all the temperature trends for Austria, Czechoslovakia, Germany and Hungary before 1940 should be the same, then there is no reason why we cannot combine them all into a single trend. This would dramatically increase the number of individual time-series being averaged, and so reduce the discrepancy between the calculated trend and the true value. This has been done in Fig. 57.6 below, and the combination step is sketched in code after this paragraph.
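In code, the combination amounts to nothing more than stacking the national station arrays before averaging; the random placeholder arrays below merely stand in for the real anomaly data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_months = 12 * 300
# Placeholder (stations x months) anomaly arrays standing in for the four
# national datasets; NaN would mark months with no data in the real case.
countries = [rng.normal(0.0, 1.0, size=(n, n_months)) for n in (26, 3, 30, 3)]

pooled = np.vstack(countries)                # one combined station set
regional_trend = np.nanmean(pooled, axis=0)  # cf. Fig. 57.6
series_per_month = np.count_nonzero(~np.isnan(pooled), axis=0)  # cf. Fig. 57.7
```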
Fig. 57.6: The temperature trend for Central Europe since 1700. The best fit is applied to the interval 1791-1980 and has a negative gradient of -0.05 ± 0.07 °C per century. The monthly temperature changes are defined relative to the 1981-2010 monthly averages.
The data in Fig. 57.6 represents the temperature trend for the combined region of Austria, Czechoslovakia, Germany and Hungary. The trend after 1940 is the same as that seen in the individual countries, and the gradient of the best fit line for 1791-1980 more closely resembles the equivalent lines for Germany and Hungary than those of Austria and Czechoslovakia. But now we also have a more accurate trend before 1940. The question is, how much more accurate?
Fig. 57.7: The number of station time-series included in the average each month for the temperature trend in Fig. 57.6.
The data from Austria, Czechoslovakia, Germany and Hungary suggest that approximately 20 different time-series are required in the average for the irregularities in the different station time-series to almost fully cancel. The graph in Fig. 57.7 suggests that this threshold is surpassed for almost every month of every year after 1830.
Fig. 57.8: The temperature trend for Central Europe since 1700. The best fit is applied to the interval 1831-2010 and has a positive gradient of 0.62 ± 0.07 °C per century. The monthly temperature changes are defined relative to the 1981-2010 monthly averages.
If we now calculate the best fit to the data in Fig. 57.8, but only use data after 1830, we get a gradient for the trend line of 0.62 °C per century. This equates to a temperature rise since 1830 of over 1.1 °C.
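As a sanity check on those numbers, the conversion from gradient to total rise is simple; the series below is a synthetic stand-in for the real monthly anomalies of Fig. 57.8.

```python
import numpy as np

# Synthetic stand-in for the monthly anomalies, 1831-2010.
rng = np.random.default_rng(2)
years = np.arange(1831, 2011, 1 / 12)  # monthly time axis
anoms = 0.0062 * (years - 1831) + rng.normal(0.0, 0.5, years.size)

slope_per_year, _ = np.polyfit(years, anoms, 1)
print(round(100 * slope_per_year, 2))            # gradient in degC/century
print(round(slope_per_year * (2010 - 1830), 2))  # ~1.1 degC rise since 1830
```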
However, you could argue that the regional monthly average data in Fig. 57.6 is still reasonably accurate all the way back to 1780, as it continues to have over a dozen temperature records incorporated into the average for every month of every year after that time. In that case the temperature rise since 1780, as indicated by the best fit line in Fig. 57.9, is actually less than 0.5 °C. This suggests that we can be reasonably confident that temperatures in central Europe between 1750 and 1830 were fairly similar to those of today.
Summary
What I have demonstrated here is that adjustments to the raw temperature data are unnecessary and can be avoided simply by averaging sufficient datasets (i.e. more than about 20).
I have also shown that it is highly likely that the mean temperature in central Europe is not much higher now than it was at the start of the Industrial Revolution (1750-1830).
Disclaimer: No data were harmed or mistreated during the writing of this post. This blog believes that all data deserve to be respected and to have their values protected.