
Thursday, December 31, 2020

45. Review of the year 2020

I started this blog in May, in part to occupy my time during the Covid-19 lockdown. But I was also motivated by a growing dissatisfaction with the quality of data analysis I was witnessing in climate science, and in particular the lack of any objectivity in the way much of the data was being presented and reported. My concerns were twofold. 

The first was the drip-drip of selective alarmism with an overt confirmation bias that kept appearing in the media with no comparable reporting of events that contradicted that narrative. The worry here is that extreme events that are just part of the natural variation of the climate were being portrayed as the new normal, while events of the opposite extreme were being ignored. It appeared that balance was being sacrificed for publicity.

The second was the over-reliance of much climate analysis on complex statistical techniques of doubtful accuracy or validity. To paraphrase Lord Rutherford: if you need to use complex statistics to see any trends in your data, then you would be better off using better data. Or to put it more simply, if you can't see a trend with simple regression analysis, then the odds are there is no trend to see.

The purpose of this blog has not been to repeat the methods of climate scientists, nor to improve on them. It has merely been to set a benchmark against which their claims can be measured and tested.

My first aim has been to go back to basics: to examine the original temperature data, look for trends in that data, and apply some basic error analysis to determine how significant those trends really are. I have then compared what I see in the original data with what climate scientists claim is happening. In most cases I have found that the temperature trends in the real data are significantly smaller than those reported by climate scientists. In other words, much of the reported temperature rise, particularly in Southern Hemisphere data, results from the manipulations that climate scientists perform on the data. This implies that many of the reported temperature rises are an exaggeration.

In addition, I have tried to look at the physics and mathematics underpinning the data in order to test other possible hypotheses that could explain the observed temperature trends that I could detect. Below I have set out a summary of my conclusions so far.


1) The physics and mathematics

There are two alternative theories that I have considered as explanations of the temperature changes. The first is natural variation. The problem here is that in order to conclusively prove this to be the case you need temperature data that extends back in time for dozens of centuries, and we simply do not have that data. Climate scientists have tried to solve this by using proxy data from tree rings and sediments and other biological or geological sources, but in my opinion these are wholly inadequate as they are badly calibrated. The idea that you can measure the average annual temperature of an entire region to an accuracy of better than 0.1 °C simply by measuring the width of a few tree rings, when you have no idea of the degree of linearity of your proxy, or the influence of numerous external variables (e.g. rainfall, soil quality, disease, access to sunlight), is preposterous. But there is another way.

i) Fractals and self-similarity

If you can show that the fluctuations in temperature over different timescales follow a clear pattern, then you can extrapolate back in time. One such pattern is that resulting from fractal behaviour and self-similarity in the temperature record. By self-similarity I mean that every time you average the data you end up with a pattern of fluctuations that looks similar to the one you started with, but with amplitudes and periods that change according to a precise mathematical scaling function.

In Post 9 I applied this analysis to various sets of temperature data from New Zealand. I then repeated it for data from Australia and then again in Post 42 for data from De Bilt in the Netherlands. In virtually all these cases I found a consistent power law for the scaling parameter indicative of a fractal dimension of between 0.20 and 0.30, with most values clustered close to 0.25. The low magnitude of this scaling term suggests that the fluctuations in long term temperatures are much greater in amplitude than conventional statistical analysis would predict. 

For example, in the case of De Bilt it suggests that the standard deviation of the average 100-year temperature is a little over 0.2 °C, which means the difference between the mean temperatures of two consecutive centuries has a standard deviation of roughly 0.3 °C. There is therefore about a one in six (16%) probability that any given century is more than 0.3 °C warmer than the previous one, so a rise of that size could occur about once every 600 years purely because of natural variations in temperature, and swings of 0.6 °C between an unusually cold century and an unusually warm one are far from improbable. It also suggests that temperature variations like those we have seen in the data over the last 50 or 100 years might have been repeated frequently in the not so distant past.
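
The arithmetic behind that estimate can be checked in a few lines. The sketch below is my own illustration under the stated assumptions (independent, normally distributed century means with a standard deviation a little over 0.2 °C); it is not taken from Post 42.

```python
from math import sqrt
from statistics import NormalDist

# Assumed input (from the De Bilt discussion above): the standard deviation
# of a single 100-year mean temperature is a little over 0.2 °C.
sigma_century = 0.21

# Treating consecutive century means as independent, their difference has a
# standard deviation sqrt(2) times larger, i.e. roughly 0.3 °C.
sigma_diff = sqrt(2) * sigma_century

# One-sided probability that a given century is more than 0.3 °C warmer than
# the one before it.
p_rise = 1.0 - NormalDist(mu=0.0, sigma=sigma_diff).cdf(0.3)

print(f"s.d. of century-to-century difference: {sigma_diff:.2f} C")
print(f"P(rise > 0.3 C): {p_rise:.1%}, i.e. roughly once every {100 / p_rise:.0f} years")
```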

ii) Direct anthropogenic surface heating (DASH) and the urban heat island (UHI)

Another possible explanation for any observed rise in temperature is the heating of the environment that occurs due to human industrial activity. All energy use produces waste heat. Not only that, but all the energy we use must ultimately end up as heat and entropy: the Second Law of Thermodynamics tells us that. It is therefore inevitable that human activity must heat the local environment. The only question is by how much.

Most discussions in this area focus on what is known as the urban heat island (UHI). This is a phenomenon whereby urban areas either absorb extra solar radiation because of changes made to the surface albedo by urban development (e.g. concrete, tarmac, etc), or tall buildings trap the absorbed heat and reduce the circulation of warm air, thereby concentrating the heat. But there is another contribution that continually gets overlooked - direct anthropogenic surface heating (DASH). 

When humans generate and consume energy they liberate heat or thermal energy. This energy heats up the ground, and the air just above it, in much the same way that radiation from the Sun does. In so doing DASH adds to the heat that is re-emitted from the Earth's surface, and therefore increases the Earth's surface temperature at that location.

In Post 14 I showed that this heating can be significant - up to 1 °C in countries such as Belgium and the Netherlands with high levels of economic output and high population densities. In Post 29 I extended this idea to look at suburban energy usage and found a similar result. 
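
To give a feel for the size of this effect, here is a minimal back-of-the-envelope sketch. It is not the calculation from Post 14 itself: it simply divides an assumed annual energy consumption by an assumed land area to obtain a waste-heat flux, and converts that flux into a surface temperature increment using the linearized black-body relation ΔT ≈ Q/(4σT³). The numbers passed in are illustrative placeholders, not official statistics.

```python
SIGMA = 5.67e-8           # Stefan-Boltzmann constant (W m^-2 K^-4)
T_SURFACE = 288.0         # nominal mean surface temperature (K)
SECONDS_PER_YEAR = 3.156e7

def dash_estimate(energy_ej_per_year: float, area_km2: float) -> float:
    """Temperature increment (°C) from dissipating the given energy over the given area."""
    flux = energy_ej_per_year * 1e18 / SECONDS_PER_YEAR / (area_km2 * 1e6)  # W m^-2
    return flux / (4 * SIGMA * T_SURFACE ** 3)   # linearized black-body response

# Hypothetical example: about 3 EJ per year dissipated over about 40,000 km^2
# (roughly the scale of a small, densely populated industrial country).
print(f"Waste-heat warming estimate: {dash_estimate(3.0, 40_000):.2f} °C")
```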

What this shows is that you don't need to invoke the Greenhouse Effect to find a plausible mechanism via which humans are heating the planet. Simple thermodynamics will suffice. Of course climate scientists dismiss this because they assume that this heat is dissipated uniformly across the Earth's surface - but it isn't. And just as significant is the fact that the majority of weather stations are in places where most people live, and therefore they also tend to be in regions where the direct anthropogenic surface heating (DASH) is most pronounced. So this direct heating effect is magnified in the temperature data.

iii) The data reliability

It is taken as read that the temperature data used to determine the magnitude of the observed global warming is accurate. But is it? Every measurement has an error. In the case of temperature data it appears that these errors are comparable in magnitude to many of the effects climate scientists are trying to measure.

In Post 43 I looked at pairs of stations in the Netherlands that were less than 1.6 km apart. One might expect the two stations in such a pair to produce virtually identical datasets, but they don't. In virtually every case the fluctuations in the difference between their monthly average temperatures were about 0.2 °C. While this is consistent with the values one would expect from error analysis, it does highlight the limits to the accuracy of this data. It also raises questions about how valid techniques such as breakpoint adjustment are, given that these techniques depend on detecting relatively small differences in temperature between data from neighbouring stations.
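
The kind of comparison described here can be sketched very simply. The snippet below is my own illustration, not the code used in Post 43, and the file names and column layout are hypothetical.

```python
import pandas as pd

def difference_spread(series_a: pd.Series, series_b: pd.Series) -> float:
    """Standard deviation of the monthly difference between two station series."""
    diff = (series_a - series_b).dropna()   # keep only months present in both records
    return diff.std()

# Hypothetical usage: two nearby stations stored as CSV files with columns
# 'date' and 'tavg' (monthly mean temperature in °C).
a = pd.read_csv("station_a.csv", index_col="date", parse_dates=True)["tavg"]
b = pd.read_csv("station_b.csv", index_col="date", parse_dates=True)["tavg"]
print(f"s.d. of monthly difference: {difference_spread(a, b):.2f} °C")
```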

iv) Temperature correlations between stations

In Post 11 I looked at the product moment correlation coefficients (PMCC) between temperature data from different stations, and compared the correlation coefficients with the station separation. What became apparent was evidence of a strong negative linear relationship between the maximum correlation coefficient for temperature anomalies between pairs of stations and their separation. For station separations of less than 500 km positive correlations of better than 0.9 were possible, but this dropped to a maximum correlation of about 0.7 for separations of 1000 km and 0.3 at 2000 km.

There were also clear differences between the behaviour of the raw anomaly data and the Berkeley Earth adjusted data. The Berkeley Earth adjustments appear to reduce the scatter in the correlations for the 12-month averaged data, but do so at the expense of the quality of the monthly data. This suggests that these adjustments may be making the data less reliable, not more so. The improvement in the scatter of the Berkeley Earth 12-month averaged data is also curious. Is it because this is the data used to determine the adjustments rather than the monthly data, or is there some other reason? And what of the scatter in the data? Can we use it to measure the quality and reliability of the original data? This clearly warrants further study.


Fig. 45.1: Correlations (PMCC) for the period 1971-2010 between temperature anomalies for all stations in New Zealand with a minimum overlap of 200 months. Three datasets were studied: a) the monthly anomalies; b) the 12-month average of the monthly anomalies; c) the 5-year average of the monthly anomalies. Also studied were the equivalent for the Berkeley Earth adjusted data.



2) The data

Over the last eight months I have analysed most of the temperature data in the Southern Hemisphere as well as all the data in Europe that predates 1850. The results are summarized below.

i) Antarctica

In Post 4 I showed that the temperature at the South Pole has been stable since the 1950s. There is no instrumental temperature data before 1956 and there are only two stations of note near the South Pole (Amundsen-Scott and Vostok). Both show stable or negative trends.

Then in Post 30 I looked at the temperature data from the periphery of the continent. This I divided into three geographical regions: the Atlantic coast, the Pacific coast and the Peninsula. The first two only have data from about 1950 onwards. In both cases the temperature data is also stable, with no statistically significant trend either upwards or downwards. Only the Peninsula exhibited a strong and statistically significant warming of about 2 °C since 1945.


ii) New Zealand

Fig. 45.2: Average warming trend for long and medium stations in New Zealand. The best fit to the data has a gradient of +0.27 ± 0.04 °C per century.

In Posts 6-9 I looked at the temperature data from New Zealand. Although the country only has about 27 long or medium length temperature records, with only ten having data before 1880, there is sufficient data before 1930 to suggest temperatures in this period were almost comparable to those of today. The difference is less than 0.3 °C.


iii) Australia

Fig. 45.3: The temperature trend for Australia since 1853. The best fit is applied to the interval 1871-2010 and has a gradient of 0.24 ± 0.04 °C per century.

The temperature trend for Australia (see Post 26) is very similar to that of New Zealand. Most states and territories exhibited high temperatures in the latter part of the 19th century that then declined before increasing in the latter quarter of the 20th century. The exceptions were Queensland (see Post 24) and Western Australia (see Post 22), but this was largely due to an absence of data before 1900. While there is much less temperature data for Australia before 1900 compared to the latter part of the 20th century, there is sufficient to indicate that, as in New Zealand, temperatures in the late 19th century were similar to those of the present day.


iv) Indonesia

Fig. 45.4: The temperature trend for Indonesia since 1840. The best fit is applied to the interval 1908-2002 and has a negative gradient of -0.03 ± 0.04 °C per century.

The temperature data for Indonesia is complicated by the lack of quality data before 1960 (see Post 31). The temperature trend after 1960 is the average of between 33 and 53 different datasets, but between 1910 and 1960 it generally comprises fewer than ten. Nevertheless, this is sufficient to suggest that temperatures in the first half of the 20th century were greater than those in the latter half. This is despite the data from Jakarta Observatorium, which exhibits an overall warming trend of nearly 3 °C from 1870 to 2010 (see Fig. 31.1 in Post 31).

It is also worth noting that the temperature data from Papua New Guinea (see Post 32) is similar to that for Indonesia for the period from 1940 onwards. Unfortunately Papua New Guinea only has one significant dataset that predates 1940, so firm conclusions regarding the temperature trend in this earlier period are difficult to draw.


v) South Pacific

Most of the temperature data from the South Pacific comes from the various islands in the western half of the ocean. This data exhibits little if any warming, but does exhibit large fluctuations in temperature over the course of the 20th century (see Post 33). The eastern half of the South Pacific, on the other hand, exhibits a small but discernible negative temperature trend of between -0.1 and -0.2 °C per century (see Post 34).


vi) South America

Fig. 45.5: The temperature trend for South America since 1832. The best fit is applied to the interval 1900-1999 and has a gradient of +0.54 ± 0.05 °C per century.

In Post 35 I analysed over 300 of the longest temperature records from South America, including over 20 with more than 100 years of data. The overall trend suggests that temperatures fluctuated significantly before 1900 and have risen by about 0.5 °C since. The high temperatures seen before 1850 are exclusively due to the data from Rio de Janeiro and so may not be representative of the region as a whole.


vii) Southern Africa

Fig. 45.6: The temperature trend for South Africa since 1840. The best fit is applied to the interval 1857-1976 and has a gradient of +0.017 ± 0.056 °C per century.

In Posts 37-39 I looked at the temperature trends for South Africa, Botswana and Namibia. Botswana and Namibia were both found to have fewer than four usable sets of station data before 1960 and only about 10-12 afterwards. South Africa had much more data, but the general trends were the same. Before 1980 the temperature trends were stable or perhaps slightly negative, but after 1980 there was a sudden rise of between 0.5 °C and 2 °C in all three trends, with the largest being found in Botswana. This is not consistent with accepted theories of global warming (the rises in temperature are too large and too sudden, and do not track the rise in atmospheric carbon dioxide), and so the exact origin of these rises remains unexplained.

 

viii) Europe

Fig. 45.7: The temperature trend for Europe since 1700. The best fit is applied to the interval 1731-1980 and has a positive gradient of +0.10 ± 0.04 °C per century.

In Post 44 I used the 109 longest temperature records to determine the temperature trend in Europe since 1700. The resulting data suggests that temperatures were stable from 1700 to 1980 (they rose by less than 0.25 °C), and then rose suddenly by about 0.8 °C after 1986. The reason for this change is unclear, but one possibility is that it has occurred due to a significant improvement in air quality that reduced the amount of particulates in the atmosphere. These particulates, which may have been present in earlier years, could have induced a cooling that compensated for the underlying warming trend. Once removed, the temperature then rebounded. Even if this is true, it suggests a maximum warming of about 1 °C since 1700, much of which could be the result of direct anthropogenic surface heating (DASH) as discussed in Post 14. In countries such as Belgium and the Netherlands the temperature rise is even less than that expected from such surface heating. It is also much less than that expected from an enhanced Greenhouse Effect due to increasing carbon dioxide levels in the atmosphere (i.e. about 1.5 °C in the Northern Hemisphere since 1910). In fact the total temperature rise should exceed 2.5 °C. So here is the BIG question: where has all that missing temperature rise gone?


Wednesday, June 24, 2020

16. The story so far


 

The main purpose of this blog has been to analyse the physics behind climate change, and then to compare what the basic physics and raw data are indicating with what the climate scientists are saying. These are the results so far.

Post 7 looked at the temperature trend in New Zealand and found that the overall mean temperature actually declined until 1940 before increasing slightly up to the present day. The overall temperature change was a slight rise, but only amounting to about 0.25 °C since the mid 19th century. This is much less than the 1 °C rise climate scientists claim.

Post 8 examined the temperature trend in New Zealand in more detail and found that the breakpoint adjustments made to the data by Berkeley Earth, that were intended to correct for data flaws, actually added more warming to the trend than was in the original data.

Post 9 looked at the noise spectrum of the New Zealand data and found evidence of self-similarity and scaling behaviour with a fractal dimension of about 0.25. This implies that long-term temperature records spanning several thousand years should still show fluctuations of over 0.5 °C between the average temperatures of successive centuries, even without human intervention. In other words, a temperature rise (or fall) of up to 1 °C over a century is likely to be fairly common over time, and perfectly natural.

Post 10 looked at the impact of Berkeley Earth's breakpoint adjustments on the scaling behaviour of the temperature records and found that they had a negative impact. In other words the integrity of the data appeared to decline rather than improve after the adjustments were made.

Post 11 looked at the degree of correlation between pairs of temperature records in New Zealand as a function of their distance apart. For the original data a strong linear negative trend was observed for the maximum possible correlation between station pairs over distances up to 3000 km. But again the effect of Berkeley Earth's breakpoint adjustments to the data was a negative one. This trend became less detectable after the adjustments had been made. The one-year and five-year moving average smoothed data did become more highly correlated though.

After analysing the physics that dictate how the Sun and the Earth's atmosphere interact to set the Earth's surface temperature in Post 13, I then explored the implications of direct heating or energy liberation by humans at the Earth's surface in Post 14. Calculations of this direct anthropogenic surface heating (DASH) showed that while human energy use only contributed an average increase of 0.013 °C to the current overall global temperature, this energy use was highly concentrated. It is practically zero over the oceans and the poles, but in the USA it leads to an average increase of almost 0.2 °C. This rises to 0.3 °C in Texas and 0.5 °C in Pennsylvania. Yet in Europe the increases are typically even greater. In England the increase is almost 0.7 °C, and in the Benelux countries almost 1.0 °C. Perhaps more significantly for our understanding of retreating glaciers, the mean temperature rise from this effect for all the alpine countries is at least 0.3 °C.

Finally in Post 15 I looked at the energy requirements for sea level rise (SLR). Recent papers have claimed that sea levels are rising by up to 3.5 mm per year while NOAA/NASA satellite data puts the rise at 3.1 mm per year. These values are non-trivial but are still a long way short of the rate needed to cause serious environmental problems over the next 100 years.

In upcoming posts I will examine more of the global temperature data. But given what I have discovered so far, it would be a surprise if the results were found to be as clear cut as climate scientists claim. Contrary to what many claim, the science is not settled, and the data is open to many interpretations. That is not to say that everything is hunky dory though. Far from it.



Tuesday, June 9, 2020

11. Correlation between station temperature records

Central to statistical processes such as homogenization and breakpoint adjustment that I have mentioned in previous posts is the concept of correlation. In climate science it is used to help compare temperature records and generate regional trends, expectations or benchmarks against which the accuracy of local station records can be judged or measured (in theory).

A correlation coefficient is a mathematical measure of the degree to which two sets of data move in the same way. To put it simply, if you have two sets of data that you can plot as a scatter graph, and the points on the graph follow a straight line, then the two sets of data are perfectly correlated. If the line of points slopes upwards then the correlation coefficient, ρ, will be +1.0, and if it slopes downwards then ρ = -1.0. This rarely happens, though. Instead the points are usually scattered about a line or direction, and the more the points diverge from the line of perfect correlation (which in this instance is also the line of best fit), the lower the magnitude of the correlation coefficient will be.

Correlation coefficients are useful because the two sets of data that they compare don’t need to be of similar magnitudes, or even have the same units. It is the relative change of each that is measured and compared. And they are used a lot in climate science, particularly to compare temperature records.

As I pointed out in the last post, most climate groups use correlation coefficients to construct expected regional temperature trends against which individual temperature records are compared for the purpose of quality control (i.e. identifying measurement errors, station moves etc.). The assumption is that weather stations that are close together should be more strongly correlated than those that are further apart. But the questions that interest me are these.

(i) By how much does the degree of correlation change with increasing station separation?

(ii) What is the impact of data smoothing on the correlation coefficient?

(iii) What effects do breakpoint adjustments have on the correlation coefficient?

The first of these questions is significant because it will determine the relative weighting of each record in the regional expectation trend, and more importantly indicate how independent these stations really are from one another, or to what degree they are coupled. The second question was initially included as a curiosity, but now has greater significance given the scaling behaviour of the Berkeley Earth adjusted data highlighted in the last post. The third question is about testing the quality of Berkeley Earth adjusted data, and in particular the validity of breakpoint adjustments.

In order to answer these questions I have calculated the correlation coefficients between most combinations of the 50 New Zealand stations that have more than 180 months of temperature data. This results in a potential 1275 pairs. However, only those pairs of records where the overlap of data within a specified time-frame exceeded a certain threshold have been included. Typically, this threshold is a minimum overlap of 200 months between 1971 and 2010. As a result 660 pairs of records qualified for this analysis (or 52%). In order to determine the temperature anomaly for each set of data, the monthly reference temperatures (MRTs) were calculated for the period 1981-2000. It was found that having consistency in the choice of MRT period for all station records was a major factor in minimizing the spread of values for the correlation coefficient.

Once the qualifying stations had been determined, the correlation coefficient for each pair of stations was calculated. There are a number of different types of correlation coefficient that are widely used. I have settled for the most commonly used variant, Karl Pearson's product moment correlation coefficient (PMCC).
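
For anyone wanting to reproduce this kind of analysis, the procedure reduces to a few lines of code. The sketch below is my own reconstruction under the selection rules just described, not the original script: it assumes a dictionary of monthly anomaly series keyed by station name and a table of station coordinates, and returns the PMCC and great-circle separation for every pair with at least 200 overlapping months in the 1971-2010 window.

```python
from itertools import combinations

import numpy as np
import pandas as pd

def separation_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 6371.0 * 2 * np.arcsin(np.sqrt(a))

def pair_correlations(anomalies, coords, start="1971-01", end="2010-12", min_overlap=200):
    """PMCC versus separation for every qualifying pair of stations.

    `anomalies` maps station name -> monthly anomaly Series (datetime index);
    `coords` is a DataFrame indexed by station name with 'lat' and 'lon' columns.
    """
    rows = []
    for s1, s2 in combinations(anomalies, 2):
        pair = pd.concat([anomalies[s1], anomalies[s2]], axis=1, keys=[s1, s2])
        pair = pair.loc[start:end].dropna()
        if len(pair) < min_overlap:
            continue                       # insufficient overlap in the window
        rows.append({
            "stations": (s1, s2),
            "separation_km": separation_km(coords.loc[s1, "lat"], coords.loc[s1, "lon"],
                                           coords.loc[s2, "lat"], coords.loc[s2, "lon"]),
            "pmcc": pair[s1].corr(pair[s2]),   # Pearson product moment correlation
        })
    return pd.DataFrame(rows)
```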



 Fig. 11.1: Correlation (PMCC) between raw temperature records for the period 1971-2010 for all stations in New Zealand with a minimum overlap of 200 months.

The data used in the correlation calculation was the anomaly data, not the raw temperature data. The main reason for this is that the raw temperature data is dominated by seasonal oscillations (the MRTs) that are far bigger than either the underlying trend or the monthly anomalies. As a result, applying a correlation test to the raw temperature data will yield consistently large positive correlations of more than 0.8 (as shown above in Fig. 11.1). This highlights one of the difficulties with correlation coefficients; while they seem quantitative, the interpretation of their values can be very subjective or relative. In some instances a correlation coefficient of 0.7 may be very good, in others like that in Fig. 11.1 above, it may be quite poor. Ideally, the maximum range for your input variable (in this case the station separation) should produce a correspondingly large change in correlation coefficient. While this is not true for raw temperature data, it most certainly is true for the anomaly data (see Fig. 11.2 below).



Fig. 11.2: Correlation (PMCC) for the period 1971-2010 between temperature anomalies for all stations in New Zealand with a minimum overlap of 200 months.


The data in Fig. 11.2 is split into two columns. On the left are plots of the PMCC versus station separation for the original anomaly data; on the right are the same plots but for Berkeley Earth adjusted data. In each case three sets of data are compared: (i) the monthly anomaly, (ii) the 12 month moving average of the anomaly, and (iii) the 5 year moving average of the anomaly.

If we first consider the data in Fig. 11.2(a) we see that the PMCC for the monthly anomaly data exhibits a clear downward trend with increasing station separation. It is also possible to identify a boundary line above which almost no data points are present. This boundary line for the maximum observed PMCC at a given station separation (denoted as ρmax) has the following approximate functional dependence on the station separation d in kilometres,

(11.1)

It should be noted that not only is this limit on correlation coefficients seen for New Zealand data, but a near identical trend is also seen for data from the Antarctic, thereby suggesting this behaviour may be more universal. This boundary line is illustrated graphically below in Fig. 11.3 along with the monthly anomaly data from Fig. 11.2(a).



Fig. 11.3: Data from Fig. 11.2(a) with maximum trend line (in red).


When it comes to data smoothing (see Fig. 11.2(b) and Fig. 11.2(c)), it appears this has two influences on the correlation coefficient. First, it reduces the gradient of the boundary line; second, it increases the random scatter of points below this line. The amount by which the slope of the boundary line changes is broadly (but not exactly) consistent with the scaling behaviour outlined in Post 9. As for the increased scatter, this at first glance appears counter-intuitive. One might instead expect the smoothing to increase the degree of correlation by removing fluctuations to reveal a regional trend that is more or less the same in all temperature records. But remember, we are probably dealing with a fractal here. Just like the Julia set, it may be that the smaller the level of detail, the more dissimilar some adjacent locations can be.

So what about the Berkeley Earth adjusted anomaly data? The 12 month moving average data in Fig. 11.2(e) clearly has a stronger trend and less scatter than the two other data sets in Fig. 11.2(d) and Fig. 11.2(f). This is very different to what is seen with the unadjusted data. In addition the adjusted monthly anomaly data in Fig. 11.2(d) has more scatter than the unadjusted monthly anomaly data. On this evidence the breakpoint adjustment has (again) made the unsmoothed data worse, not better. Yet the 12 month moving average data in Fig. 11.2(e) appears to be the "best" data of all: it has a higher mean correlation coefficient, less scatter and is generally closer to the boundary line. Why?



 Fig. 11.4: Comparison of 12 month smoothed data from Paraparaumu aerodrome with regional expectation.


Well the answer may lie in Fig. 11.4 above. This is a comparison of the 12 month moving average data and the regional expectation for the station at Paraparaumu aerodrome taken from the Berkeley Earth site. This graph implies that it may be the 12 month moving average data that is used to construct the regional expectation (and hence the breakpoint adjustment) and not the monthly anomaly data. So, if the breakpoint adjustments are optimized to match the 12 month moving average data to the regional expectation, it is perhaps hardly a surprise that these datasets produce the highest correlation coefficients between stations. The surprise is that the improvement is not replicated in either the adjusted monthly anomaly data or other moving average data, which rather undermines the credibility of the whole breakpoint adjustment process.



Fig. 11.5: Square of the correlation coefficients in Fig. 11.2(a).


One of the purposes of the correlation coefficients is to define the weighting coefficients that are used to combine station records into regional expectations (see Fig. 11.4). The weighting coefficient is generated by squaring the correlation coefficient. The effect of squaring these coefficients is shown for the original monthly anomalies in Fig. 11.5 above and for the 12 month moving average of adjusted data in Fig. 11.6 below. In Fig. 11.5 only 29% of pairs had weightings of more than 0.5, while for the adjusted data in Fig. 11.6 the proportion is significantly larger at 56%. This again has implications for how the regional expectation trends are constructed, and how representative they really are. A regional expectation based on the original monthly anomalies in Fig. 11.5 will rely on fewer datasets that are located closer to the target station. In Fig. 11.5 all data points with ρ² > 0.5 represent station pairs that are less than 900 km apart. In Fig. 11.6 the equivalent distance is 1500 km.




Fig. 11.6: Square of the correlation coefficients in Fig. 11.2(e).


Finally, I analysed the long stations of New Zealand in isolation. This data is important for two reasons. First, these are the only stations with a significant amount of data that precedes 1940. That means they are essential for the construction of the regional expectation prior to 1940. The second reason is that they are fairly evenly spread across New Zealand. While this is a strength in that it means that the country is evenly covered with temperature data, on the down side it means that these stations are all (bar one pair) at least 200 km apart. I find it hard to reconcile that with the assertion that these stations could effectively error check each other for bad data, and that the resulting breakpoint adjustments that they would produce could be considered reliable. For supporting corroboration I would look at places like Greenland where the levels of snowfall, and the degree of glacier melt, have very local rather than regional behaviours.

Perhaps unsurprisingly then, the data in Fig. 11.7 below has a number of similarities with the data in Fig. 11.2, namely the effect of smoothing on the scatter of unadjusted data, and the effect of distance on the maximum correlation (as outlined in Eq. 11.1). It also has the same deficiencies; the totally random nature of the adjusted monthly data in Fig. 11.7(d), and adjusted anomalies that appear less reliable than their unadjusted counterparts.



Fig. 11.7: Correlation (PMCC) for the period 1854-2013 between temperature anomalies for long stations in New Zealand with a minimum overlap of 1080 months.


So, what of the three questions I asked at the start? Well, the major breakthrough result from this analysis has been the discovery of the dependence of the correlation coefficient on station separation for the monthly anomalies, as quantified in Eq. 11.1. What was surprising was the absence of any similar structure in the adjusted monthly anomalies.

The impact of smoothing on the PMCC has been mixed. The adjusted data has followed the expected trend, but possibly for the wrong reason (see Fig. 11.4). The unadjusted data still requires a degree of explanation, particularly the increasing randomness. While the gradient of the boundary line in Fig. 11.3 reduces in magnitude and the line moves to higher correlations as expected, the data below it becomes more random. Is this because of fractal structure?

As for breakpoint adjustments, there was little of a positive slant found here to support their use. Their net impact appears to have been to decouple local station records and decrease overall levels of correlation. It is, therefore, difficult to reconcile the reality of their impact with the official viewpoint that they are supposedly correcting for data imperfections.



Saturday, May 30, 2020

9. Fooled by randomness

Is global warming real? That is probably a justifiable question given what I revealed in the last post about breakpoint alignment. But what I am going to demonstrate here and over the next two or three posts should also make you question everything you think you know about climate change. The first topic I am going to explore is a concept that most physicists and mathematicians are all too familiar with, but which appears to be totally off the radar of climate scientists: chaos theory and fractal geometry.


Fig. 9.1:  Record 1.


First a test. Look at the dataset above (Fig. 9.1) and the one below (Fig. 9.2). Can you tell which one is a real set of temperature data and which one is fake?


Fig. 9.2:  Record 2.


Okay, so actually it was a trick question because they are both real sets of data. In fact they are both from the same set of station data, and they are partially from the same time period as well, but there is clearly a difference. The difference is that the data in Fig. 9.1 above is only a small part of the actual temperature record, while the data in Fig. 9.2 is from the entire record. The data in Fig. 9.1 is taken from the Christchurch station (Berkeley Earth ID = 157045) and is monthly data for the period 1974-1987. The data in Fig. 9.2 is from the same record but for the time interval 1864-2013; it has also been smoothed with a 12 month moving average. Yet they look very similar in terms of the frequency and height of their fluctuations - why? Well, what you are seeing here is an example of self-similarity or fractal behaviour. The temperature record for Christchurch is a one-dimensional fractal, and so for that matter is every other temperature record.

Self-similarity is common in nature. You see it everywhere from fern leaves and cauliflowers to clouds and snowflakes. It is observed when you magnify some objects and look at them in greater detail, only to find, to your surprise, that the detail looks just like a smaller version of the original object. This is known as self-similarity: the object looks like itself but in microcosm. It is also an example of scaling behaviour. There is usually a fixed size ratio between the original and the smaller copies from which it is made.

In order to make the smoothed data in Fig. 9.2 look similar to the original data in Fig. 9.1 two scaling adjustments were made. First the time scale on the horizontal axis in Fig. 9.2 was shrunk by a factor of twelve. This is to compensate for the smoothing process which effectively combines twelve points into one. The second was to scale up the temperature axis in Fig. 9.2 by a factor of 12^0.275. The reason for the power of 0.275 will become apparent shortly, but it is important as it has profound implications for the noise level we see in temperature records over long time periods (i.e. centuries).

To demonstrate the scaling behaviour of the temperature record we shall do the following. First we smooth the data with a moving average of length say two points and then calculate the standard deviation of the smoothed data. Then we repeat this for the original data, but with a different number of data points in the moving average and again calculate the standard deviation of the new smoothed data. After doing this for six or seven different moving averages we plot a graph of the logarithm of the standard deviation versus log(N) where N is the number of points used each time for the moving average. The result is shown below in Fig. 9.3.
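
Before looking at the result, here is a minimal sketch of that procedure, assuming the monthly anomalies are held in a pandas Series. This is my own illustration of the method just described, not the original code.

```python
import numpy as np
import pandas as pd

def scaling_exponent(anomaly: pd.Series, windows=(1, 2, 4, 8, 16, 32, 64)) -> float:
    """Slope of log(std of the N-point moving average) against log(N)."""
    stds = [anomaly.rolling(n, center=True).mean().std() for n in windows]
    slope, _intercept = np.polyfit(np.log(windows), np.log(stds), 1)
    return slope

# Hypothetical usage, assuming a CSV of monthly anomalies for one station:
# anomaly = pd.read_csv("christchurch_anomalies.csv", index_col=0)["anomaly"]
# print(scaling_exponent(anomaly))   # about -0.25 for the records discussed here
```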


Fig. 9.3:  Plot of the standard deviation of the smoothed  anomaly data against the smoothing interval N for temperature data from Christchurch (1864-2013).


The important feature of the graph in Fig. 9.3 is that the data lies on an almost perfect straight line of slope -0.275 (remember that number?). I have to confess that even I was shocked by how good the fit was when I first saw it, particularly given how imperfect temperature data is supposed to be. What this graph is illustrating is that as we smooth the data by a factor N, the noise level reduces as N^-0.275. But is this reproducible for other data? Well the answer appears to be, yes.


Fig. 9.4:  Plot of the standard deviation of the smoothed  anomaly data against the smoothing interval N for temperature data from Auckland (1853-2013).


The graph above (Fig. 9.4) shows the same scaling behaviour for the station at Auckland (Berkeley Earth ID = 157062) while the one below (Fig. 9.5) illustrates it for the station at Wellington (Berkeley Earth ID = 18625). The gradients of the best fit lines (i.e. the power law index in each case) are -0.248 and -0.235 respectively. This suggests that the real value is probably about -0.25.


Fig. 9.5:  Plot of the standard deviation of the smoothed  anomaly data against the smoothing interval N for temperature data from Wellington (1863-2005).


But it is the implications of this that are profound. Because the data fits so well in all three cases, we can extrapolate to longer smoothing intervals such as one hundred years. That corresponds to a scaling factor of 1200 (because one hundred years is 1200 months, and is thus 1200 times the period of the original monthly data) and a noise reduction of 1200^0.25 = 5.89. In other words, the noise level on the underlying one hundred year moving average is expected to be about six times less than for the monthly data. This sounds like a lot, but the monthly data for Christchurch has a noise range of up to 5 °C (see Fig. 9.6 below), so this implies that the noise range on a 100 year trend will still be almost 1 °C. Now if that doesn't grab your attention, I have to wonder what will? Because it implies that the anthropogenic global warming (AGW) that climate scientists think they are measuring is probably all just low frequency noise resulting from the random fluctuations of a chaotic non-linear system.


Fig. 9.6:  The temperature anomaly data from Christchurch (1864-2013) plus a 5-year smoothing average.


What we are seeing here is a manifestation of the butterfly effect which, put simply, means that there need be no immediate causal link between current phenomena such as the temperature fluctuations we see today and current global events. This is because the fluctuations are actually the result of dynamic effects that played out long ago but which are only now becoming visible.


Fig. 9.7:  Typical mean station temperatures for each decade over time.


To illustrate the potential of this scaling behaviour further we can use it to make other predictions. Because the temperature record exhibits self-similarity on all timescales, it must do so for long timescales as well, such as centuries. So we can predict what the average temperature over hundreds of years might look like (qualitatively but not precisely) just by taking the monthly data in Fig. 9.6, expanding the time axis by a factor of 120 and shrinking the amplitude of the fluctuations by a factor of 120^0.25 = 3.31. The result is shown in Fig. 9.7 above. Because of the scaling by a factor of 120, each monthly data point in Fig. 9.6 becomes a decade in Fig. 9.7. The data in Fig. 9.7 thus indicates that the average temperature for each decade can typically fluctuate by about ±0.5 °C or more over the course of time.


Fig. 9.8:  Typical mean station temperatures over 100 years over time.


Then, if we smooth the data in Fig. 9.7, we can determine the typical fluctuations over even longer timescales. So, smoothing with a ten point moving average will yield the changes in mean temperature for 100 year intervals as shown in the graph above (Fig. 9.8). This again shows large fluctuations (up to 0.5 °C) over large time intervals. But what we are really interested in from a practical viewpoint is the range of possible fluctuations over 100 years as this corresponds to the timeframe most quoted by climate scientists.

To examine this we can subtract from the value at current time t the equivalent value from one hundred years previous, i.e. ∆T = T(t) - T(t-100).
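
In code this is just a lagged subtraction. The sketch below is my own illustration, assuming century_means is an annually indexed series of trailing 100-year mean temperatures of the kind plotted in Fig. 9.8.

```python
import pandas as pd

def century_change(century_means: pd.Series, lag_years: int = 100) -> pd.Series:
    """∆T = T(t) - T(t-100): the 100-year mean minus its value a century earlier."""
    return century_means - century_means.shift(lag_years)
```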


Fig. 9.9:  Typical change in the 100-year mean temperature for a time difference of 100 years.


So, as an example we may wish to look at the change in mean temperature between different epochs, say from one century to the next. Well, the data in Fig. 9.9 shows just that. Each data point represents the difference between the mean temperature over a hundred years at that point in time and the same value for a hundred years previous. Despite the large averaging periods we still see significant temperature changes of ±0.25 °C or more. However, if we compare decades in different centuries it is even more dramatic.

For example, Fig. 9.10 below predicts the range of changes in the average decadal temperatures from one century to the next, in other words, the difference between the 10-year mean temperature at a given time t and the equivalent decadal mean for a time one hundred years previous. What Fig. 9.10 indicates is that there is a high probability that the mean temperature in the 1990s could be 0.5 °C higher or lower than the mean temperature in the 1890s, and this is just as a consequence of low frequency noise.


Fig. 9.10:  Typical change in mean decadal temperature for a time difference of 100 years.


So why have climate scientists not realized all this? Maybe it's because their cadre comprise more geography graduates and marine biologists than people with PhDs in quantum physics. But perhaps it is also due to the unique behaviour of the noise power spectrum.

If the noise in the temperature record behaved like white noise it would have a power spectrum that is independent of frequency, ω. If we define P(ω) to be the total power in the noise below a frequency, ω, then the power spectrum is the differential of P(ω). For white noise this is expected to be constant across all frequencies up to a cutoff frequency ωo.


dP/dω = a   for ω < ωo

(9.1)

This in turn means that P(ω) has the following linear form up to the cutoff frequency ωo.

P(ω) = aω

(9.2)

where a is a constant. The cutoff frequency is the maximum frequency in the Fourier spectrum of the data and is set by the inverse of the temporal spacing of the data points. If the data points are closer together then the cutoff frequency will be higher. Graphically P(ω) looks like the plot shown below in Fig. 9.11, a straight line rising linearly up to the cutoff frequency ωo.


Fig. 9.11: The frequency dependent power function P(ω) for white noise.


The effect of smoothing with a moving average of N points is to effectively reduce the cutoff frequency by a factor of N because you are merging N points into one. And because the noise power is proportional to the noise intensity, which is proportional to the square of the noise amplitude, this means that the noise amplitude (as well as the standard deviation of the noise) will reduce by a factor equal to √N when you smooth by a factor of N.
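
That √N behaviour is easy to verify numerically. The snippet below (my own illustration) smooths synthetic white noise with progressively longer moving averages and confirms that its standard deviation falls off as roughly N^-0.5, in contrast to the N^-0.25 behaviour found earlier for the real anomaly data.

```python
import numpy as np

rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1.0, size=200_000)     # synthetic white noise, s.d. = 1

for n in (4, 16, 64, 256):
    smoothed = np.convolve(noise, np.ones(n) / n, mode="valid")   # N-point moving average
    print(f"N = {n:3d}:  s.d. = {smoothed.std():.3f}   (1/sqrt(N) = {1 / np.sqrt(n):.3f})")
```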

For a 100-year smoothing the scaling factor compared to a monthly average is 1200, and so the noise will reduce by a factor of 1200^0.5 = 34.64. That means the temperature fluctuations will typically be less than 0.1 °C. This is probably why climate scientists believe that the long term noise will always be smoothed or averaged out, and therefore why any features that remain in the temperature trend must be "real". The problem is, this does not appear to be true.

Instead the standard deviation varies as N^-0.25, so the intensity of the noise varies as N^-0.5 and P(ω) at the reduced cutoff is a factor of √N larger than the white noise case would predict. It therefore follows that the power spectrum is not independent of frequency as is the case for white noise, but instead varies with frequency as


dP/dω ∝ 1/√ω

(9.3)

and P(ω) will look like the curve shown in Fig. 9.12 below.


Fig. 9.12:  The frequency dependent power function P(ω) for temperature data.


The net result is that the random fluctuations in temperature seen over timescales of 100 years or more are up to six times greater in magnitude than most climate scientists probably think they will be. So the clear conclusion is this: most of what you see in the smoothed and averaged temperature data is noise, not systematic change (i.e. warming). Except, unfortunately, most people tend to see what they want to see.


Thursday, May 28, 2020

8. New Zealand - trend due to long and medium stations

In the last post I looked at the long weather station records from New Zealand (i.e. those with over 1200 months of data) and showed how they could be combined to give a temperature trend for climate change using the theory outlined in Post 5. The result was a trend line that looked nothing like the ones Berkeley Earth and other climate science groups claim to have uncovered (compare Fig. 7.6 with Fig. 7.2 in Post 7).

Some may argue that part of the reason for this difference lies in the number of stations used (only ten), or the lack of data from the much larger group of seventeen medium length stations (with 401-1200 months of data each) that could have been utilized. To see if data from these stations does make a significant difference I have repeated the analysis process from the previous post, but with the seventeen medium length stations included alongside the original ten long stations.

The first problem we have, though, is that most of these medium stations have data that only stretches back in time to about 1960 and most of their data is post 1970. That means we cannot use the 1961-1990 period to determine the monthly reference temperatures (MRTs) that are needed to remove the seasonal variations (for an explanation of the MRT see Post 4). So instead I have chosen to use the period 1981-2000, which while not as long, is a period for which all 27 stations have at least 80% data coverage. After calculating the anomalies for each station, as before, the trend was determined by using the averaging method represented by Eq. 5.11 in Post 5. This gives the trend profile shown in Fig. 8.1 below.
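
A minimal sketch of this two-step procedure is given below (my own illustration under the assumptions just stated, not the original code): each station's monthly reference temperature is its 1981-2000 mean for that calendar month, the anomaly is the raw monthly value minus the MRT, and the regional trend is then the simple average of whichever station anomalies are available in each month.

```python
import pandas as pd

def station_anomaly(temps: pd.Series, ref_start="1981", ref_end="2000") -> pd.Series:
    """Subtract each calendar month's 1981-2000 mean (the MRT) from a monthly series."""
    ref = temps.loc[ref_start:ref_end]
    mrt = ref.groupby(ref.index.month).mean()            # 12 monthly reference temperatures
    return temps - mrt.reindex(temps.index.month).to_numpy()

def regional_trend(stations: dict) -> pd.Series:
    """Average the monthly anomalies of all stations reporting in each month.

    `stations` maps station name -> monthly mean temperature Series (datetime index).
    """
    anomalies = pd.DataFrame({name: station_anomaly(s) for name, s in stations.items()})
    return anomalies.mean(axis=1, skipna=True)
```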


Fig. 8.1: Average warming trend for long and medium stations in New Zealand.


This trend is virtually identical to the one presented in the last post (see Fig. 7.6), which suggests that the additional data makes little difference to the slope, although there is a slight reduction in the noise level post-1970. The best fit line indicates a warming trend of only 0.27 ± 0.04 °C per century, again almost identical to that from the long stations alone (0.29 ± 0.04 °C per century). This also suggests that the choice of time period for the MRT has little effect. Yet if we look at the Berkeley Earth adjusted data we get a different picture.


Fig. 8.2: Average warming trend for long and medium stations in New Zealand using Berkeley Earth adjusted data.


If we use Berkeley Earth adjusted data for both long and medium stations to determine the local warming trend for New Zealand we get the data in Fig. 8.2 above. The best fit line shows a significant +0.60 ± 0.04 °C per century upward slope.


Fig. 8.3: Smoothed warming trends for long and medium stations in New Zealand using Berkeley Earth adjusted data.


This is even more evident if we look at the 1-year and 5-year moving averages (see Fig. 8.3 above). But if we look at the real data in Fig. 8.1 and plot the 1-year and 5-year moving averages (see Fig. 8.4 below) we again get a different trend entirely from that in Fig. 8.3.


Fig. 8.4: Smoothed warming trends for long and medium stations in New Zealand using original data.


The question is, why are the data in Fig. 8.3 and Fig. 8.4 so different? The answer is breakpoint alignment.


Fig. 8.5: Difference (Berkeley Earth adjusted data - original data) in the smoothed warming trends for long and medium stations in New Zealand.



If we subtract the data in Fig. 8.4 from that in Fig. 8.3 we get the curves in Fig. 8.5 above. This data is the result of corrections that have been introduced into the data by Berkeley Earth, via a technique called breakpoint alignment or breakpoint adjustment, supposedly to correct for systematic data errors such as station moves, changes in instruments and changes to the time of day of measurement. These adjustments are in effect attempting to identify systematic errors between temperature records or within temperature records, and compensate for them.

Yet these changes by Berkeley Earth are clearly not neutral. They do not merely iron out undulations in order to reveal the trend more clearly, they actually add to the trend. In this case these adjustments add 0.33 °C per century to the overall trend. That is more than the original trend in Fig. 8.1, and is why the gradient of the Berkeley Earth best fit line in Fig. 8.2 and Fig. 8.3 is more than double that for the original data in Fig. 8.1 and Fig. 8.4. This is why there is so much scepticism about global warming. Many people outside the climate science community do not trust the data or the analysis. And this is not just a problem with Berkeley Earth. All the major groups do it; it is just that Berkeley Earth are more transparent about it.

What the analysis here has shown is that having more data for recent epochs does not really improve the quality of data in the overall trend, or the confidence level of the conclusions that can be derived from that data. It is more important to have ten long temperature records than twenty (or even a hundred) short ones. Yet herein lies a paradox. The longer the temperature record, the less its quality is trusted by the climate scientists, and the more they seek to fragment it into shorter records via the use of breakpoints. We will see this more clearly later when I look in more detail at the Horlicks that is breakpoint alignment.

 

Addendum

Close inspection of Fig. 8.1 suggests that the spread of the data is greater before 1940 than it is thereafter. This is a consequence of the increased number of datasets that are used to calculate the trend in the latter half of the 20th century compared to the first half and the 19th century. The number of datasets involved in constructing the average temperature trend shown in Fig. 8.1 is indicated below in Fig. 8.6.

 

Fig. 8.6: The number of sets of station data included each month in the temperature trend for New Zealand.


In summary, there were between 5 and 11 datasets used to calculate the trend between 1870 and 1940, and up to 25 thereafter. Given that the standard deviation of the anomalies in most individual temperature records is approximately 1.0 °C, this implies that the standard deviation of the monthly data in the temperature trend should be about 60% greater before 1940 compared to 1980 and later. It also means the uncertainty in the mean trend will increase slightly as you go back in time, from about ±0.2 °C after 1980, to about ±0.35 °C before 1940. Nevertheless, in my view, this indicates that the temperature trend from 1860-1940 is almost as reliable as that for much later years (1960-2010) despite the reduced amount of data. For while the uncertainty before 1940 may be almost double its post 1960 value, it is still significantly less than the natural variation seen in the 5-year moving average temperature.
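
For reference, the uncertainty figures quoted above follow directly from the standard error of a mean; a minimal check is shown below, using the 1.0 °C single-station scatter assumed in the text and representative station counts for each period.

```python
from math import sqrt

sigma_station = 1.0     # assumed s.d. of monthly anomalies in a single record (°C)
for label, n_stations in (("pre-1940 (about 8 stations)", 8),
                          ("post-1980 (about 25 stations)", 25)):
    print(f"{label}: standard error ~ {sigma_station / sqrt(n_stations):.2f} °C")
```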

Tuesday, May 26, 2020

7. New Zealand - trend due to long stations SLIGHT WARMING

As I pointed out last time, New Zealand has some very good temperature data, or at least good in comparison to most other countries in the Southern Hemisphere. It has over 50 sets of station data, of which ten have over 1200 months of data, and about a dozen have more than 400 months of data, yet even that is not enough.

The stations with over 1200 months of data I shall denote as long stations, those with over 400 months I denote as medium stations. Their geographical locations are shown in Fig. 6.1 in the last post. To start with I shall analyse the long station records, (a) because there are a large number of them, and (b) because they are the records that will show the most discernible trend, yet even with many of these stations the trend can be ambiguous.

If we start with the individual stations, a typical temperature anomaly, i.e. the mean temperature each month minus the monthly reference temperature (MRT), is shown below together with its best fit line and a 5-year moving average. This is the data for Auckland (Albert Park). It dates back to 1853 and it is one of the oldest, longest and most complete records in New Zealand.


Fig. 7.1: Temperature anomaly for Albert Park, Auckland (1853-2013). The best fit line has a positive gradient of 0.18 ± 0.05 °C per century.


Two things are immediately apparent. The first is that the noise level on the anomaly data makes it difficult to discern the overall trend in the data even though the seasonal variation (the MRT) has been removed, and remember, the data represents the average temperature over a whole month not just a typical day. In fact the standard deviation of the data in Fig. 7.1 is 0.95 °C and still almost 25% of data lie outside this range.

The second is that, while the trend does become more discernible if a 5-year moving average is performed (the yellow curve), the trend is still not uniform, nor does it represent a single continuous rise or fall. In fact the temperature in 2013 appears to be no higher than in the 1850s and there is a clear sign of a longer term (150 year) oscillation.

Also shown in Fig. 7.1 is the best fit to the anomaly data. While this fit line has a positive slope of 0.18 ± 0.05 °C per century, it needs to be acknowledged that this is not due to a warming trend, but is purely a consequence of the 150 year oscillation. If you look back at Fig. 4.7 you will recall that the best fit to a full sine wave always has a positive slope. What is clear, however, is that the trend shown in Fig. 7.1 bears no resemblance to the one expected for the region, as illustrated below and on the Berkeley Earth site, either in shape or magnitude.


Fig. 7.2: Berkeley Earth warming trend for New Zealand (1853-2013).


So what about the other nine stations? Well the next three longest are shown in Fig. 7.3. These are the records from Dunedin (ID 18603), Christchurch (157045) and Wellington (18625).


Fig. 7.3: Smoothed temperature anomalies and best fit lines for three stations.


Here the results are even more contradictory. In Fig. 7.3 each set of data is plotted in the form of its 5-year moving average together with the best fit line. The legend on the graph indicates the slope of each best fit line in degrees Celsius per century: the data for Dunedin-Musselburgh (Berkeley Earth ID = 18603) has a warming trend of 0.64 °C per century, while that for Wellington-Kelburn (Berkeley Earth ID = 18625) has a cooling trend of -0.25 °C per century. That for Christchurch (Berkeley Earth ID = 157045) is slightly warming. Of the three datasets, those for Dunedin and Christchurch do bear a passing resemblance to the Berkeley Earth trend in Fig. 7.2, although neither exhibits a temperature gradient as high as that in Fig. 7.2, while Christchurch and Wellington both appear to have a 150 year oscillation that is responsible for most of the slope in the best fit.


Fig. 7.4: Smoothed temperature anomalies and best fit lines for three stations.


If we consider the three next longest records we observe a common theme (see Fig. 7.4). After 1920 there is a distinct warming trend (as was seen in most of the previous data above), but before 1920 much of the important data is missing. Given what we have seen in the other records, it is reasonable to suppose that the missing data would show higher temperatures than the data that is present, and therefore that the warming trends indicated by the best fit lines in Fig. 7.4 are over-estimates.


Fig. 7.5: Smoothed temperature anomalies and best fit lines for three stations.


For the remaining stations the lack of early data becomes an even bigger problem (see Fig. 7.5), and while a trend post-1940 can be discerned, the early data is highly fragmented. Nevertheless, the last two datasets in Fig. 7.5 both have a peak at about 1890 which is clearly evident on all the data in Fig. 7.1 and Fig. 7.3, and which is at a comparable height relative to the data around 1960 in each case. That suggests that most of this data is consistent and sound.

As I pointed out in post 5, if we wish to derive a regional trend such as that shown in Fig. 7.2, all we need to do is average the anomalies (provided all the data has been processed in a consistent manner and is reliable). When we do this for the ten stations described above, we get the dataset illustrated below in Fig. 7.6.


Fig. 7.6: The warming trend for New Zealand (1853-2013) based on the averaging of long station anomalies. The best fit line has a positive gradient of 0.29 ± 0.04 °C per century.


The mean temperature change or anomaly plotted in Fig. 7.6 has a slight upward trend and its best fit line has a gradient of 0.29±0.04 °C per century. However, this variation is clearly in two parts. From 1860 to 1940 the trend is clearly downwards, while from 1940 to 2000 it is upwards.

What is also clear is that the temperature trends in Fig. 7.6 are very similar to those shown for Auckland in Fig. 7.1 (and actually most of the other individual station datasets) but very dissimilar to that advanced by Berkeley Earth in Fig. 7.2. The question is why?

Well, there are two reasons, and they are both to do with how the temperature data is handled and processed. The graphs I have presented here in Fig. 7.1 and Figs. 7.3-7.6 all use the original temperature data as is. I first calculate the MRT following the method outlined here and subtract it from the original data to obtain the anomaly without the problem of the large seasonal variations. That is a standard procedure that all climate science groups should do. The time base I used for the calculation of the MRT was 1961-1990 for reasons outlined here. Again, this time frame appears to be fairly standard. However, despite this my MRT values differ slightly from those of Berkeley Earth. The reason for this I have not discovered yet, but it may be that old favourite of climate scientists, homogenization, or it may be a different choice of time frame. Whatever the reason, fortunately it makes little difference. If you sum the Berkeley Earth anomalies the result is very similar, as shown below in Fig. 7.7.


Fig. 7.7: The warming trend for New Zealand (1853-2013) based on the averaging of the Berkeley Earth anomalies for long stations.


That, however, is where the similarity ends because Berkeley Earth then play their joker: breakpoint alignment. This is a mathematical device that is supposed to account for imperfections in the data due to human measurement error, changes in instruments, location moves and changes in the time of day when the measurements were made. How this is implemented I will discuss at a later date. What is important here is the net result and that is shown in Fig. 7.8.


Fig. 7.8: The warming trend for New Zealand (1853-2013) based on the averaging of the adjusted Berkeley Earth anomalies for long stations after breakpoint alignment.


The differences between Fig. 7.6 (or Fig. 7.7) and Fig. 7.8 are subtle, but clear once you see them. In Fig. 7.6 there is a discontinuity or kink in the gradient of the general trend around 1940 and the slope is shallow. In Fig. 7.8 the slope is more uniform and steeper. The gradient of the best fit line has now more than doubled to 0.60 ± 0.04 °C per century and the temperature rise from 1860 to 2010 has gone from virtually zero in Fig. 7.6 to an impressive 0.9 °C. Now, if we smooth the data in Fig. 7.8 using a 12-month and a 10-year moving average, we get the curves shown in Fig. 7.9 below, which look very similar to the Berkeley Earth summary trend in Fig. 7.2.


Fig. 7.9: The smoothed warming trend for New Zealand (1853-2013) based on the averaging of the adjusted Berkeley Earth anomalies for long stations after breakpoint alignment.



This validates our averaging process, but the problem is that Fig. 7.9 bears very little resemblance to the original data in Fig. 7.6. This difference cannot be due to a difference in the averaging process, otherwise Fig. 7.9 would not resemble Fig. 7.2 so closely. That only leaves homogenization and breakpoint adjustments as the possible causes of the difference. The equally worrying question is, why are these adjustments being made? Those reasons may become apparent as we look at more of the global temperature data.