Tuesday, June 9, 2020

11. Correlation between station temperature records

Central to the statistical processes I have mentioned in previous posts, such as homogenization and breakpoint adjustment, is the concept of correlation. In climate science it is used to compare temperature records and to generate regional trends, expectations or benchmarks against which the accuracy of local station records can (in theory) be judged or measured.
A correlation coefficient is a mathematical measure of the degree to which two sets of data move in the same way. To put it simply, if you plot two sets of data against each other as a scatter graph and the points follow a straight line, then the two sets of data are perfectly correlated. If that line slopes upwards the correlation coefficient, ρ, will be +1.0, and if it slopes downwards then ρ = -1.0. This rarely happens, though. Instead the points are usually scattered about a line of best fit, and the more the points diverge from that line (which, for perfect correlation, is the line the data itself follows) the lower the magnitude of the correlation coefficient will be.

Correlation coefficients are useful because the two sets of data that they compare don’t need to be of similar magnitudes, or even have the same units. It is the relative change of each that is measured and compared. And they are used a lot in climate science, particularly to compare temperature records.
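To make this concrete, here is a minimal sketch (in Python, using synthetic numbers rather than any real station data) showing that the Pearson coefficient is insensitive to scale: one series is just a rescaled, noisy copy of the other, yet the coefficient is still close to +1.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=240)                # e.g. 20 years of monthly anomalies (deg C)
y = 100.0 * x + rng.normal(size=240)    # the same signal rescaled into different units, plus noise

# Pearson product moment correlation: cov(x, y) / (sigma_x * sigma_y)
rho = np.corrcoef(x, y)[0, 1]
print(round(rho, 3))                    # close to +1 despite the factor-of-100 difference in scale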

As I pointed out in the last post, most climate groups use correlation coefficients to construct expected regional temperature trends against which individual temperature records are compared for the purpose of quality control (i.e. identifying measurement errors, station moves etc.). The assumption is that weather stations that are close together should be more strongly correlated than those that are further apart. But the questions that interest me are these.

(i) By how much does the degree of correlation change with increasing station separation?

(ii) What is the impact of data smoothing on the correlation coefficient?

(iii) What effects do breakpoint adjustments have on the correlation coefficient?

The first of these questions is significant because it will determine the relative weighting of each record in the regional expectation trend, and, more importantly, indicate how independent these stations really are from one another, or to what degree they are coupled. The second question was initially included as a curiosity, but now has greater significance given the scaling behaviour of the Berkeley Earth adjusted data highlighted in the last post. The third question is about testing the quality of Berkeley Earth adjusted data, and in particular the validity of breakpoint adjustments.

In order to answer these questions I have calculated the correlation coefficients between most combinations of the 50 New Zealand stations that have more than 180 months of temperature data. This results in a potential 1275 pairs. However, only those pairs of records whose overlap of data within a specified time-frame exceeded a certain threshold have been included; typically this threshold is a minimum overlap of 200 months between 1971 and 2010. As a result 660 pairs of records (or 52%) qualified for this analysis. To determine the temperature anomaly for each set of data, the monthly reference temperatures (MRTs) were calculated for the period 1981-2000. It was found that using a consistent MRT period for all station records was a major factor in minimizing the spread of values of the correlation coefficient.
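For anyone wishing to reproduce this step, the anomaly calculation is straightforward. Below is a minimal sketch; the function name, and the assumption that each record is a pandas Series of monthly mean temperatures indexed by date, are mine and purely for illustration.

import pandas as pd

def monthly_anomalies(temps, ref_start="1981-01", ref_end="2000-12"):
    """Convert a series of monthly mean temperatures (indexed by date) into
    anomalies relative to the monthly reference temperatures (MRTs), i.e. the
    mean for each calendar month over a fixed reference period."""
    ref = temps.loc[ref_start:ref_end]
    mrt = ref.groupby(ref.index.month).mean()                  # 12 reference values
    return temps - mrt.reindex(temps.index.month).to_numpy()   # departure of each month from its MRT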

Once the qualifying stations had been determined, the correlation coefficient for each pair of stations was calculated. There are a number of different types of correlation coefficient in wide use. I have settled on the most common variant, Karl Pearson's product moment correlation coefficient (PMCC).
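The pair selection and the PMCC calculation can be combined in a few lines. Again, this is only a sketch of the procedure, assuming the anomaly series are held in a dictionary keyed by station name.

from itertools import combinations

def pairwise_pmcc(anoms, start="1971-01", end="2010-12", min_overlap=200):
    """PMCC for every pair of station anomaly series with at least
    `min_overlap` months of common (non-missing) data in the chosen window.
    `anoms` maps station name -> pandas Series of monthly anomalies."""
    results = {}
    for a, b in combinations(anoms, 2):
        x = anoms[a].loc[start:end].dropna()
        y = anoms[b].loc[start:end].dropna()
        common = x.index.intersection(y.index)
        if len(common) >= min_overlap:
            # Series.corr computes the Pearson coefficient by default
            results[(a, b)] = x.loc[common].corr(y.loc[common])
    return results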



 Fig. 11.1: Correlation (PMCC) between raw temperature records for the period 1971-2010 for all stations in New Zealand with a minimum overlap of 200 months.

The data used in the correlation calculation was the anomaly data, not the raw temperature data. The main reason for this is that the raw temperature data is dominated by seasonal oscillations (the MRTs) that are far bigger than either the underlying trend or the monthly anomalies. As a result, applying a correlation test to the raw temperature data will yield consistently large positive correlations of more than 0.8 (as shown above in Fig. 11.1). This highlights one of the difficulties with correlation coefficients; while they seem quantitative, the interpretation of their values can be very subjective or relative. In some instances a correlation coefficient of 0.7 may be very good; in others, like Fig. 11.1 above, it may be quite poor. Ideally, varying the input variable over its full range (in this case the station separation) should produce a correspondingly large change in the correlation coefficient. While this is not true for the raw temperature data, it most certainly is true for the anomaly data (see Fig. 11.2 below).
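A toy example (synthetic data, not the New Zealand records) illustrates the point: two stations that share the same annual cycle but have completely independent anomalies still show a raw-temperature correlation close to 1, while their anomalies are essentially uncorrelated.

import numpy as np

rng = np.random.default_rng(1)
months = np.arange(480)                              # 40 years of monthly data
seasonal = 8.0 * np.sin(2 * np.pi * months / 12)     # large shared annual cycle
a = seasonal + rng.normal(size=480)                  # station A: cycle plus independent local noise
b = seasonal + rng.normal(size=480)                  # station B: cycle plus independent local noise

raw_rho = np.corrcoef(a, b)[0, 1]                          # dominated by the shared cycle (about 0.97)
anom_rho = np.corrcoef(a - seasonal, b - seasonal)[0, 1]   # roughly zero: the anomalies are independent
print(round(raw_rho, 2), round(anom_rho, 2))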



Fig. 11.2: Correlation (PMCC) for the period 1971-2010 between temperature anomalies for all stations in New Zealand with a minimum overlap of 200 months.


The data in Fig. 11.2 is split into two columns: on the left are plots of the PMCC versus station separation for the original anomaly data, and on the right are the same plots for the Berkeley Earth adjusted data. In each case three sets of data are compared: (i) the monthly anomaly, (ii) the 12 month moving average of the anomaly, and (iii) the 5 year moving average of the anomaly.
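For reference, the smoothed variants are simple moving averages of the anomaly series. The sketch below assumes centred windows of 12 and 60 months; the exact window alignment is my choice for illustration and is not specified by the figures.

def smoothed_versions(anom):
    """Return the three variants compared in Fig. 11.2 for one station:
    the monthly anomaly, its 12-month moving average and its 5-year
    (60-month) moving average. `anom` is a pandas Series of monthly anomalies."""
    return {
        "monthly": anom,
        "12-month": anom.rolling(window=12, center=True).mean(),
        "5-year": anom.rolling(window=60, center=True).mean(),
    }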

If we first consider the data in Fig. 11.2(a) we see that the PMCC for the monthly anomaly data exhibits a clear downward trend with increasing station separation. It is also possible to identify a boundary line above which almost no data points are present. This boundary line for the maximum observed PMCC at a given station separation (denoted ρmax) has the following approximate functional dependence on the station separation d in kilometres,

(11.1)

It should be noted that not only is this limit on correlation coefficients seen for New Zealand data, but a near identical trend is also seen for data from the Antarctic, thereby suggesting this behaviour may be more universal. This boundary line is illustrated graphically below in Fig. 11.3 along with the monthly anomaly data from Fig. 11.2(a).



Fig. 11.3: Data from Fig. 11.2(a) with maximum trend line (in red).


When it comes to data smoothing (see Fig. 11.2(b) and Fig. 11.2(c)), it appears this has two influences on the correlation coefficient. First, it reduces the gradient of the boundary line, and second, it increases the random scatter of points below this line. The amount by which the slope of the boundary line changes is broadly (but not exactly) consistent with the scaling behaviour outlined in Post 9. As for the increased scatter, this at first glance appears counter-intuitive. One might instead expect the smoothing to increase the degree of correlation by removing fluctuations and thereby revealing a regional trend that is more or less the same in all temperature records. But remember, we are probably dealing with a fractal here. Just like the Julia set, it may be that the lower the level of detail, the more dissimilar some adjacent locations can be.

So what about the Berkeley Earth adjusted anomaly data? The 12 month moving average data in Fig. 11.2(e) clearly has a stronger trend and less scatter than the two other data sets in Fig. 11.2(d) and Fig. 11.2(f). This is very different to what is seen with the unadjusted data. In addition the adjusted monthly anomaly data in Fig. 11.2(d) has more scatter than the unadjusted monthly anomaly data. On this evidence the breakpoint adjustment has (again) made the unsmoothed data worse, not better. Yet the 12 month moving average data in Fig. 11.2(e) appears to be the "best" data of all: it has a higher mean correlation coefficient, less scatter and is generally closer to the boundary line. Why?



 Fig. 11.4: Comparison of 12 month smoothed data from Paraparaumu aerodrome with regional expectation.


Well, the answer may lie in Fig. 11.4 above. This is a comparison of the 12 month moving average data and the regional expectation for the station at Paraparaumu aerodrome, taken from the Berkeley Earth site. This graph implies that it may be the 12 month moving average data that is used to construct the regional expectation (and hence the breakpoint adjustment) and not the monthly anomaly data. So, if the breakpoint adjustments are optimized to match the 12 month moving average data to the regional expectation, it is perhaps hardly a surprise that these datasets produce the highest correlation coefficients between stations. The surprise is that the improvement is not replicated in either the adjusted monthly anomaly data or the other moving average data, which rather undermines the credibility of the whole breakpoint adjustment process.



Fig. 11.5: Square of the correlation coefficients in Fig. 11.2(a).


One of the purposes of the correlation coefficients is to quantify the weighting coefficients that are used to combine station records into regional expectations (see Fig. 11.4). This involves squaring each correlation coefficient to generate the corresponding weighting coefficient. The effect of squaring these coefficients is shown for the original monthly anomalies in Fig. 11.5 above and for the 12 month moving average of the adjusted data in Fig. 11.6 below. In Fig. 11.5 only 29% of station pairs had weightings of more than 0.5, while for the adjusted data in Fig. 11.6 the figure is significantly larger at 56%. This again has implications for how the regional expectation trends are constructed, and how representative they really are. A regional expectation based on the original monthly anomalies in Fig. 11.5 will rely on fewer datasets that are located closer to the target station. In Fig. 11.5 all data points with ρ² > 0.5 represent station pairs that are less than 900 km apart; in Fig. 11.6 the equivalent distance is 1500 km.
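The squaring step itself is trivial; the sketch below simply squares each PMCC and reports the fraction of pairs with ρ² above a threshold, which is essentially how the 29% and 56% figures quoted above are defined.

import numpy as np

def weighting_stats(pmcc_values, threshold=0.5):
    # Weighting coefficient = square of the correlation coefficient
    weights = np.square(np.asarray(pmcc_values, dtype=float))
    # Fraction of station pairs whose weighting exceeds the threshold
    return weights, float(np.mean(weights > threshold))

# e.g. weights, frac = weighting_stats(list(results.values()))
# using the output of the pairwise_pmcc sketch given earlier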




Fig. 11.6: Square of the correlation coefficients in Fig. 11.2(e).


Finally, I analysed the long stations of New Zealand in isolation. This data is important for two reasons. First, these are the only stations with a significant amount of data that precedes 1940, which means they are essential for constructing the regional expectation prior to 1940. The second reason is that they are fairly evenly spread across New Zealand. While this is a strength, in that it means the country is evenly covered with temperature data, on the downside it means that these stations are all (bar one pair) at least 200 km apart. I find it hard to reconcile that with the assertion that these stations could effectively error-check each other for bad data, and that the resulting breakpoint adjustments they would produce could be considered reliable. For supporting corroboration I would look at places like Greenland, where levels of snowfall and the degree of glacier melt have very local rather than regional behaviours.

Perhaps unsurprisingly, then, the data in Fig. 11.7 below has a number of similarities with the data in Fig. 11.2, namely the effect of smoothing on the scatter of the unadjusted data, and the effect of distance on the maximum correlation (as outlined in Eq. 11.1). It also has the same deficiencies: the totally random nature of the adjusted monthly data in Fig. 11.7(d), and adjusted anomalies that appear less reliable than their unadjusted counterparts.



Fig. 11.7: Correlation (PMCC) for the period 1854-2013 between temperature anomalies for long stations in New Zealand with a minimum overlap of 1080 months.


So, what of the three questions I asked at the start? Well, the major breakthrough result from this analysis has been the discovery of the dependence of the correlation coefficient on station separation for the monthly anomalies, as quantified in Eq. 11.1. What was surprising was the absence of any similar structure in the adjusted monthly anomalies.

The impact of smoothing on the PMCC has been mixed. The adjusted data has followed the expected trend, but possibly for the wrong reason (see Fig. 11.4). The unadjusted data still requires a degree of explanation, particularly the increasing randomness. While the gradient of the boundary line in Fig. 11.3 reduces (in magnitude) and the line moves to higher correlations as expected, the data below it becomes more random. Is this because of fractal structure?

As for breakpoint adjustments, little was found here to support their use. Their net impact appears to have been to decouple local station records and decrease overall levels of correlation. It is, therefore, difficult to reconcile the reality of their impact with the official viewpoint that they are supposedly correcting for data imperfections.


