Monday, January 9, 2012

Cherrypicker's guide to station trends

A while ago I posted a cherrypicker's guide to global surface temperature trends. This was partly in response to an outburst of the popular activity of selecting short periods where the trend could be seen as small or even negative. The idea was to embed this in a set of displayed results where you could get a broader picture of where such trends might be found, and how significant the regions were.

Another popular activity is selecting stations which fail to show recent warming. So I've presented here a different kind of plot which shows all the GHCN stations on a world map, with shading to indicate the trend over the last 30, 45 or 60 years (you can choose). It derives from the plot I posted for November temperatures. It takes advantage of HTML 5 linear color shading, and uses a triangular mesh. Each station has a color corresponding to its trend for the chosen period; although the colors can vary locally, the station neighborhood itself should have the correct color. Note that there is no spatial averaging (except for the shading); the individual station trends determine the coloring.
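To make the shading idea concrete, here is a minimal Python sketch of linear interpolation across one triangle of a mesh. It is an illustration only, not the code behind the plot (which runs in the browser); the function names `barycentric_weights` and `interpolate_trend` are hypothetical. Each vertex carries a station trend, and a point inside the triangle gets a weighted blend that reduces to the station's own value at the vertex.

```python
def barycentric_weights(p, a, b, c):
    """Weights (wa, wb, wc) of 2-D point p relative to triangle a, b, c."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def interpolate_trend(p, vertices, trends):
    """Linearly blend the three vertex trends at point p inside the triangle."""
    wa, wb, wc = barycentric_weights(p, *vertices)
    return wa * trends[0] + wb * trends[1] + wc * trends[2]


# Hypothetical triangle with made-up trends (deg C/century) at its corners.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
corner_trends = [1.0, 2.0, 3.0]
at_station = interpolate_trend((0.0, 0.0), triangle, corner_trends)
at_centroid = interpolate_trend((1 / 3, 1 / 3), triangle, corner_trends)
```

At a station the weights collapse to (1, 0, 0), which is why the station neighborhood keeps the correct color even though the shading varies between stations.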

One result that I found interesting is that while there are patchwork regions, e.g. the USA, there are also large regions with fairly uniform trends, particularly the ocean.

So here's the plot. It is interactive - you can rearrange it, magnify, show the stations, click to see station detail and numbers etc. Use Ctrl+ and Ctrl- to get it the right size for your screen. The mechanics are explained below.


[Interactive plot: a rotatable globe shaded by station trend (requires a browser with canvas support), with a flat world map for orientation and controls for Show Stations, Show Mesh, Magnification, Trend period and Station.]

How it works

More details here. The flat map at top right is your navigator. If you click a point in that, the sphere will rotate so that point appears in the centre. The buttons below allow modification. Set what you want, and press refresh. You can show stations, and the mesh, and magnify 2×, 4×, or 8× (by setting both). You can click again to unset (and press refresh). Then you can click in the sphere. At the bottom on the right, the nearest station name and anomaly will appear. You may want to have stations displayed here. The selection menu chooses the period; if you change the period, the plot changes without need for refresh.

How the trends are calculated.

Calculating station trends is not trivial, because of missing values and seasonal variation. Obviously, missing winter values at one end will bias the trend. A reasonable fix is to subtract the mean for each month and work with anomalies. But the monthly means then don't all correspond to exactly the same period. I used a weighted least squares method similar to that used in TempLS. The weighting simply has unit value for months with readings, zero without. The model is $x_{my} \sim L_m + \mathrm{Tr}\, J_{my}$, where $x$ are the station readings, $L$ the monthly offsets, $\mathrm{Tr}$ the trend, and $J$ a linear progression of months, in century units. The suffixes are $m$ for month and $y$ for year. This is fitted by least squares. The resulting formula is similar to OLS. If we now say that $J$ has zero values for missing $x$, and $I$ is a similar vector which is 1 mostly, but zero where $x$ is missing, then the OLS trend formula would be
$\mathrm{Tr} = f(x) / f(J)$, where $f(x) = S(Jx) - S(J)\, S(x) / S(I)$
where $S$ represents summation over all month/year pairs, and $f(J)$ is the same expression with $x$ replaced by $J$. Denoting by $S_y$ the process of summing over years for each individual calendar month, and by $S_m$ summing the resulting 12 values, the formula from the above fit is:
$\mathrm{Tr} = f(x) / f(J)$, where $f(x) = S(Jx) - S_m\big( S_y(J)\, S_y(x) / S_y(I) \big)$
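
The last formula can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the code used for the plot: the function name `station_trend`, the (years, 12) array layout and the use of NaN for missing months are all choices made here for the example.

```python
import numpy as np


def station_trend(x):
    """Trend per century from a (years, 12) array of monthly readings.

    Missing months are NaN. Implements Tr = f(x) / f(J) with
    f(a) = S(J a) - S_m( S_y(J) S_y(a) / S_y(I) ).
    """
    x = np.asarray(x, dtype=float)
    n_years = x.shape[0]
    I = (~np.isnan(x)).astype(float)      # 1 where a reading exists, else 0
    # Time of each (year, month) cell in centuries, zeroed where x is missing.
    t = (np.arange(n_years)[:, None] + np.arange(12)[None, :] / 12.0) / 100.0
    J = t * I
    xz = np.where(I > 0, x, 0.0)          # readings, with 0 for missing months

    def f(a):
        # S_y sums over years (axis 0) separately for each calendar month.
        Sy_J, Sy_a, Sy_I = J.sum(axis=0), a.sum(axis=0), I.sum(axis=0)
        ok = Sy_I > 0                     # skip months with no data at all
        return (J * a).sum() - (Sy_J[ok] * Sy_a[ok] / Sy_I[ok]).sum()

    return f(xz) / f(J)


# Synthetic check: a seasonal cycle plus a trend of 2.0 per century,
# with some winter and summer months knocked out at opposite ends.
years = 30
t = (np.arange(years)[:, None] + np.arange(12)[None, :] / 12.0) / 100.0
season = 10.0 * np.cos(2 * np.pi * np.arange(12) / 12.0)
x = season[None, :] + 2.0 * t
x[0:5, 0] = np.nan
x[20:, 6] = np.nan
trend = station_trend(x)
```

Because the monthly offsets are fitted along with the trend, the missing Januaries at the start and Julys at the end do not bias the result in this noise-free example.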

Data issues.

The data sets associated with these plots are getting larger, so attention to download time is needed. I'd like to be able to use gzip compression, but neither of the data stores I currently use supports that. I have to send the data as text - basically JavaScript assignments - so I use various tricks to get the text size down. I multiply by 100, say, to get rid of decimal points. I usually convert data to differences to make the numbers smaller. In fact, the biggest data item here is the world map. But potentially another big one is the triangle mesh, and there I had to make some compromises. I don't want to send a different mesh for each trend period, but if I send a mesh with all points, then in any one period, some stations will be missing. I dealt with this by using an interpolated value for coloring purposes. That's OK - the shading otherwise uses linear interpolation. But this interpolation isn't quite linear, and could look odd at times, though I haven't noticed anything. The missing (interpolated) stations don't appear when stations are shown, and are not reported when station values are produced by clicking. Anyway, the data file size is now about 360 KB, which is manageable. I'm looking out for better ways because I have ambitions that involve transmitting the whole TempLS dataset.
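
The two size tricks mentioned (scaling to integers and sending differences) can be sketched as follows. This is a hedged illustration in Python, not the actual encoding used for the data file; the names `encode`, `decode` and `to_js_assignment`, the variable name `tr` and the sample values are all hypothetical.

```python
def encode(values, scale=100):
    """Scale to integers, then keep the first value plus successive differences."""
    ints = [round(v * scale) for v in values]
    if not ints:
        return []
    return [ints[0]] + [b - a for a, b in zip(ints, ints[1:])]


def decode(deltas, scale=100):
    """Undo encode(): running sum of the differences, then divide by the scale."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total / scale)
    return out


def to_js_assignment(name, deltas):
    """Render the integers as a compact JavaScript array assignment."""
    return f"{name}=[{','.join(str(d) for d in deltas)}];"


values = [12.34, 12.40, 12.29, 13.01]   # made-up sample data
deltas = encode(values)
text = to_js_assignment("tr", deltas)
restored = decode(deltas)
```

The differences are usually shorter than the raw numbers when neighboring values are similar, which is where the saving in text size comes from. The receiving script has to apply the running sum and the division before using the data.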

Notes


You may see a few ocean stations appearing on land. The reason is that I artificially create stations at the centre of each 4x4 degree cell. If there is enough sea in the cell for HADSST to report an SST value, that will be assigned to that central point, even if it is on land.


Note that the color scheme suggests cooling, but most of the blue range is actually positive (though smaller) trend.


Stations are included in the trend analysis if they have at least 80% of months reporting within the range.
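
The 80% inclusion rule in the last note can be expressed as a simple check. This is a sketch only; the function name `has_enough_coverage` and the use of None or NaN to mark missing months are assumptions made for the example.

```python
import math


def has_enough_coverage(monthly_values, min_fraction=0.8):
    """True if at least min_fraction of the months in the period have a reading."""
    total = len(monthly_values)
    if total == 0:
        return False
    present = sum(
        1 for v in monthly_values if v is not None and not math.isnan(v)
    )
    return present / total >= min_fraction


# A 60-year period has 720 months. A station with complete data for the
# first 48 years and nothing afterwards sits exactly on the 80% threshold.
stopped_early = [1.0] * (48 * 12) + [float("nan")] * (12 * 12)
qualifies = has_enough_coverage(stopped_early)
```

So under this rule a station that stopped reporting well before the end of a long period can still be included, as long as its earlier record is nearly complete.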



11 comments:

  1. Is the US using GHCN only, or have you got USHCN in there too? They really should be used together, I think.

  2. CE,
    I'm using GHCN only - I think all the GHCN stations are also USHCN. About ten years ago GHCN greatly reduced the number of US stations in its database. Before that station density in the US was high vs ROW.

    I could do USHCN separately - I agree with GHCN (saying, I think) that there isn't much point in analysing the world at markedly different station densities.

  3. I agree with all that; I just don't remember NOAA's rationale for choosing which US stations would continue to be updated in GHCN, and which would only be updated in USHCN. Maybe they're the stations that were originally highlighted as good candidates for GHCN, I don't know. I forget; I haven't looked at this stuff in ages.

    Either which way, if GHCN-only-US is a reasonable sample, such that using the full density of USHCN just ends up being irrelevant for this analysis, well then, carry on.

  4. upon reflection - higher density couldn't be irrelevant, because you're offering a cherry-picker's guide. with more stations available, more cherries could be found.

    but I think the above is perfectly sufficient; don't extend the work on my account.

  5. CE,
    The US certainly looks the most densely covered. That's partly because it shows, for some reason, less spatial uniformity.

    The formula I used does not cut the US stations totally in accord with the recent reductions. That's because I require 80% of months reported within the period. So over sixty years, a station would qualify even if it stopped in 2000, provided it had very good coverage in the earlier years.

  6. In the next to last sentence is 'losotive' a misspelling of 'positive' - or just a term I'm unfamiliar with?

  7. kevin O,
    Yes, indeed. In most of my typos I can at least see how it happened. Not here.
    Thanks, fixed.

  8. Nick:
    Can you do a longer time series on this? Beginning in 1880 or such?

    Thank you.

  9. Camburn,
    There's a problem that the number of stations with long trends is small, so the map is more patchy. For example, in the just 60 stations post, I looked for rural stations with 90 years of data, and there were just 61. You can look at the map there to see its sparsity. Allowing urban might have doubled the stations.

    But yes, I could do 90 years, say.

  10. Nick:
    I feel like I am a kid in a candy store on your site.

    The reason I asked for a longer term temp trend was to zero in on the region where I live and compare what you would find on a worldwide metric versus what the regional anomaly would be.

    The region I live in has not warmed since the 1930's. We have had cool periods, slight warming, then cool and now we are close to the 1930's temps presently.

    Upper mid west USA.

  11. The map seems warm-biased. Wouldn't it be better to show temperatures on a scale relative to the median? The map does not reveal zones that normally trend warmer or cooler. The U.S. for instance is probably an over-contributor to warming, as would be Australia's Simpson Desert. Most of Earth's population exists between 40 degrees south and north latitude; these are the latitudes most affected by El Nino - La Nina, the Pacific Decadal Oscillation and so on; they are also (roughly) the zones where the longest thermometer records and the most thermometers are located. On the global map there are also many grids where there are no permanent or historical measurements. I'm unconvinced that warming is not related to climate phases and population concentrations.
