The Problem with Elevations

Blog: 15-May-2026

I've been aware for years that my code for looking up Ordnance Survey elevation data was doing something weird when tallying up ascent and descent values for GPX files recorded by my GPS device when out for a bike ride. I had tested individual spot heights against OS maps, and was happy that I was retrieving the correct elevation value for each individual location. Here's a screenshot from my elevation code demo page showing a good, calculated OS elevation value:

Spot height at Ben Nevis peak — *The mouse position derived elevation exactly matches the OS map*

But when tallying up all the derived elevations to obtain total ascent and descent, I always got values which were way higher than the totals from the elevation values recorded in the GPX file by the GPS device itself. Note that not all GPS devices record elevations, but many do these days. That's why rides recorded using my (current) GPSMAP 64 device show elevation data, but others from my (previous, ancient) Etrex device don't.

Elevation chart with weird elevation values — *Huge difference between the recorded and calculated elevations*

For a long time I just assumed that my OS looked-up values were simply more accurate than my GPS-recorded values. After all, the GPSMAP 64 is very old by current standards, and it makes no claim to great elevation accuracy. But I noticed over time that other people's GPX files had similar discrepancies, and sometimes these differences were really large. Surely they can't all be wrong? So what was going on here?

I should mention that my lookup code by default adds infill locations between any two locations (waypoints) that are more than approx. 50m apart. The idea is that this returns a more accurate elevation profile than only considering the spot heights for the waypoints in the GPX file. GPS units typically use some intelligence when recording in order to prolong battery life, so the distances between recorded waypoints can vary tremendously; they can be just a few metres to some hundreds of metres apart. So calculated infills seemed like a good idea, but something was definitely awry.
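The infill idea can be sketched roughly like this. This is not the real lookup code; the function name `addInfills`, the `{ e, n }` point shape and the `maxGap` parameter are all made up for illustration, and it assumes the points are OS grid eastings and northings in metres, so that straight-line distance is plain Pythagoras.

```javascript
// Hypothetical sketch: insert evenly spaced points between any two
// waypoints that are more than maxGap metres apart.
// Points are assumed to be { e, n } OS grid eastings/northings in metres.
function addInfills(points, maxGap = 50) {
  if (points.length === 0) return [];
  const out = [points[0]];
  for (let i = 1; i < points.length; i++) {
    const a = points[i - 1];
    const b = points[i];
    const dist = Math.hypot(b.e - a.e, b.n - a.n);
    if (dist > maxGap) {
      // Number of equal segments needed so that none exceeds maxGap.
      const segments = Math.ceil(dist / maxGap);
      for (let s = 1; s < segments; s++) {
        const t = s / segments;
        out.push({
          e: a.e + (b.e - a.e) * t,
          n: a.n + (b.n - a.n) * t,
          infill: true, // marks points that were not in the GPX file
        });
      }
    }
    out.push(b);
  }
  return out;
}

// Two waypoints 200m apart gain three infills, one every 50m.
const track = addInfills([{ e: 0, n: 0 }, { e: 200, n: 0 }]);
console.log(track.map((p) => p.e)); // [ 0, 50, 100, 150, 200 ]
```

Each infill would then get its own elevation lookup, which is why a track with infills has so many more elevation values to tally.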

I'd been meaning to look into this issue for years, but it's very time-consuming to write test code and deal with large numbers of data values. And the variables (OS elevation accuracy, GPS elevation accuracy, and the distance between waypoints) make it a non-trivial exercise to make any sense of it all. And not forgetting the distinct possibility of good old-fashioned code bugs! Testing all of this seemed like more effort than it was worth to be honest, so it got shelved as "good enough, won't fix". I knew it wasn't really "good enough" but I just couldn't raise the energy to dig into it.

But the arrival of powerful LLMs changed things. They excel at solving boring difficult problems with lots of data! So I decided to re-open the problem and initially asked Kimi 2.6 for a solution in general terms. It suggested adding a threshold value when comparing adjacent elevation values to smooth out differences. The idea was that if the threshold was set to (say) 3m, then the code would only tally up differences which exceeded this value. That seemed to help, but not always, and I wasn't sure if inserting infills was making things better or worse. So I decided to ask Kimi to take a proper look at the code and run it against some large GPX files to get to the bottom of it all.
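For anyone wondering what such a threshold looks like in code, here is a minimal sketch. It assumes one common approach: keep a reference height, and only count a change (and move the reference) once the elevation has moved more than the threshold away from it. The function name `tallyWithThreshold` is hypothetical, and the real tally code may well compare values differently.

```javascript
// Hypothetical sketch of a thresholded ascent/descent tally.
// elevations: array of heights in metres, in track order.
// threshold: minimum change (metres) from the last counted height
// before any ascent or descent is added to the totals.
function tallyWithThreshold(elevations, threshold = 3) {
  let ascent = 0;
  let descent = 0;
  if (elevations.length === 0) return { ascent, descent };
  let reference = elevations[0];
  for (const h of elevations.slice(1)) {
    const diff = h - reference;
    if (Math.abs(diff) > threshold) {
      if (diff > 0) ascent += diff;
      else descent -= diff;
      reference = h; // only move the reference when a change is counted
    }
  }
  return { ascent, descent };
}

const heights = [100, 101, 99, 102, 106, 105, 107, 101];
console.log(tallyWithThreshold(heights, 0)); // { ascent: 10, descent: 9 }
console.log(tallyWithThreshold(heights, 3)); // { ascent: 6, descent: 5 }
```

With a threshold of 0 every wobble is counted; with 3m only the climb to 106 and the drop back to 101 register.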

Here is what it had to say after it had thought for a while (the "third party" mentioned below is OS Maps, who allow you to upload GPX files and get elevation values. I think we can assume that if anyone can tally up elevations correctly, it's them):

I've run extensive tests on both your GPX files against the actual OS Terrain 50 binary data. Here's what I found.

The Short Answer

There is no logic bug in the infill code. The get_infills() function in OSTerrain50Reader.php correctly generates intermediate points along straight-line paths without duplicates, coordinate errors, or other defects.

The 30–50% increase you see is caused by sampling the 50m DTM grid more densely combined with THRESHOLD = 0 in your JavaScript tallyAscentDescent() function, which causes every tiny elevation fluctuation to be accumulated.

Test Results

Here are the actual numbers from your two test files:

Cotswold.gpx (third party claims 16,624 ft)

Mode	Points	Threshold=0	Threshold=4m	Threshold=10m
`doInfills=false`	1,548	16,332 ft	14,974 ft	12,485 ft
`doInfills=true`	6,330	23,606 ft	18,582 ft	13,449 ft

L2B.gpx (third party claims 3,607 ft)

Mode	Points	Threshold=0	Threshold=4m	Threshold=10m
`doInfills=false`	1,406	4,277 ft	3,162 ft	2,565 ft
`doInfills=true`	2,908	5,456 ft	3,575 ft	2,405 ft

What's Happening

With doInfills=false, you under-sample the terrain. For Cotswold, your average GPX point spacing is ~200m. You only read elevation at every 4th grid cell along the path, so you miss small ups and downs. Your result of 16,332 ft is very close to the third-party claim (16,624 ft) because you're both making the same coarse approximation.
With doInfills=true, you sample at ~50m intervals — the same resolution as the OS Terrain 50 grid. You now capture every little bump and dip. This is mathematically more accurate, but because your THRESHOLD is 0, you accumulate noise.
The OS Terrain 50 data has 4m RMSE vertical accuracy. Adjacent 50m grid points can differ by several metres purely due to measurement uncertainty. With threshold=0 and 6,330 points, these small differences add up to thousands of feet of artificial extra ascent/descent.
Applying a threshold fixes the problem. Notice that with doInfills=true and a 4m threshold, L2B gives 3,575 ft — essentially a perfect match to the claimed 3,607 ft. Cotswold drops from 23,606 ft to 18,582 ft. The threshold filters out the grid noise while preserving real terrain features.
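A toy example shows how the noise adds up. The numbers below are made up and are not OS data: a steady 50m climb over 101 samples, with an alternating +2m/-2m error laid on top as a crude, deterministic stand-in for random measurement error. The `tallyAscent` helper is a hypothetical reference-height tally, assumed for illustration rather than taken from the code under test.

```javascript
// Toy illustration (made-up numbers, not OS data).
// Counts ascent only; a change is counted once the height has moved
// more than `threshold` metres from the last counted height.
function tallyAscent(elevations, threshold) {
  let ascent = 0;
  let reference = elevations[0];
  for (const h of elevations) {
    const diff = h - reference;
    if (Math.abs(diff) > threshold) {
      if (diff > 0) ascent += diff;
      reference = h;
    }
  }
  return ascent;
}

const samples = 101;
// True profile: 0.5m gained per sample, 50m of real climb in total.
const truth = Array.from({ length: samples }, (_, i) => i * 0.5);
// Same profile with alternating +2m / -2m error on each sample.
const noisy = truth.map((h, i) => h + (i % 2 === 0 ? 2 : -2));

console.log(tallyAscent(truth, 0)); // 50
console.log(tallyAscent(noisy, 0)); // 225
console.log(tallyAscent(noisy, 4)); // 50
```

In this contrived case the noise alone more than quadruples the ascent at threshold 0, and a 4m threshold recovers the real figure. Real terrain and real errors will not come out this neatly.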

Why Your THRESHOLD Changes "Made Little Difference"

In the code you sent me, THRESHOLD is hardcoded to 0:

const THRESHOLD = 0

You mentioned you tried altering it. If you changed it in the file but didn't see a difference, the most likely explanations are:

The browser was caching the old JS file
The change wasn't deployed to the server being tested
The change was made in a copy of the file that isn't being imported by the page

My tests confirm that changing the threshold makes a massive difference (see tables above).

Is There a Bug in `get_infills()`?

I tested three variants of the infill algorithm:

Original (>50m, using round())
Min 75m gap before infilling
Min 40m segment length

All three produced virtually identical results:

Variant	Cotswold thr=0	L2B thr=0
Original	23,606 ft	5,456 ft
Min 75m gap	23,432 ft	5,381 ft
Min 40m segment	23,606 ft	5,456 ft

The infill logic is sound. The issue is not how the infills are created, but how the elevations are processed afterwards.

Recommendations

Keep doInfills: true — it captures real terrain detail that sparse sampling misses. L2B proves this: infills + 4m threshold = 3,575 ft (matches the authoritative 3,607 ft), while no infills + 4m threshold = 3,162 ft (12% low).
Change THRESHOLD from 0 to 4 (or even 5) in ElevationChart20260516-03.js. This matches the OS Terrain 50 RMSE and eliminates noise accumulation. The 4m value you tried was conceptually correct; it just needs to actually be active in the running code.
If you want values closer to third-party services for Cotswold specifically, you might also set doInfills: false. But be aware this is trading accuracy for consistency — you're deliberately missing terrain detail that the 50m grid actually contains.
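The trade-off in these recommendations is easier to see as percentage differences from the third-party figures. The snippet below only redoes that arithmetic using numbers copied from the tables above; the helper name `pctDiff` is made up for illustration.

```javascript
// Percentage difference of a calculated ascent from a reference value.
function pctDiff(calculated, reference) {
  return ((calculated - reference) / reference) * 100;
}

// Figures in feet, copied from the test result tables.
const reference = { Cotswold: 16624, L2B: 3607 };
const results = [
  { file: "Cotswold", mode: "no infills, threshold 0", ft: 16332 },
  { file: "Cotswold", mode: "infills, threshold 0", ft: 23606 },
  { file: "Cotswold", mode: "infills, threshold 4m", ft: 18582 },
  { file: "L2B", mode: "no infills, threshold 4m", ft: 3162 },
  { file: "L2B", mode: "infills, threshold 0", ft: 5456 },
  { file: "L2B", mode: "infills, threshold 4m", ft: 3575 },
];

for (const r of results) {
  const pct = pctDiff(r.ft, reference[r.file]).toFixed(1);
  console.log(`${r.file} (${r.mode}): ${pct}%`);
}
```

On these numbers, infills with a 4m threshold land within 1% of the third-party figure for L2B, but about 12% above it for Cotswold.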

There is no logic bug in the PHP infill code. The disparity is a classic signal-processing issue: sampling a noisy signal at its native resolution without any low-pass filtering (threshold) inevitably amplifies the noise.

So there we have it. I suspect a mathematician, engineer, geologist, or cartographer could have told me this years ago when I first noticed the issue. But I am none of those, so I never really got to the bottom of it until Kimi's massive brain took a look for me. My code now has a threshold value set to 4m as suggested, and the big differences in ascent and descent values are largely gone, or at least are now less glaringly egregious. So my conclusions are:

The accuracy of domestic GPS devices cannot be determined, and is sketchy at best
The accuracy of authoritative data providers is probably more reliable, but is still sketchy (4m RMSE for OS Terrain 50 data)
There isn't a definitive way to accurately measure ascent and descent
No-one knows the voodoo that the OS Maps folk use to produce their elevation values
Close enough is good enough