Scaling Freddie to 10m landcover
- 12 minutes read - 2473 wordsServing vector tiles from 10-meter resolution land cover data. What breaks when you 100x the input resolution, and how topology-preserving smoothing fixes the ugly parts.
Background
Freddie is our tile server for land cover data. It reads GeoTIFF raster files (pixels classified as wood, grass, built-up, water, etc.) and serves them as vector tiles or raster tiles over HTTP. Nothing fancy. Read pixels, make shapes, encode, serve.
The original setup used ESA WorldCover at ~100m resolution. Native resolution sits around z8 in Web Mercator, so serving z0–z8 tiles was straightforward. Sample pixels, run-length encode, merge adjacent faces of the same class, encode to MVT. Done. It just worked, and I didn’t think much about it.
Then we switched to the 10m dataset. Native resolution jumps to ~z13–z14. The source TIFFs balloon to 36,000 x 36,000 pixels each. Everything that worked at 100m either broke or looked terrible. This is the story of fixing all of it.
What breaks
Three problems, in order of how quickly you notice them.
Staircase boundaries
At zoom levels above native resolution (z15, z16), each source pixel stretches across 4–16 tile pixels. The vector boundaries follow pixel edges exactly: 90-degree staircase patterns everywhere. At 100m this was never visible because nobody zoomed past z8. At 10m you can zoom to z16 and the pixel grid is right there, staring back at you. Minecraft terrain, basically.
Performance at low zoom
A z0 tile covers the entire world. With a 36,000 x 36,000 source TIFF, generating that tile means sampling millions of pixels. At 100m the source was small enough that brute force worked. At 10m, a single z0 request could take seconds. I watched the first one come back and thought the server had hung.
Topology seams
The naive fix for staircases (smooth each polygon independently) creates gaps and overlaps between adjacent regions. Two polygons that share a boundary get smoothed in different directions. You end up with visible seams, z-fighting, or missing pixels at class boundaries. Fix one problem, create another. Classic.
COG multi-resolution
The performance problem, at least, has a clean solution: Cloud Optimized GeoTIFF.
A COG file stores the full-resolution image plus downsampled overviews, typically at 2x, 4x, 8x, 16x, 32x reductions. Instead of one 36,000 x 36,000 image, you get a pyramid:
| Level | Size | Use for |
|---|---|---|
| 0 | 36,000 x 36,000 | z10+ |
| 1 | 18,000 x 18,000 | z8–z9 |
| 2 | 9,000 x 9,000 | z6–z7 |
| 3 | 4,500 x 4,500 | z4–z5 |
| 4 | 2,250 x 2,250 | z2–z3 |
| 5 | 1,125 x 1,125 | z0–z1 |
The heuristic is simple: levelIndex = (maxZoom - zoom) / 2. A z0 tile now samples from a 1,125 x 1,125 overview instead of the full image. About 1,000x less data to read.
func (tc *TiffCollection) selectLevel(zoom int) int {
levelIndex := (tc.maxZoom - zoom) / 2
if levelIndex >= len(tc.levels) {
levelIndex = len(tc.levels) - 1
}
return levelIndex
}The TiffCollection manages multiple TIFF files with spatial indexing: a 3 x 3 degree grid for fast lookups. At tile boundaries where multiple TIFFs overlap, deterministic ordering ensures consistent output.
z0 tiles went from seconds to milliseconds. Problem one down, two to go.
The smoothing problem
With COG handling performance, the real challenge is visual quality. Run-length encoding produces pixel-aligned polygons. At native resolution that’s fine: the boundaries are detailed enough. But zoom in past native and you see the grid. It’s ugly, and no amount of squinting makes it acceptable.
Two complementary approaches: smooth the raster before vectorizing, or smooth the vectors after. I ended up doing both.
SDF raster filtering
Signed Distance Fields are typically used in font rendering. Valve popularized them for GPU text back in 2007. Turns out the same idea applies beautifully to classified rasters.
For each class in the tile:
- Compute a distance transform (Meijster algorithm, O(n) per dimension): every pixel gets its Euclidean distance to the nearest boundary
- Sign it: positive inside the class region, negative outside
- Gaussian blur the SDF
- Reclassify pixels based on the blurred SDF values
Blurring an SDF instead of blurring raw pixel values is the key insight. Raw blur smears class boundaries into garbage. SDF blur produces smooth, continuous boundaries that still snap cleanly to class assignments.
The blur radius scales with the upscale factor:
z14 (native, 1x): blur = 6.0 source pixels
z15 (2x upscale): blur = 3.0 source pixels
z16 (4x upscale): blur = 1.5 source pixels
z17+ (8x+): clamped to 1.0A 5 x 5 mode filter can run before SDF to remove isolated single-pixel noise, common in land cover data where a lone “built-up” pixel appears in a forest.
Barbapapa: topology-preserving vector smoothing
SDF smoothing handles the raster level. But even with pre-smoothed rasters, the vectorized output still has pixel-aligned vertices. Near native resolution, where SDF barely does anything, the staircases are still there. We need vector-level smoothing too.
The fundamental problem: polygons that share a boundary must be smoothed identically along that boundary. Smooth them independently and you get gaps and overlaps. I tried independent smoothing first. It looked like the polygons were trying to escape from each other.
The solution uses the half-edge data structure (DCEL) that Freddie already maintains for face merging. Adjacent polygons share vertex references through twin half-edges. Move a shared vertex and it moves for both polygons simultaneously. The topology enforces consistency.
I named the algorithm Barbapapa, after the cartoon character that can reshape into anything while staying the same blob. The algorithm has five stages:
1. Vertex deduplication. After face merging, some vertices at identical coordinates are separate objects. Canonicalize them so shared boundaries reference the same vertex instance.
canonical := make(map[[2]int]*vertex)
for _, v := range allVertices {
key := [2]int{v.X, v.Y}
if existing, ok := canonical[key]; ok {
// Repoint all half-edges from v to existing
replaceVertex(v, existing)
} else {
canonical[key] = v
}
}2. Edge subdivision. Laplacian smoothing can only move existing vertices: it can’t add curvature where there are no points. Subdivide edges by inserting midpoints, spaced at ~3.5 source pixels apart. Both sides of a shared edge (the half-edge and its twin) get subdivided identically.
3. Identify immovable vertices. Tile boundary vertices must not move (adjacent tiles need matching edges). Vertices that exist only on hole boundaries are also pinned (I’ll get to why that caveat matters in a moment).
4. Laplacian smoothing. For each movable interior vertex, compute the centroid of its neighbors and blend:
new_position = (1 - alpha) * current + alpha * centroidJacobi-style: compute all new positions first, then apply all updates simultaneously. This avoids order-dependent results. Three to four iterations with alpha = 0.5–0.6 produces good results.
5. Collision prevention. Ensure no vertex collapses onto another. Sorted vertex iteration guarantees deterministic output.
The algorithm overhead is under 5% of total tile generation time. Most time is still spent on TIFF reading and face merging. Good: the smoothing is essentially free.
The hole vertex problem
Version 1 of Barbapapa had a subtle bug that killed 93% of its effectiveness. I shipped it thinking it was working. It was barely doing anything.
Here’s the setup: small classified regions (a pond in a forest, a building in a field) appear both as a face (the pond) and as a hole in the surrounding face (the forest). The vertices on these boundaries are shared between the face’s outer ring and the surrounding face’s hole ring.
My original code marked all hole-boundary vertices as immovable. The reasoning seemed sound: holes have specific shapes that shouldn’t be distorted.
The problem: a vertex on a small face’s outer boundary is also on the surrounding face’s hole boundary. So marking all hole vertices as immovable pinned almost every interior vertex. In test datasets, 110 out of 118 interior faces had zero smoothing applied. The algorithm was running, doing math, producing output, and changing almost nothing. I only caught it when I added per-face metrics and saw the numbers.
The fix: only pin vertices that appear exclusively on hole boundaries and never on any face’s outer boundary. After the fix, unsmoothed interior faces dropped from 110 to 8, the remaining 8 being sub-pixel features too small to matter. Suddenly Barbapapa actually worked.
Zoom-adaptive parameters
Smoothing needs vary wildly across zoom levels. At z14 (native resolution) the pixels are small, so light smoothing is fine. At z16 (4x upscale) the pixel grid is screaming at you, so you need to be aggressive. Below native, the data is already downsampled and naturally smooth.
Rather than hardcoding per-zoom parameters, everything scales with sourcePixelSize: how many tile pixels each source pixel occupies:
func GetEffectiveParams(sourcePixelSize float64) (iterations int, alpha float64) {
if sourcePixelSize < 1.0 {
// Below native: scale down proportionally
scale := sourcePixelSize
return int(3.0 * scale), 0.5 * scale
}
// Above native: boost with log curve, cap at reasonable limits
iterBoost := math.Log2(sourcePixelSize) * 0.75
alphaBoost := math.Log2(sourcePixelSize) * 0.05
return min(int(3.0+iterBoost), 8), min(0.5+alphaBoost, 0.8)
}Using pixel size directly instead of zoom level means the same code works regardless of source resolution. Swap in a different dataset and the smoothing adapts automatically. One less thing to tune by hand.
Pipeline system
With multiple raster filters and vector simplifiers, the number of possible combinations grew fast. I was adding flags to the server every other day (--sdf, --mode-filter, --barbapapa-iterations, --sdf-blur-radius) and it was getting out of hand. So I built a pipeline system instead.
A pipeline is a named sequence of processing steps:
TIFF Data → [Raster Filters] → Vectorization → [Vector Simplifiers] → MVTBuilt-in pipelines cover common configurations:
| Pipeline | Raster filters | Vector simplifiers |
|---|---|---|
| raw | none | SimplifyFaces |
| default | none | SimplifyFaces, Barbapapa |
| smooth-sdf | SDF | SimplifyFaces, Barbapapa |
| smooth-mode5x5 | Mode 5x5 | SimplifyFaces, Barbapapa |
| smooth-combined | Mode 5x5, SDF | SimplifyFaces, Barbapapa |
| aggressive | Mode 5x5, SDF (8px) | SimplifyFaces, Barbapapa (8 iter), Douglas-Peucker |
Each step collects metrics (timing, face counts, vertex counts) so you can see exactly where time is spent. This turned out to be invaluable for debugging: when a tile looks wrong, you can trace exactly which step mangled it.
./freddie-server --pipeline smooth-combinedOne flag instead of six. Custom pipelines can still be composed from individual filters if you want to experiment.
Determinism
This one bit me late. Go maps have non-deterministic iteration order. It’s by design, to prevent people from depending on it. When iterating over faces to apply smoothing, different runs could produce different vertex orderings, leading to different (though equally valid) output. Visually identical, but different bytes.
For tile caching that’s a disaster. You want the same input to always produce the exact same bytes, otherwise your cache never hits. The fix: an OrderedFaceMap that sorts keys once at construction and iterates in sorted order. Combined with Jacobi-style (simultaneous) updates in Barbapapa, the output is now bit-for-bit reproducible. Run it twice, get the same file. Should have been obvious from the start, but Go’s map randomization is the kind of thing you forget about until it ruins your afternoon.
Performance
Numbers matter, so here they are. Benchmarks on M1 Max, single tile at z8:
| Operation | Time | Memory | Allocations |
|---|---|---|---|
| GetWebMercatorTileMVT | 345ms | 152MB | 599k |
| GetBitmapTileWithMargins | 259ms | 117MB | 10k |
| GetWebMercatorTilePNG | 247ms | 113MB | 10k |
With 10 workers: ~35ms per tile, ~286 tiles/second. A full z0–z8 pyramid (21,845 tiles) generates in about 90 seconds.
I built a benchmark harness that runs multiple pipelines across multiple regions and zoom levels, spitting out per-tile metrics and SVG/PNG outputs for visual comparison. Change a value, regenerate, compare. Without this I would have been guessing at parameter values forever. The difference between “looks right” and “is right” is hard to eyeball across thousands of tiles.
The algorithm’s evolution
Barbapapa went through seven revisions. Each one was driven by a specific visual artifact I found while staring at tiles, the kind of bug where you zoom into some random patch of Indonesia and go “wait, what is that.”
- v1.0: Basic Laplacian smoothing. Worked, but barely visible. I thought the whole approach was a dead end.
- v1.1: Edge subdivision. Suddenly there were actual curves instead of straight lines. The moment it clicked.
- v1.2: Vertex deduplication. Enabled cross-face consistency. Shared boundaries finally moved together.
- v1.3: Linked twin subdivision. Topological correctness for shared edges, fixing the cases where v1.2 missed a link.
- v1.4: Zero-length edge prevention. Eliminated degenerate geometry that made the MVT encoder choke.
- v2.0: Hole-only vertex identification. The 93% fix. The algorithm finally did what it was supposed to do all along.
- v2.1: Deterministic iteration. Reproducible output for caching, after Go’s map randomization burned me.
The benchmark harness made this iteration loop possible. Every revision got validated against the full test suite before shipping: no visual regressions, no degenerate geometry, no byte-level surprises.
Known limitations
Two artifacts I decided to live with.
Fish-tail spikes at z14+. Where two smoothing forces compete at narrow polygon tips, V-shaped spikes can appear. They’re small, they’re rare, and fixing them would mean adding a post-smoothing vertex cleanup pass that I don’t want to maintain.
Integer coordinate quantization. MVT uses integer coordinates. Smoothed positions get rounded, which at very high zoom can produce slightly uneven curves. Floating-point vertices would fix it, but that’s a change that ripples through the entire encoding pipeline. Not worth it for an artifact you have to zoom to z17 to notice.
What changed architecturally
Looking back, the 10m upgrade touched four areas of the codebase, and the interesting thing is how little of the original code it replaced:
Multi-resolution TIFF handling: COG support with automatic level selection. The
TiffCollectionmanages spatial indexing across multiple files. This was the most “engineering” change: straightforward but essential.Half-edge topology operations: Safe edge chaining, twin linking, subdivision. Error handling for topology invariant violations. This is the foundation that makes Barbapapa possible, and the part I’m most proud of. Getting the DCEL right was hard, but once it worked, everything built on top of it was clean.
Pipeline system: Composable raster and vector processing steps with metrics. Named pipelines for one-command configuration. Born out of flag fatigue.
Adaptive parameters: Everything scales with
sourcePixelSizeinstead of zoom level. Swap the dataset and the system adapts. The kind of design decision that pays for itself the first time someone asks “can we try a different source?”
The core tile serving path (HTTP handler, TIFF sampling, face merging, MVT encoding) stayed the same. The new code wraps around it rather than replacing it. I like that.
Where it stands
Freddie serves 10m land cover data from z0 to z16. Low zoom tiles are fast thanks to COG overviews. High zoom tiles look smooth thanks to Barbapapa and optional SDF filtering. The pipeline system makes it easy to experiment with different filter combinations without touching server code.
The 10m dataset covers the full globe at a resolution where you can see individual fields and city blocks. At 100m everything was green-or-gray blobs. At 10m you can tell a park from a garden.