When "mostly aligned" stops being enough
Every product starts by depending on open source it doesn’t fully understand. That’s fine. It’s the whole point. You stand on the shoulders of people who solved problems you haven’t even encountered yet, and you ship. The dependency is invisible because it works.
Then the product grows, and you start noticing the edges. The upstream project makes choices you wouldn’t have made. The schema carries fields you don’t use. The routing engine takes a turn you can’t explain. The update process becomes a gamble: bump to the latest version and hope nothing changed in a way that breaks your use case. You’re accountable to your customers for software you treat as a black box.
This is the story of how that played out for us across map tiles, routing, and rendering, and why the hardest part isn’t the engineering.
Not all dependencies are equal
We depend on dozens of open source projects. FastAPI, Django, Pydantic, httpx, pytest, the list goes on. If FastAPI disappeared tomorrow, we’d rewrite some decorators and route definitions. It would be work, but the product would still be the product. These are plumbing. They shape the surface (how requests arrive, how responses leave) but they don’t define what the product is.
Map tiles and routing are different. Without tiles, the maps SDK renders nothing. Without a routing engine, the distance API returns nothing. These aren’t tools we use to build the product. They are the product, or at least the foundation it stands on. An empty shell without them.
The dependency risk isn’t about how many lines of code use a library. It’s about how much of the product’s core value lives in upstream code you don’t control. When that answer is “most of it,” the relationship with that upstream project stops being a convenience and starts being an existential question.
The illusion
Here’s the part nobody talks about.
When you ship an MVP built on third-party foundations, the customer doesn’t know. They don’t know the tiles come from MapTiler. They don’t know the routing runs on Valhalla. They see a Woosmap maps API and a Woosmap distance API. They think they bought a product.
And they did, from their perspective. The API works. The tiles render. The routes compute. The fact that most of the core value is someone else’s software, running in your infrastructure, with your branding on top, is an implementation detail they never see.
So when you sell the MVP, you’re implicitly selling the product three years from now. The customer is buying what they think exists. Leadership is reporting on what they think the team built. And the engineering team knows the truth: that the product, as everyone imagines it, doesn’t exist yet. What exists is a thin layer on top of open source that happens to work.
This creates a brutal dynamic. When you invest in actually building the foundations (replacing vendor tiles with your own schema, understanding the routing engine you depend on, building a style compiler) it looks like you’re doing nothing. No new features. No visible progress. The maps still render. The routes still compute. From the outside, nothing changed.
Worse, the transition introduces blips. You’re replacing running systems with new ones you wrote, and you’re human, so mistakes happen. A regression in tile rendering. A routing edge case the old system handled that yours doesn’t yet. For a brief window, you’ve made things slightly worse in pursuit of making them fundamentally better. And nobody outside the team has the context to understand why.
You’re in a fight against everyone’s perception: customers, leadership, sometimes other engineering teams. Living up to what people think the product already does while quietly building the thing they assumed was there all along.
The pattern
It starts the same way every time.
Phase 1: it works. You adopt the open source project. It does 90% of what you need. The remaining 10% you work around. The trade-off is obvious and good: building from scratch would take months, this takes days. You ship.
Phase 2: it mostly works. The product grows. The 10% gap becomes visible to customers. You can’t change it because it’s upstream. You file issues, maybe contribute patches, but the project’s priorities aren’t your priorities. They’re building for a community. You’re building for a product.
Phase 3: you’re accountable. A customer reports a bug that lives in the upstream code. You can’t just say “that’s an open source issue.” It’s your product. You need to fix it, but fixing it means understanding a codebase that wasn’t written for you, by people who had different goals. The gap between “using” and “maintaining” becomes real.
Phase 4: the fork in the road. You have three options. Contribute upstream and hope the community accepts your direction. Fork and maintain your own version. Or rebuild the parts you need from scratch.
Map tiles: from vendor to vertical integration
Buying tiles
The first version of Woosmap Maps didn’t generate tiles at all. We licensed vector tiles from MapTiler, a flat yearly fee for an MBTiles pyramid. Predictable, simple. We focused entirely on the SDK: the Mapbox GL fork, the Google Maps compatibility layer, the service clients. The tiles were someone else’s problem.
This was the right call. We were building a maps product, not a geodata pipeline. MapTiler tiles worked, the OpenMapTiles schema was well-documented, and we could ship without learning anything about OSM data processing.
By the time we started thinking about generating tiles ourselves, the pricing model had changed. I don’t remember the exact structure anymore, but it was enough of a signal: the terms we’d signed up for weren’t the terms going forward. Beyond pricing, tiles carried 16 layers with attributes we never read, the update cadence wasn’t ours, and the OpenMapTiles CC-BY 4.0 license required visible attribution, which was friction in every enterprise sales conversation.
Running OpenMapTiles ourselves
We took the OpenMapTiles stack in-house. PostgreSQL-based: import OSM data with imposm3 into PostGIS, run generated SQL scripts to transform it layer by layer, then generate vector tiles through per-tile PostgreSQL queries. Same schema, our infrastructure.
It worked, but planet generation took days and the database needed hundreds of gigabytes. And we still had the same schema, the same unused attributes, the same attribution requirements. We’d replaced a vendor bill with an infrastructure bill while keeping every constraint.
Planetiler
Planetiler (originally Flatmap) changed the equation. Java, streaming processing, no database. A full planet in hours on a single machine instead of days on a database cluster. We ran it with the planetiler-openmaptiles profile: same output, fraction of the cost.
But we still had the schema problem. Faster generation of tiles you don’t control is still tiles you don’t control.
Clean-room schema
The real break was writing our own Planetiler profile from scratch. Not a fork of OpenMapTiles. A clean-room reimplementation with our own layer names, our own attributes, and nothing we don’t use.
transportation became roads. transportation_name became road_labels. housenumber got dropped entirely. Almost every layer is written in Kotlin because the logic is too specific for declarative YAML: transit network propagation from OSM relations, building height arithmetic, multi-script labels across 85 languages, roads pre-split by bridge/tunnel structure.
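To make the renaming and the pre-split concrete, here is a minimal Python sketch. It is not the actual profile, which is written in Kotlin on top of Planetiler. The layer names come from this article; the names `LAYER_RENAMES`, `rename_layer`, and `road_layer`, the pass-through of unlisted layers, and the rule that a tunnel tag wins over a bridge tag are assumptions made for illustration.

```python
from typing import Optional

# Old OpenMapTiles layer name -> new layer name. None means the layer is dropped.
# Only the three layers named in the article are listed.
LAYER_RENAMES: dict[str, Optional[str]] = {
    "transportation": "roads",
    "transportation_name": "road_labels",
    "housenumber": None,
}


def rename_layer(old_name: str) -> Optional[str]:
    """Return the new layer name, or None if the layer is dropped.

    Assumption: layers not listed above keep their name.
    """
    return LAYER_RENAMES.get(old_name, old_name)


def road_layer(osm_tags: dict[str, str]) -> str:
    """Pick the pre-split road layer from raw OSM tags.

    Assumption: any value other than "no" counts as set,
    and tunnel takes precedence over bridge.
    """
    def is_set(key: str) -> bool:
        return osm_tags.get(key, "no") != "no"

    if is_set("tunnel"):
        return "roads_tunnel"
    if is_set("bridge"):
        return "roads_bridge"
    return "roads"
```

The point of the split is that the decision is made once, at tile generation time, instead of being repeated as a filter in every style layer.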
The style compiler
Owning the tile schema without owning the style is half the job. A production Mapbox GL style is thousands of lines of JSON. Roads alone need six variants for each of eight road classes.
Elzar, our style compiler, generates that JSON from Python code. The tile schema and the style are designed in lockstep. When the Kotlin tile generator emits a hide_3d attribute on buildings, the style compiler references it in the same PR. Schema and style can’t drift because they live in the same repo.
The spec question
Once you own both the tiles and the style generation, a new question surfaces: does the Mapbox GL style spec itself actually fit what you’re building?
The spec was designed for a general-purpose renderer serving a general-purpose schema. It has no concept of road structure, so a single road class needs six style layers (tunnel casing, tunnel fill, road casing, road fill, bridge casing, bridge fill) because brunnel, the attribute that marks a segment as a bridge or a tunnel, is just another filter attribute. Eight road classes times six variants is 48 layers just for roads, before labels and shields. It’s like using assembly to build a web server: technically possible, but is it reasonable? Complex logic gets encoded as JSON expression trees (nested arrays of ["case", ["all", ["has", "name:nonlatin"], ...]]) that are technically correct and practically unreadable.
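The arithmetic is easy to reproduce. The sketch below generates one Mapbox GL line layer per class, structure, and part against an OpenMapTiles-style transportation layer. The eight class names, the source name `tiles`, and the layer id format are placeholders (the article does not list them), and paint properties and draw order are left out.

```python
from itertools import product

# Hypothetical class names: the article says eight classes but does not list them.
ROAD_CLASSES = ["motorway", "trunk", "primary", "secondary",
                "tertiary", "minor", "service", "path"]
STRUCTURES = ["tunnel", "road", "bridge"]
PARTS = ["casing", "fill"]


def structure_filter(structure):
    """Build the brunnel part of the filter for one structure variant."""
    brunnel = ["get", "brunnel"]
    if structure == "road":
        # A plain road is anything that is neither a bridge nor a tunnel.
        return ["all", ["!=", brunnel, "bridge"], ["!=", brunnel, "tunnel"]]
    return ["==", brunnel, structure]


def road_layers(source="tiles", source_layer="transportation"):
    """Return one style layer per (class, structure, part) combination."""
    layers = []
    for cls, structure, part in product(ROAD_CLASSES, STRUCTURES, PARTS):
        layers.append({
            "id": f"{cls}_{structure}_{part}",
            "type": "line",
            "source": source,
            "source-layer": source_layer,
            "filter": ["all",
                       ["==", ["get", "class"], cls],
                       structure_filter(structure)],
        })
    return layers


layers = road_layers()
print(len(layers))  # 8 classes * 3 structures * 2 parts
```

Every one of those layers repeats the same class and brunnel filters, which is the boilerplate a hand-written style has to carry.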
Elzar makes this manageable. Define a road once, get six variants. Use ~Q(hide_3d__exists=True) instead of raw filter arrays. Pre-splitting roads into roads_tunnel, roads, and roads_bridge at the tile level was our way of optimizing around the spec, moving work from render time to tile generation time because the style format couldn’t express it efficiently.
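For readers who have not seen this style of filter object, here is a minimal sketch of how an expression like `~Q(hide_3d__exists=True)` could compile down to a Mapbox GL expression. Elzar's internals are not shown in this article, so the class below is a hypothetical reconstruction in the spirit of Django's `Q` objects: the lookup suffixes (`exists`, plain equality) and the `compile` method are assumptions.

```python
class Q:
    """Tiny filter object that compiles to a Mapbox GL expression (a sketch)."""

    def __init__(self, **lookups):
        self._expr = self._compile_lookups(lookups)

    @classmethod
    def _from_expr(cls, expr):
        # Build a Q directly from an already compiled expression.
        q = cls.__new__(cls)
        q._expr = expr
        return q

    @staticmethod
    def _compile_lookups(lookups):
        exprs = []
        for key, value in lookups.items():
            # "hide_3d__exists" -> field "hide_3d", operator "exists"
            field, _, op = key.partition("__")
            if op == "exists":
                has = ["has", field]
                exprs.append(has if value else ["!", has])
            elif op == "":
                exprs.append(["==", ["get", field], value])
            else:
                raise ValueError(f"unsupported lookup: {key}")
        return exprs[0] if len(exprs) == 1 else ["all", *exprs]

    def __invert__(self):
        return Q._from_expr(["!", self._expr])

    def __and__(self, other):
        return Q._from_expr(["all", self._expr, other._expr])

    def __or__(self, other):
        return Q._from_expr(["any", self._expr, other._expr])

    def compile(self):
        """Return the expression as nested lists, ready for json.dumps."""
        return self._expr


print((~Q(hide_3d__exists=True)).compile())
```

Because filters are ordinary Python objects, they can be named, combined with `&` and `|`, and unit-tested before any JSON is written.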
Elzar would stay regardless of the output format. Even if we designed a custom style spec, we’d still want to define styles in Python (something executable, debuggable, testable) rather than editing a declarative format by hand. The compiler’s value isn’t papering over the Mapbox spec. It’s that Python is a better authoring language for styles than any JSON dialect ever will be. The compiler stays. Only its target changes.
The renderer
That’s where Nimbus, the GPU map renderer experiment in Rust, enters the picture. If you control the renderer, you control the style format. The Mapbox spec stops being a constraint and becomes a choice. You can support it for compatibility while designing something better for the pipeline you actually have.
A style format designed for our schema wouldn’t need six layers per road class. It would know about brunnel structure because the tiles already express it. Tiles that carry exactly the data the style describes. A style format that maps naturally to the tile schema. A renderer that understands both natively. And Elzar at the center, generating whatever output the renderer needs from the same Python definitions we already have.
Routing: the black box problem
OSRM
The distance API started on OSRM. It’s fast and well-proven. But it loads the entire road graph into RAM, and for a world-scale service, that’s a hard constraint. The memory footprint made it impractical for the coverage we needed.
Valhalla
Valhalla solved the memory problem. Tile-based graph storage means you can serve the world without loading it all into RAM. We migrated, it worked, and for a while the “mostly aligned” phase was comfortable.
Then we started hitting the edges.
Valhalla is a massive C++ codebase maintained by a community with its own priorities. When a customer reports a routing bug (a wrong turn cost, a route through a service road that should be avoided) we can’t just fix it. We’d need to understand why the cost model works the way it does, trace through code that wasn’t written for our use case, and either patch it locally or convince the upstream community that our fix is the right one.
The tile update process is where the black box problem gets painful. When we want to regenerate Valhalla tiles with fresh OSM data, we typically also bump Valhalla to the latest version, because the version that generates tiles and the version that serves them sometimes need to match. But a version bump means inheriting every change the community made since our last update. New cost model tweaks we didn’t ask for. Behavioral changes we don’t understand. Route quality regressions we discover from customer complaints, not from changelogs.
We’re accountable for this software. Customers don’t care that Valhalla is open source. They see a Woosmap API returning a bad route and they file a ticket with us. And we’re stuck between “we don’t understand why it does that” and “we can’t change it without forking a C++ project we don’t deeply know.”
The C++ ecosystem makes this harder. Build systems, dependency management, debugging tooling: none of it is as accessible as the JVM or Rust’s cargo. Taking deeper ownership of Valhalla isn’t just a knowledge problem, it’s an ergonomics problem.
Calculon
Calculon is the “what if we built it ourselves” experiment: a routing engine in Rust. Bidirectional A*, many-to-many distance matrices, driving/cycling/pedestrian profiles, Valhalla-compatible API. It reuses Valhalla’s tile format (designing a graph format from scratch would be its own project) but everything else is new code.
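To give a feel for the core search, here is a small Python sketch of bidirectional Dijkstra, which is bidirectional A* with a zero heuristic. It is not Calculon's code (Calculon is written in Rust and is not shown here), and a real bidirectional A* additionally needs consistent potential functions for both directions. The graph representation, a dict mapping each node to a list of `(neighbor, cost)` pairs, is an assumption.

```python
import heapq
from math import inf


def bidirectional_dijkstra(graph, source, target):
    """Return the shortest-path cost from source to target, or inf if unreachable.

    graph maps node -> list of (neighbor, non-negative cost).
    Nodes must be mutually comparable (e.g. all strings) for heap tie-breaking.
    """
    if source == target:
        return 0.0

    # The backward search walks edges in reverse.
    reverse = {}
    for u, edges in graph.items():
        for v, w in edges:
            reverse.setdefault(v, []).append((u, w))

    adjacency = (graph, reverse)
    dist = ({source: 0.0}, {target: 0.0})
    settled = (set(), set())
    heaps = ([(0.0, source)], [(0.0, target)])
    best = inf  # cost of the best source-to-target path seen so far

    while heaps[0] and heaps[1]:
        # No remaining pair of frontier nodes can beat the best known path.
        if heaps[0][0][0] + heaps[1][0][0] >= best:
            break
        # Expand the side whose frontier is closer to its origin.
        side = 0 if heaps[0][0][0] <= heaps[1][0][0] else 1
        d, u = heapq.heappop(heaps[side])
        if u in settled[side]:
            continue  # stale heap entry
        settled[side].add(u)
        for v, w in adjacency[side].get(u, ()):
            nd = d + w
            if nd < dist[side].get(v, inf):
                dist[side][v] = nd
                heapq.heappush(heaps[side], (nd, v))
            # If the other search has already reached v, the two meet here.
            if v in dist[1 - side]:
                best = min(best, nd + dist[1 - side][v])
    return best
```

A many-to-many matrix can be built naively by calling this for every origin and destination pair; production engines use smarter schemes, but the naive version is a useful reference when checking results.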
It’s not a Valhalla replacement yet. But when a cost model produces a weird route, we can read the code. When we want to change how turn penalties work, we change them. When we bump a dependency, it’s a Rust crate with a changelog we understand, not a C++ monolith where any subsystem might have shifted.
There’s a middle path too: taking deeper ownership of Valhalla itself, contributing changes that serve our use case. But that requires investing heavily in a C++ codebase whose architecture and community may or may not align with where we need to go.
When replacing a black box makes things better
The pattern isn’t always painful. Sometimes replacing a dependency you don’t control with something you wrote yourself turns out to be a straight upgrade.
The Woosmap Store Locator used Mapnik (a C++ rendering library) to generate map tiles showing store locations. Mapnik is a serious piece of software, battle-tested for raster map rendering. But it wasn’t designed for our use case: rendering MVT vector tiles with point data. It had become a dependency we couldn’t easily update, couldn’t debug when it misbehaved, and couldn’t adapt to our evolving needs.
We replaced it with a Python renderer built on Cairo. The new renderer was simpler: written in a language the whole team knows, doing exactly what we need and nothing more. And it turned out to be more performant. Less load on the database. Enough of an improvement that we could scale down the RDS instance.
A full-featured C++ rendering engine, replaced by a focused Python implementation, and the result was faster and cheaper. Not because Python is faster than C++. Obviously it isn’t. But because a tool built for your specific problem, that you understand completely, can be optimized in ways a general-purpose tool never will be. You know which queries to run, which data to skip, which shortcuts are safe. The black box can’t know any of that.
The ones still early in the cycle
Not every dependency has reached the accountability phase yet. Maparazzo, our static maps renderer, wraps Mapbox GL Native’s C++ core in a Python library for headless map image generation. Right now it’s in phase 1: it works. The C++ renderer produces correct images, the Python wrapper makes it callable from our FastAPI services, and static maps ship to customers.
But the signs are already there. The C++ renderer leaks memory: OpenGL contexts hold state after objects are destroyed. We’ve worked around it by pooling Map instances and reusing GL contexts, but the leak is in code we inherited, not code we wrote. When it misbehaves, debugging means diving into a massive C++ codebase with its own conventions, its own build system, its own history of decisions we weren’t part of.
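The pooling workaround follows a familiar shape: create a fixed number of expensive objects once, then lend them out instead of constructing and destroying one per request. The sketch below is generic Python, not Maparazzo's code. `MapPool`, `acquire`, and the stand-in class `FakeMap` are hypothetical names, and the real Map objects and GL contexts have constraints (such as thread affinity) that this sketch ignores.

```python
import queue
from contextlib import contextmanager


class FakeMap:
    """Stand-in for an expensive renderer object (hypothetical)."""

    created = 0  # counts constructions, to show that instances are reused

    def __init__(self):
        FakeMap.created += 1

    def render(self, center, zoom):
        return f"image at {center} z{zoom}"


class MapPool:
    """Fixed-size pool that lends out pre-built instances."""

    def __init__(self, factory, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def acquire(self, timeout=None):
        # Blocks until an instance is free; raises queue.Empty on timeout.
        instance = self._idle.get(timeout=timeout)
        try:
            yield instance
        finally:
            # Always hand the instance back, even if rendering failed.
            self._idle.put(instance)
```

With `pool = MapPool(FakeMap, size=2)`, each request runs inside `with pool.acquire() as m:` and calls `m.render(...)`. No matter how many requests are served, only two instances are ever constructed, so whatever state leaks on destruction is never triggered in steady state.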
It’s comfortable enough today. The problems will come. And when they do, Nimbus (which can already render our tiles headlessly from Rust) is the path out. Same pattern, different timeline.
The open source dependency lifecycle
The uncomfortable truth is that depending on open source makes you a maintainer, whether you realize it or not. Not in the “you have commit access” sense. In the “you’re responsible for its behavior in your product” sense.
For an MVP, “mostly aligned” is enough. The open source project does 90% of what you need, the community is active, the abstractions hold. You’d be foolish to build from scratch. And the community’s goals don’t need to perfectly match yours. They just need to overlap enough that you can ship.
But products have customers. Customers have expectations. When the open source project’s goals diverge from yours, even slightly, you feel it. Not as a single breaking change, but as accumulated friction. Features you can’t add because the schema doesn’t support them. Bugs you can’t fix because the codebase is impenetrable. Updates you can’t skip because the format requires version parity.
And ownership doesn’t stop at the obvious dependencies. Once you own the data and the generation, you discover that the formats and specs are dependencies too. The Mapbox style spec. Valhalla’s tile format. These are constraints you inherited from the ecosystem, and they shape what’s easy and what’s hard in ways that aren’t visible until you try to push past them.
The question isn’t whether to take ownership. It’s when, of what, and how deep. Do the changes you need benefit the community? Would they accept a PR, or is your direction orthogonal to theirs? Is the ecosystem accessible enough to invest in, or is the cost of understanding it higher than the cost of rebuilding?
For tiles, we kept Planetiler as the engine and own everything else: schema, style, and the compiler that connects them. For routing, the answer is still forming. For the style format and the renderer, we can work around the constraints today, but we can see the path to not having to.
“We have no vision”
I’ve heard this from engineers on the team. That we’re only reactive. That we just respond to customer requests and there’s no plan. And I understand why it looks that way: if you think phase 1 is the end, then everything after looks like firefighting.
But if you lay out the phases, the vision is obvious. We have years of deliberate work ahead: owning the tile schema, closing the style spec gap, making the renderer a first-class citizen, taking control of routing. Each phase builds on the last. Each one makes the product more ours and less a wrapper around someone else’s decisions. That’s not reactive. That’s a roadmap.
The problem is that this roadmap is invisible to anyone who thinks the product was finished when the MVP shipped. And that’s the majority of people: customers, leadership, and yes, most engineers. They see the API, it works, so the hard part must be done. Everything after is maintenance.
This is exactly the same problem as technical debt, just at a different scale. Everyone will sign up for taking on debt if they think it ships something faster. Add a dependency you don’t understand. Skip the tests. Use the upstream schema as-is. Ship it. Move on to the next thing, take on more debt there too. And when someone proposes paying it down (replacing Mapnik with our own renderer, writing a clean-room tile schema, building a style compiler) the response is “why? it works.”
It works until it doesn’t. And by then the debt has compounded. The codebase is half-rotten, everyone complains about it, but nobody wants to stop adding features long enough to fix it. The boy scout rule (leave the code better than you found it) gets dismissed as slowing things down. So the rot spreads.
Supply chain debt works the same way. Every dependency you don’t understand is a loan against your future ability to move fast. The interest is paid in debugging time, in workarounds, in version bumps that break things you can’t explain, in customers waiting for fixes that live in upstream code you can’t change. And just like financial debt, the longer you ignore it, the more it costs.
The vision isn’t a feature roadmap. It’s the systematic transfer of ownership from upstream to us, one layer at a time, while the product keeps running. It’s unglamorous, often invisible, and it’s the only way to build something that lasts.
My role as CTO is to grow the technological asset of the company. Features are what the product does. The asset is what the product is. The product owner (PO) grows the former. I grow the latter. When the asset is a thin wrapper around upstream dependencies, it’s fragile. Anyone can replicate it. Every phase of ownership makes it thicker, harder to reproduce, more valuable. That’s not maintenance. That’s building the company’s core value.
And that means making these phases crystal clear to everyone. If the team doesn’t see the multi-phase roadmap, they’ll assume there isn’t one. If leadership doesn’t understand why replacing a working renderer with a new one matters, they’ll see wasted effort. If engineers think phase 1 is the destination, they’ll resist every investment in the phases that follow as unnecessary complexity.
And here’s where I haven’t done a good enough job. When the foundational work isn’t visible (or worse, when I end up doing it on my own because it’s hard to justify in sprint planning) it creates the illusion that there’s only one stream of work: the product owner’s backlog. Feature requests, customer issues, integration tickets. That becomes the only work that feels real, the only work that gets discussed in standups, the only work that counts.
Engineers naturally gravitate toward pleasing the PO. The PO has the backlog, the priorities, the stakeholder pressure. The CTO has… a vague sense that the foundations need work. If I haven’t made the supply chain roadmap as concrete and visible as the feature backlog, I can’t blame anyone for treating it as optional.
But it’s a dangerous dynamic. A team that only serves the PO’s backlog is a team that only adds features to an MVP while the foundations quietly rot. The PO’s job is to maximize customer value now. The CTO’s job is to make sure the product can still deliver that value in two years. Those aren’t opposing goals, but they operate on different timescales, and if the long-term one isn’t articulated clearly enough, it loses every single sprint planning meeting.
The vision exists. It’s in the progression from bought tiles to clean-room schema. It’s in Calculon sitting next to Valhalla. It’s in Elzar generating styles that match a tile schema we designed. The work speaks for itself, but only if someone tells the story clearly enough that it becomes a roadmap, not a side project. And that someone is me.
There’s a harder question underneath all of this: is growing the technological asset even possible when the company is driven purely by annual recurring revenue (ARR)?
ARR optimizes for now. Ship the feature that closes the deal. Fix the bug that stops the churn. Hit the number this quarter. Every sprint is a negotiation between what moves the metric today and what makes the product real. And the metric always wins, because it’s concrete: a number on a dashboard, a target in a board deck.
The technological asset doesn’t have a dashboard. There’s no metric for “we now own our tile schema” or “we can debug a routing issue in hours instead of weeks.” The value is real but it compounds invisibly: faster iteration, fewer black-box surprises, the ability to say yes to product requests that would have been impossible before. None of that shows up in ARR until months or years later, and by then nobody connects the cause to the effect.
A company that’s purely ARR-driven will always deprioritize foundational work. Not out of malice. Out of measurement. If the only thing that counts is what moves the number this quarter, then building the asset that makes next year’s numbers possible will always lose. And you end up in a trap: a product that looks successful from the revenue side while its foundations quietly thin out, making it harder and more expensive to deliver on the promises that drive that very revenue.
The uncomfortable truth is that ARR and the technological asset need each other. ARR without the asset is a product that gets harder to maintain every quarter. The asset without ARR is an engineering project with no customers. The CTO’s job is to make the case that both need investment, and that starving one to feed the other is a debt that always comes due.
You start by depending. You end by being accountable. What happens in between is the difference between an MVP and a product.