Our largest docs sites now render ~6.4x faster

6 min read · Sandeep Dinesh, Chris Broadfoot

TL;DR: Every Fern docs site used to be one giant JSON blob, rewritten on every publish and reloaded in full on every page view. We broke it into small, content-addressed pieces, so a one-page edit only touches one page and the largest sites render ~6.4x faster.

Fern hosts documentation for companies building developer platforms. Docs are where a prospect forms a first impression of the product, and where existing customers go to figure out how to actually use it.

But as our customers grew into bigger organizations, slow page loads and publishes stopped being minor friction and started getting in the way of the basics: finding information and shipping updates.

How it used to work

A Fern Docs site is composed of many different pieces, including markdown pages, API references, images, config, and navigation. For a long time all of them lived as a single compressed JSON blob in our database: one object per site that was overwritten on every publish and read back in full to render every page.

This was a reasonable design when sites were small, but it started breaking as sites grew to the multi-version, multi-product scale our largest customers run today. For example, one of our largest customers' sites looks like this:

  • 2,000 content pages
  • 50 API versions, each with
    • 5 MB OpenAPI spec
    • 500 endpoints per spec
    • 1,500 schemas per spec
    • 600 API reference pages
  • 150 navigation sections

Data for 35,000 routes was loaded to serve a single page. Cold render times ranged from 1 to 3 minutes. We now render the same request in 5-10 seconds.

The problem with one big object

When an entire site is a single object, the whole thing becomes the unit of work. You can't touch a small part without dragging the rest along.

Fixing a typo on one page meant re-uploading the entire site, even though 99.9% of it was identical to what was already stored. Rendering a single page meant loading the whole blob into memory and deserializing it, which got slow enough on big sites that it outgrew the caching layer meant to make it fast. And every publish overwrote the last one, so there was no stored history. The new system makes diffs and rollbacks possible.

These problems compounded. Big blobs made page loads slow, slow loads pushed us toward heavier caching, and heavier caching made content go stale. At the worst of it we were seeing repeated outages, page loads well past 30 seconds, and large sites occasionally failing mid-render.

The fix: store the pieces, not the whole

The key realization was that most of a docs site doesn't change between publishes — a typical edit touches one or two pages out of hundreds. As a result, we rebuilt storage around three ideas:

  • Break the site into small pieces.
  • Identify each piece by a hash of its content.
  • Never modify a piece once it's written.

Identifying content by its hash makes "did this change?" trivial: same content, same hash, reuse what's there. A site is stored in layers — each page, the section it belongs to, and a top-level snapshot — so changing one page writes a new page, section, and snapshot, and shares everything else with the previous version. A single-page edit on a large site now writes a few kilobytes instead of re-uploading a multi-megabyte blob.
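To make the layering concrete, here is a minimal sketch of content-addressed storage in Python. It is not Fern's implementation: the in-memory `store`, the `put` and `publish` helpers, the JSON encoding, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical in-memory stand-in for blob storage: content hash -> bytes.
store: dict[str, bytes] = {}


def put(obj) -> str:
    """Store obj under the SHA-256 of its canonical JSON and return the hash."""
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    key = hashlib.sha256(data).hexdigest()
    # Same content, same hash: an existing piece is reused, never rewritten.
    store.setdefault(key, data)
    return key


def publish(site: dict[str, dict[str, str]]) -> str:
    """site maps section name -> {page path -> page body}.

    Writes pages, then sections, then a top-level snapshot, and returns
    the snapshot hash.
    """
    sections = {}
    for name, pages in site.items():
        page_hashes = {path: put({"body": body}) for path, body in pages.items()}
        sections[name] = put({"pages": page_hashes})
    return put({"sections": sections})


site = {
    "guides": {"/intro": "Welcome", "/auth": "Use an API key"},
    "reference": {"/users": "List users", "/orders": "List orders"},
}
v1 = publish(site)
count_v1 = len(store)  # 4 pages + 2 sections + 1 snapshot

# Edit a single page and publish again.
site["guides"]["/intro"] = "Welcome!"
v2 = publish(site)
new_objects = len(store) - count_v1  # one new page, section, and snapshot
```

In this toy model the second publish adds exactly three objects, and the untouched "reference" section and its pages are shared between both snapshots.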

The same rewrite let us flatten URL resolution. The old system walked a nested navigation tree at request time; now every valid path is written out as its own row at publish time, so resolving a URL is one lookup instead of a tree walk.
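A rough sketch of the flattening step follows. The tree shape, the `slug` and `page_id` field names, and the use of a dict as a stand-in for table rows are assumptions, not Fern's actual schema.

```python
def flatten(node: dict, prefix: str = "") -> dict[str, str]:
    """Walk the nested navigation tree once, at publish time.

    Returns one entry per valid path, mapping the full URL path to a page id.
    """
    path = prefix + "/" + node["slug"] if node["slug"] else prefix
    routes = {}
    if "page_id" in node:
        routes[path or "/"] = node["page_id"]
    for child in node.get("children", []):
        routes.update(flatten(child, path))
    return routes


# Hypothetical navigation tree.
nav = {
    "slug": "",
    "page_id": "home",
    "children": [
        {
            "slug": "guides",
            "children": [
                {"slug": "intro", "page_id": "p1"},
                {"slug": "auth", "page_id": "p2"},
            ],
        },
        {"slug": "api", "page_id": "p3", "children": []},
    ],
}

# Built once per publish; each entry plays the role of one stored row.
route_table = flatten(nav)


def resolve(url_path: str):
    """Request-time resolution is a single lookup, with no tree walk."""
    return route_table.get(url_path)
```

Section nodes without a page of their own (like "guides" here) produce no row, so an unknown or non-page path simply resolves to `None`.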

Before vs after: the old read path pulled a 10-50 MB monolithic blob and hit Postgres; the new one is a short cascade of small, content-addressed S3 blobs.

Publishing got cheap

Publishing a documentation site now works as a quick back-and-forth between the CLI and our backend. First, the CLI tells the backend what the site looks like; the backend figures out what's actually new and hands back upload spots for just the missing pieces. The CLI then uploads those pieces straight to storage, so the file bytes never pass through our servers, and tells the backend to finish up.

If nothing changed, the backend notices immediately and stops. That matters more than it sounds: plenty of CI pipelines publish on every merge whether or not the docs were touched, and those now finish in milliseconds instead of doing pointless work.
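The exchange might look roughly like the sketch below. The `Backend` class, its method names, and the manifest format (path -> content hash) are hypothetical, and a plain dict stands in for both the backend's database and the storage the CLI uploads to directly.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class Backend:
    """Toy backend; all names here are illustrative assumptions."""

    def __init__(self):
        self.blobs: dict[str, bytes] = {}  # stand-in for object storage
        self.current = None                # manifest of the live version

    def begin_publish(self, manifest: dict[str, str]):
        """Return None if nothing changed, else the hashes still missing."""
        if manifest == self.current:
            return None
        return sorted({h for h in manifest.values() if h not in self.blobs})

    def upload(self, content_hash: str, data: bytes) -> None:
        # Stand-in for a direct-to-storage upload.
        if digest(data) != content_hash:
            raise ValueError("content does not match its hash")
        self.blobs[content_hash] = data

    def finish_publish(self, manifest: dict[str, str]) -> None:
        if any(h not in self.blobs for h in manifest.values()):
            raise ValueError("publish is missing pieces")
        self.current = dict(manifest)


def cli_publish(backend: Backend, files: dict[str, bytes]) -> int:
    """Publish files (path -> bytes); return how many pieces were uploaded."""
    manifest = {path: digest(data) for path, data in files.items()}
    missing = backend.begin_publish(manifest)
    if missing is None:
        return 0  # nothing changed, stop immediately
    by_hash = {digest(data): data for data in files.values()}
    for content_hash in missing:
        backend.upload(content_hash, by_hash[content_hash])
    backend.finish_publish(manifest)
    return len(missing)


backend = Backend()
files = {"/intro": b"Welcome", "/auth": b"Use an API key", "/users": b"List users"}
first = cli_publish(backend, files)    # every piece is new
repeat = cli_publish(backend, files)   # no-op publish
files["/intro"] = b"Welcome!"
edit = cli_publish(backend, files)     # only the edited page is uploaded
```

The no-op case is the early return: a CI run that republishes unchanged docs never reaches the upload step.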

We kept the heavy processing server-side on purpose. Improving how we break a site into pieces is just a backend deploy, so fixes reach every site at once, with no CLI upgrade required.

Reading got fast

When someone opens a docs page, our servers render it, and rendering had the same whole-blob problem as publishing. The old read path could reach back through several internal services before rendering anything, so a slow dependency anywhere blocked the page.

The new path is just a few reads straight from storage, and they're the same layers the site is stored in: a small top-level pointer holding the site's shell and a map of where everything lives; a bundle with the routes and sidebar for a section of pages; and the page content itself.
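As an illustration of that cascade, the sketch below renders a page with exactly three reads. The key names, JSON fields, and prefix-matching rule are assumptions made for the example; the real layout is not described in this post.

```python
import json

# Hypothetical storage contents; keys and field names are illustrative only.
storage = {
    "pointer/docs.example.com": json.dumps({
        "shell": {"title": "Example Docs"},
        "sections": {"/guides": "section-abc", "/api": "section-def"},
    }),
    "section-abc": json.dumps({
        "sidebar": ["/guides/intro", "/guides/auth"],
        "routes": {"/guides/intro": "page-123", "/guides/auth": "page-456"},
    }),
    "page-123": json.dumps({"body": "Welcome"}),
}

reads: list[str] = []  # records every storage read for inspection


def get(key: str) -> dict:
    reads.append(key)
    return json.loads(storage[key])


def render(host: str, path: str) -> dict:
    pointer = get(f"pointer/{host}")                    # read 1: top-level pointer
    prefixes = [p for p in pointer["sections"] if path.startswith(p + "/")]
    if not prefixes:
        raise KeyError(path)
    section_key = pointer["sections"][max(prefixes, key=len)]
    section = get(section_key)                          # read 2: section bundle
    page = get(section["routes"][path])                 # read 3: page content
    return {
        "title": pointer["shell"]["title"],
        "sidebar": section["sidebar"],
        "body": page["body"],
    }


result = render("docs.example.com", "/guides/intro")
```

Nothing outside the requested page's section is read, so the cost of a render no longer depends on the size of the whole site.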

Everything except that pointer is identified by content and never changes, so it caches effectively forever. The pointer has a short freshness window of a few seconds, and it's the only thing anyone waits on. Publish, the pointer flips over, and old cached pieces are simply never referenced again. No cache to bust.
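One way to picture the two caching rules is the sketch below. The `TieredCache` class, the 5-second window, and the injected clock are assumptions for illustration; the post only says the pointer's freshness window is a few seconds.

```python
import time


class TieredCache:
    """Immutable blobs are cached forever; pointers expire after a short TTL."""

    def __init__(self, fetch, pointer_ttl: float = 5.0, clock=time.monotonic):
        self.fetch = fetch              # function: key -> value (storage read)
        self.pointer_ttl = pointer_ttl  # assumed freshness window, in seconds
        self.clock = clock
        self.immutable = {}             # content hash -> value, never evicted here
        self.pointers = {}              # site -> (value, fetched_at)

    def get_blob(self, content_hash: str):
        if content_hash not in self.immutable:
            self.immutable[content_hash] = self.fetch(content_hash)
        return self.immutable[content_hash]

    def get_pointer(self, site: str):
        now = self.clock()
        hit = self.pointers.get(site)
        if hit is None or now - hit[1] >= self.pointer_ttl:
            hit = (self.fetch("pointer/" + site), now)
            self.pointers[site] = hit
        return hit[0]


# Demo with a fake clock and a dict as backing storage.
backing = {"pointer/docs": "snap-v1", "snap-v1": "site v1", "snap-v2": "site v2"}
fetches = []
now = [0.0]


def fetch(key):
    fetches.append(key)
    return backing[key]


cache = TieredCache(fetch, pointer_ttl=5.0, clock=lambda: now[0])

first = cache.get_blob(cache.get_pointer("docs"))

backing["pointer/docs"] = "snap-v2"                 # a publish flips the pointer
stale = cache.get_blob(cache.get_pointer("docs"))   # still inside the window

now[0] = 6.0                                        # window has elapsed
fresh = cache.get_blob(cache.get_pointer("docs"))
```

The old snapshot is never explicitly invalidated in this model. Once the pointer refreshes, nothing references `snap-v1` anymore, which is the "no cache to bust" property.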

This also killed a longstanding annoyance: the old setup had one cache entry per site, so changing anything invalidated everything. Now caches are scoped to what actually changed — a config tweak leaves your page content alone, and editing a page's body doesn't touch the pages around it.

The result: typical large (5,000+ routes) sites saw a ~5x improvement in average cold load times, dropping from 10+ seconds to 2.5 seconds.

Making sure nobody noticed

A storage rewrite on a live platform is only a success if it's invisible — every page on every site had to render identically before and after. So most of the actual effort wasn't the storage design; it was proving the new path matched the old one.

Screenshots turned out to be a weak signal: they cover few pages, two pages can look identical while differing in ways that matter (redirects, metadata, auth), and when they differ it's often for boring reasons like fonts. So instead we defined a clean description of what a site actually is — routes, pages, redirects, navigation, config — leaving out incidental details like database IDs. The rule: same input through both pipelines should produce the same description. Same site out.
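A comparison along those lines could be sketched as follows. The `describe` function and the shape of the two pipeline outputs are hypothetical; the point is only that incidental fields such as database IDs and storage hashes are left out before comparing.

```python
def describe(site: dict) -> dict:
    """Reduce a pipeline's output to the fields that define the site."""
    return {
        "routes": sorted(site["routes"]),
        "pages": {p["path"]: p["body"] for p in site["pages"]},
        "redirects": sorted((r["from"], r["to"]) for r in site["redirects"]),
        "navigation": site["navigation"],
        "config": site["config"],
    }


# Hypothetical output of the old pipeline: carries database IDs.
old_output = {
    "routes": ["/intro", "/auth"],
    "pages": [
        {"db_id": 17, "path": "/intro", "body": "Welcome"},
        {"db_id": 18, "path": "/auth", "body": "Use an API key"},
    ],
    "redirects": [{"db_id": 3, "from": "/start", "to": "/intro"}],
    "navigation": ["/intro", "/auth"],
    "config": {"title": "Example Docs"},
}

# Hypothetical output of the new pipeline: different order, hashes, no IDs.
new_output = {
    "routes": ["/auth", "/intro"],
    "pages": [
        {"hash": "ab12", "path": "/auth", "body": "Use an API key"},
        {"hash": "cd34", "path": "/intro", "body": "Welcome"},
    ],
    "redirects": [{"from": "/start", "to": "/intro"}],
    "navigation": ["/intro", "/auth"],
    "config": {"title": "Example Docs"},
}

same = describe(old_output) == describe(new_output)

# A dropped redirect would not show up in a screenshot, but it does here.
broken = dict(new_output, redirects=[])
regression = describe(old_output) != describe(broken)
```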

We also kept an honest ledger of what happened to every field as it changed formats, and nothing could be dropped without a written reason. The budget for user-visible changes was zero. And rather than only testing cases we thought of, we generated lots of valid-but-weird site configurations automatically, since those odd combinations are exactly what break migrations. Any mismatch they exposed was saved as a permanent test.
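The generated-configuration approach can be sketched with nothing more than a seeded random generator. Everything below is a toy: `random_site`, the slug list, and the two path-listing functions stand in for the real generators and pipelines, and a property-based testing library would be a reasonable alternative to the hand-rolled loop.

```python
import random


def random_site(rng: random.Random) -> dict:
    """Generate a valid-but-odd navigation tree: empty sections, deep nesting,
    repeated slugs, spaces, and non-ASCII characters."""
    slugs = ["intro", "v2.0", "a b", "ünïcode", "x" * 40, "api"]

    def node(depth: int) -> dict:
        count = 0 if depth == 0 else rng.randint(0, 3)
        return {
            "slug": rng.choice(slugs),
            "children": [node(depth - 1) for _ in range(count)],
        }

    top = [node(rng.randint(0, 4)) for _ in range(rng.randint(0, 3))]
    return {"slug": "", "children": top}


def old_paths(node: dict, prefix: str = "") -> set:
    """Stand-in for the old pipeline: recursive tree walk."""
    path = f"{prefix}/{node['slug']}" if node["slug"] else prefix
    out = {path or "/"}
    for child in node["children"]:
        out |= old_paths(child, path)
    return out


def new_paths(root: dict) -> set:
    """Stand-in for the new pipeline: iterative flattening."""
    out, stack = set(), [(root, "")]
    while stack:
        node, prefix = stack.pop()
        path = f"{prefix}/{node['slug']}" if node["slug"] else prefix
        out.add(path or "/")
        stack.extend((child, path) for child in node["children"])
    return out


def find_mismatches(seed_count: int = 200) -> list:
    """Return the seeds whose generated site differs between pipelines."""
    failures = []
    for seed in range(seed_count):
        site = random_site(random.Random(seed))
        if old_paths(site) != new_paths(site):
            failures.append(seed)
    return failures


mismatches = find_mismatches()
```

Because each site is derived from its seed, any seed that lands in `mismatches` can be replayed exactly and kept as a regression test.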

Where it landed

The numbers on real customer sites tell the story. Time to first byte on some of our largest sites:

Customer            Before          After           Improvement
NVIDIA              10.6 seconds    2.07 seconds    5.1x
Merge               3.5 seconds     0.55 seconds    6.4x
Payabli             14.6 seconds    2.7 seconds     5.4x
Really large site   1 minute        5 seconds       11.4x

What's next

Over the coming months we'll migrate every existing site onto the new storage system and retire the old storage and render paths. For existing customers, the switch is automatic: there's no action needed and no downtime.

And if you're wasting engineering time on docs infrastructure instead of your actual product, try Fern for free or book a demo.