
Neon Blog


Three new Neon CLI commands - neonctl link, checkout and env pull - make branch-first development the default for you and your agents.

Andre Landgraf–Staff Developer Advocate

Jun 10, 2026

We're hard at work shipping the private preview of Neon Platform and while we're at it we're rethinking Neon's developer experience from the ground up. A lot of that work is about making branch-first development the default, not something you have to wire up yourself.

The first piece is shipping today: three new Neon CLI commands that make branch-first development easier than ever, namely neonctl link, neonctl checkout and neonctl env pull. They benefit you whether you only use Neon Serverless Postgres or you're reaching for more of the Neon Platform going forward. And they're especially useful once you hand the feature-dev loop to a coding agent.

Link your workspace to a Neon project

neonctl link connects your local dev workspace to a Neon project, the same way vercel link does for your Vercel project.

neonctl link

It's interactive: pick a Neon org, pick or create a project and branch, then the CLI writes the org, project and branch IDs into a local .neon file. That file is git-ignored by default, so it stays a per-developer pointer.

And for agents that don't do interactive mode, there's neonctl link --agent.

If you're a long-time neonctl user, you'll know this was already possible through the lower-level set-context command. link is just an interactive wrapper on top of the lower-level primitive.

Once a workspace is linked, project- and branch-scoped commands stop needing --project-id and --branch flags. For example, this lists every branch in the linked project, no arguments required:

neonctl branch list

That's convenient on its own. It becomes a game-changer once you add env pull.

Pull branch-scoped env vars with env pull

neonctl env pull fetches the current branch's Neon environment variables into your existing .env file, or .env.local if you don't have one. You can also point it at any file with --file.

neonctl env pull

It pulls the env variables based on your current branch's enabled services. You always get DATABASE_URL and DATABASE_URL_UNPOOLED (the pooled and direct Postgres connection strings), plus the Neon Auth and Data API secrets when those services are enabled on the branch. And as our preview features roll out (object storage and an AI gateway) you'll get those too if you're using them. Only the Neon-managed keys are written, so everything else in your file is left untouched.
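To make the "only the Neon-managed keys are written" behavior concrete, here is a minimal sketch of that kind of merge over plain dotenv text. It is not neonctl's implementation: mergeManagedEnv is a hypothetical helper, and the connection strings are placeholders.

```typescript
// Hypothetical helper (not part of neonctl): overwrite or append managed keys,
// and pass every other line of the dotenv file through unchanged.
function mergeManagedEnv(existing: string, managed: Record<string, string>): string {
  const pending: Record<string, string> = { ...managed };
  const endsWithNewline = existing.slice(-1) === "\n";
  const body = endsWithNewline ? existing.slice(0, -1) : existing;
  const lines = body === "" ? [] : body.split("\n");

  const out = lines.map((line) => {
    const match = /^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=/.exec(line);
    if (match && Object.prototype.hasOwnProperty.call(pending, match[1])) {
      const key = match[1];
      const value = pending[key];
      delete pending[key]; // each managed key replaces its first occurrence only
      return key + "=" + value;
    }
    return line; // comments, blank lines and unmanaged keys stay as they are
  });

  // Managed keys that were not in the file yet are appended at the end.
  Object.keys(pending).forEach((key) => out.push(key + "=" + pending[key]));
  return out.join("\n") + (endsWithNewline ? "\n" : "");
}

const existingEnv = "STRIPE_KEY=sk_test_123\nDATABASE_URL=postgres://old-branch\n";
console.log(
  mergeManagedEnv(existingEnv, {
    DATABASE_URL: "postgres://new-branch-pooled",
    DATABASE_URL_UNPOOLED: "postgres://new-branch-direct",
  }),
);
```

In this sketch STRIPE_KEY survives untouched, DATABASE_URL is replaced in place, and DATABASE_URL_UNPOOLED is appended.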

There's no branch ID to pass, because it's already in .neon. Giving your teammates, both human and agent, an isolated Neon branch for development is great. Pulling all the branch-scoped connection details for that branch by hand is the annoying part. env pull is the fix.

In fact, you'll rarely run env pull yourself: as you'll see below, link and checkout run it for you, so pinning a branch and pulling its env are a single step.

Switch branches with checkout

The last ingredient is switching branches. Neon brings git-like branching to Postgres (and soon the full suite of Neon backend primitives), so inspired by git again, we added a quick utility command for it: checkout.

neonctl checkout dev-add-search

neonctl checkout <branch-name> lets you create a branch or check out an existing one. Run it without a name and you get an interactive branch picker with a create option:

neonctl checkout

Checkout updates the branch identifier in your .neon file so the next CLI commands target that branch.

The branch-first dev loop

Put them together and branching has never been easier to integrate tightly into your everyday dev workflow. Run neonctl link once when you start on a project:

neonctl link

Then run neonctl checkout whenever you'd reach for git checkout -b, at the start of every feature, fix or experiment, to give that work its own isolated Neon branch:

neonctl checkout dev-add-search

And as I mentioned earlier, neonctl env pull already runs under the hood for both link and checkout, so your .env (or .env.local) already holds the right Neon environment variables for the branch you just checked out. Your app always points at the branch you're actually working on and nothing leaks between features.

You can still run neonctl env pull directly to refresh a branch's env or pull a different branch into a specific file with --file.

Hand these commands to your agent

Branching is great for devs but essential for agents (I make a ton of mistakes but my agents even more so). An agent can run neonctl checkout between tasks to give itself a fresh, isolated database per feature with no shared state to corrupt and no connection strings to copy around.

I'd add a small paragraph to my own AGENTS.md (and CLAUDE.md) so my agents pick up the loop by default. Something like:

For every feature, whenever you run `git checkout -b`, also run `neonctl checkout <fitting-branch-name>` to create a matching Neon branch for that git branch. This also pulls the branch's Neon env variables into our local `.env`, so our database credentials are always branch scoped.

On top of that, install the new neon agent skill. It teaches your agent the full branch-first dev workflow with Neon plus other Neon Platform best practices:

npx skills add neondatabase/agent-skills -s neon -s neon-postgres

Then prompt your agent to do branch-first feature development and let it drive the loop.

What if you don't want env vars on disk?

Writing secrets into a local .env is the right default for most local dev. I bet you have a bunch of .env files on your machine even though you post about using varlock on X! However, neonctl env pull is optional. You can opt out and we give you alternative ways to inject your Neon env variables for local dev.

First, pass --no-env-pull to opt out of the bundled pull:

neonctl link --no-env-pull
neonctl checkout dev-add-search --no-env-pull

Inject it at runtime

Our new @neondatabase/env package gives you a few ways to inject your branch's env at runtime instead of writing it to disk:

  • neon-env run to inject env into your dev command
  • fetchEnv to resolve env in code
  • neonctl dev (coming soon) for Neon Functions

neon-env run fetches your branch's Neon variables and injects them into the child process, so nothing is written to disk:

npm i @neondatabase/env

neon-env run -- npm run dev

fetchEnv does the same in code. Pass it your neon.ts config and the branch to resolve and it hands you back a typed env object you can read at your app's bootstrap:

import { fetchEnv } from "@neondatabase/env/v1";
import config from "./neon";

const env = await fetchEnv(config, { projectId, branchId });
console.log(env.postgres.databaseUrl);

neonctl dev is coming soon for Neon Functions. It runs your local dev server with that same branch env injected, so the branch-first loop carries straight over to Neon Functions development:

neonctl dev

Pull it into your env manager

And if you're actually using varlock, @neondatabase/env also ships a neon-env export command that prints your branch's env to stdout as dotenv lines or JSON. You can pull Neon straight into a varlock .env.schema with a bulk loader:

# .env.schema
# @setValuesBulk(exec(`neon-env export --format json`), format=json)

Now varlock resolves your Neon branch's DATABASE_URL (and friends) on demand and you still get varlock's schema validation and secret redaction on top.

Make it stick

To make sure you never forget the flag, wrap the commands in your package.json scripts:

{
  "scripts": {
    "neon:checkout": "neonctl checkout --no-env-pull",
    "dev": "neon-env run -- next dev"
  }
}

And for your agents, add a line to your AGENTS.md so they follow the same rule. For example: "Use neonctl checkout <branch> --no-env-pull and never run neonctl env pull; env is injected at runtime via neon-env run."

Wrapping up

Our agents ship feature after feature (or even in parallel). Branching is key to isolating your infra per feature. Our goal is to give you the best primitives for branch-first development: use them as documented here or make them your own and build your own abstractions on top!

This is the first of several DX improvements landing as we build toward the Neon Platform private preview, with more CLI commands and new SDKs on the way. If there's something you wish the Neon CLI did, drop into the Neon Discord and tell us.

And if the little Neon Platform teasers got you interested, sign up for the private preview here.

Happy coding!

Starting June 1, 2026, every Neon paid plan includes 500 GB of data transfer per month, up from 100 GB. The change is automatic and appears on your June invoice.

We're increasing the amount of public data transfer included in every Neon paid plan, from 100 GB to 500 GB per month. This completely removes data transfer (or "egress") charges for most workloads.

The change takes effect automatically on June 1, 2026, and applies to all paid plans. There's no action required, and the new allowance will appear on your June invoice.

Why we're doing this

Few things are more frustrating than an unexpected egress charge landing on your invoice. A chatty client, a backfill, a misconfigured connection, or a noisy analytics job can move more data than you expect, and the cost often shows up after the fact, when it's too late to do much about it.

Raising the included amount to 500 GB removes that surprise for most Neon customers.

If you do move more than 500 GB in a month, anything above the included amount incurs data transfer fees exactly as before. Nothing else about how data transfer is metered or priced is changing.
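As a quick illustration of the allowance, the billable amount is simply whatever exceeds the included transfer. The sketch below only computes billable gigabytes; the per-GB rate is not stated here, so it is left out, and billableTransferGb is a hypothetical name.

```typescript
// Included monthly data transfer on paid plans, per the announcement above.
const INCLUDED_TRANSFER_GB = 500;

// Hypothetical helper: gigabytes that would be billed beyond the included amount.
function billableTransferGb(usedGb: number, includedGb: number = INCLUDED_TRANSFER_GB): number {
  return Math.max(0, usedGb - includedGb);
}

console.log(billableTransferGb(320)); // 0
console.log(billableTransferGb(650)); // 150
console.log(billableTransferGb(650, 100)); // 550 under the previous 100 GB allowance
```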

What you need to do

Nothing at all! The higher allowance applies automatically to every paid plan on June 1, 2026, and will be reflected in your June invoice.

If you want to see your own usage, you can track egress in the Neon Console under your billing and usage details. Questions about the change are welcome in the Neon community on Discord.

In the agentic era, APIs are the product. I gave my coding agent the Neon API, our OpenAPI spec, and the TypeScript admin SDK, and asked it to rebuild the Neon dashboard. Meet Neon Slop Fork.

Andre Landgraf–Senior Developer Advocate

May 28, 2026

Agents are changing how we build software and how we interact with developer tools. Agents are much better at using APIs, CLIs, and MCP servers than clicking through dashboards.

They can use UIs. Tools like agent-browser are getting pretty good. But navigating complex dashboards is still slower, more fragile, and harder than giving an agent an API key and letting it use programmatic tools directly.

That's why APIs matter even more in the age of agents. The best developer platforms are becoming agent-native by exposing functionality through:

  • REST APIs
  • CLIs
  • MCP servers
  • SDKs

This means agents can become far more autonomous. They can provision resources, inspect state, automate workflows, and manage infrastructure without needing to "drive" a UI or ask their human to provision and manage infrastructure for them. With the auth.md spec, agents can even sign up for services themselves.

This is already happening everywhere:

  • Salesforce is pushing hard into APIs and agents with Salesforce headless.
  • Google Cloud is investing heavily into MCP and agent tooling.
  • X recently announced their own MCP server.

Following the same line of thinking, creating the best possible developer and agent experience, we at Neon aim to expose all platform functionality through our open API. If you can take an action in the Neon console, the associated endpoint should be open too. Going even further, we're releasing APIs for things that aren't exposed through the console yet, like our new consumption metrics API. To illustrate this, I slop forked the Neon console.

Meet Neon Slop Fork

Screenshot of the Neon Slop Fork dashboard showing the branch overview for the "my-app" project — compute, storage, history, and network transfer metrics, plus the primary compute and connect button.

I gave my coding agent:

  • the Neon API
  • our OpenAPI spec
  • the TypeScript admin SDK

Then I told it: "Go recreate the Neon dashboard — make no mistakes."

And no mistakes it did.

The fork supports:

  • project management
  • branch creation
  • compute controls
  • consumption views
  • Neon Auth
  • Data API
  • and more

I skipped billing, payments, and team management because that would require integrating email flows and Stripe APIs. Everything else is powered directly through Neon APIs.
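For a sense of what "powered directly through Neon APIs" looks like, here is a minimal sketch that lists projects over the public REST API. It assumes the v2 endpoint at console.neon.tech with a bearer API key and a response containing a projects array; listProjects, FetchLike and ProjectSummary are illustrative names, not taken from the Slop Fork code, and organization accounts may need extra query parameters.

```typescript
// Minimal fetch-like signature so the sketch can run against a stub in tests.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface ProjectSummary {
  id: string;
  name: string;
}

// Hypothetical helper: fetch the project list and keep only id and name.
async function listProjects(apiKey: string, fetchImpl: FetchLike): Promise<ProjectSummary[]> {
  const res = await fetchImpl("https://console.neon.tech/api/v2/projects", {
    headers: { Authorization: "Bearer " + apiKey, Accept: "application/json" },
  });
  if (!res.ok) {
    throw new Error("Neon API request failed with status " + res.status);
  }
  // Assumed response shape: { projects: [{ id, name, ... }, ...] }
  const body = (await res.json()) as { projects?: ProjectSummary[] };
  return (body.projects || []).map((p) => ({ id: p.id, name: p.name }));
}

// Usage with the built-in fetch (Node 18+ or a browser), given an API key:
//   listProjects(myApiKey, fetch).then((projects) => console.log(projects));
```

Passing fetch in as a parameter is a design choice for the sketch: it keeps the function testable without network access.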

How I built it

All it took was a few prompts and tools:

  • Cursor as the coding agent. Any SOTA coding agent should get you to the same place — pick whichever one you already pay for.
  • agent-browser as a tool the agent could call to drive a real browser, click around the live Neon console, and compare what it had built against it.

I started with a few setup prompts to scaffold the repo (create a public GitHub repo, drop in Next.js + Better Auth + a small meta DB for virtual orgs and users, point it at my real Neon org, deploy to Vercel). Then one main prompt:

Clone the Neon console end-to-end on top of the public API. No shortcuts, no mocks. If the public API can't do something, disable the button. Use agent-browser to verify each screen against the real Neon dashboard before you say you're done.

Then, of course, I had to follow up with a few short fix prompts to make it feel finished:

  • Add spinners or skeletons for loading UIs so nothing feels frozen.
  • Gray out anything the public API doesn't cover (sign in with GitHub, billing, team invites).
  • Fix the reflow in Backup & Restore.
  • Match the sidebar tabs to the real Neon console — drop the ones we don't have, line up the ones we do.
  • Inspect Neon's branch-creation modal with agent-browser and port the previous data and TTL options over.

Rough totals:

  • About 20 prompts.
  • ~6M input tokens and ~300k output tokens. Most of the spend is input.
  • ~$100 in Claude Opus credits, give or take.
  • Around half a Saturday afternoon.

I almost certainly did more than I needed to. You can easily one-shot a workable version with 1-2 prompts and 1-2 hours of agent run time.

Tools like agent-browser let the agent navigate both the real and the slop-forked dashboards, so it can verify and fix its work more autonomously. That has been a huge unlock for longer and more successful agent runs for me.

Why slop fork

First, it's a pretty cool demonstration of how capable the Neon API already is. You can build a fully custom dashboard on top of Neon today. That's important not only because of agent access but also because Neon itself is used as infrastructure behind other platforms.

For example:

  • Vercel's Neon Marketplace Integration
  • Netlify DB is powered by Neon
  • Replit uses Neon for Serverless Postgres
  • Laravel Cloud uses Neon as well

Those platforms manage fleets of Neon databases through the same APIs you can access yourself.

So why not build your own management surface too?

Slop forks as product prototyping

This gets even more interesting internally. A slop fork is actually a very fast way to prototype product ideas. For example, we're currently exploring object storage capabilities at Neon. With a fork like this, it becomes trivial to experiment with:

  • new UI concepts
  • dashboard layouts
  • workflows
  • resource management experiences

Then see how it feels. That feedback loop is much faster than local development on the real Neon console codebase.

Build your own Neon dashboard

The fun part is that this idea is not limited to Neon employees. You can fork this project and build your own custom Neon dashboard around your workflow.

Maybe you want:

  • tighter integrations with your own tooling
  • custom observability views
  • internal deployment workflows
  • simplified controls for your team
  • AI-native workflows
  • or a completely different UX philosophy

Go build it! And if you think the official Neon console should work differently, tell us. We genuinely want that feedback — drop into the Neon Discord and let us know what's missing.

But sometimes the best workflow for you is highly specific to your own projects and infrastructure. APIs make that possible.

In the agentic era, APIs are the product.

With that said: slop fork Neon and happy prompting!

We're building the boring backend for apps and agents

Why Neon is expanding beyond Postgres into a branchable stack of backend primitives — auth, data API, object storage, compute, and an AI gateway — for the agentic era.

Everyone has been talking about throwing it all away and building entirely new magic sci-fi cloud infrastructure for agents.

Amidst all the hype, this tweet stood out to me as a voice of reason:

"The agent-native cloud needs boring primitives more than magic. Identity, permissions, logs, rollback, and cost controls before the sci-fi layer."

@rtheoryxyz

Building real infrastructure is hard enough as it is. AI has only raised the stakes, dialing up the operational requirements and pushing the limits in new and unexpected ways. When autonomous agents or developers move at breakneck speeds, applications break. "Magic" won't help you recover when a runaway agent deletes your production database and all its backups. Robust, familiar infrastructure with rollbacks, AI-friendly APIs and higher operational capacity is the way forward.

The Hard Requirements of the AI Era

When we founded Neon four years ago, the core principles laid out in our Hello World post were aimed at helping human developers move faster. As luck would have it, the AI era has shifted those exact principles into the "hard requirements" column:

  • Low entry cost: When code generation is free and instant, even a $5 upfront infrastructure cost is a non-starter.
  • Branching: Code has always had isolated environments, but the data stack lacked them. This created a massive gap in the ability to experiment safely.
  • Serverless: Infrastructure should live automatically in the background, scaling instantly to meet shifting usage demands. A backend shouldn't be t-shirt sized; it should precisely match what the application demands of it.

Human developers make mistakes (cue the Matrix meme: "Only human"). But AI coding agents make mistakes at a blistering, automated velocity that traditional infrastructure simply wasn't designed to handle. Without strict guardrails, agents will tear down systems just as quickly as they build them.

An Agentic Stack Built by Systems Engineers

Neon's serverless Postgres branching changed how developers work by ensuring every single database change could be validated in an isolated environment. At this point, we start tens of millions of branches every day. Now, we're taking the same copy-on-write, instant branching approach and applying it to the full suite of backend primitives today's agent stack requires.

The Complete Agent-Native Backend
  • Postgres Database — ✅ Available
  • Authentication — ✅ Available
  • Data API — ✅ Available
  • Object Storage — 🔜 Coming Soon
  • Compute — 🔜 Coming Soon
  • AI Gateway — 🔜 Coming Soon

Scaling with Enterprise Muscle

One year after joining Databricks, the benefits are showing on both sides. Lakebase, the same technology as Neon on Databricks, is the fastest-growing new offering in Databricks history. In turn, being part of a larger company has helped us grow our database team with world-class engineers, improve platform performance, lower costs, and now ship mature, battle-tested products to developers on Neon:

The AI Gateway for example already handles more than 125 trillion tokens a month, hardened by rigorous enterprise requirements for day-0 model coverage, high availability, deep metrics, logging, and granular cost controls.

To be clear: We are not shifting focus away from our core database product. Postgres remains the bedrock for everything we do. The Neon team has aggressively expanded within Databricks, and we've hired top-tier, senior engineering talent from other major database services. We are expanding our platform by building entirely new, dedicated teams while simultaneously growing our core Postgres engineering powerhouse.

We're building the boring infrastructure layer. Go build sci-fi.


FAQ

Does this mean you're focusing less on database?

No. The same storage and compute technology powers both Neon Serverless Postgres and Databricks Lakebase, so every improvement to the core engine benefits both products. Lakebase serves large enterprise customers; Neon serves startups, agent platforms and individual developers. Both are growing, and that growth funds a bigger systems engineering team, not a smaller one.

Today, around 120 engineers work across storage, compute, proxy, and Postgres itself, including upstream contributions. The new primitives (auth, object storage, compute, AI gateway) are built by new, dedicated teams. We're adding to the platform, not redirecting from it. To accelerate progress of the core database platform, we've brought in senior engineering talent from other major database and cloud services over the past year. The Postgres team is the largest it's ever been.

Are you building an entire cloud platform?

No. We're focused on the primitives that apps and agents need to function: database, authentication, data API, object storage, compute, and an AI gateway. These are the pieces where branching, instant provisioning, and scale-to-zero matter most. For everything else, you'll still want the tools you already use: front-end hosting (Vercel, Netlify), email (Resend), error tracking (Sentry), and so on. We're not trying to replace them.

Why AI Gateway?

The lines are starting to blur between applications and agents, but regardless of what we call them, the lifeblood of what everyone is building today is inference. We're bringing reliable, scalable inference directly to you when you build your backend in Neon.

We're not building this from scratch. Databricks already operates an AI gateway that handles trillions of requests a day for everyone from Fortune 500 enterprises to popular coding agents, with day-0 coverage of new models, rate limiting, logging, metrics, and cost controls.

When will the new primitives be available?

Authentication and the Data API are available today. Object Storage, Compute, and the AI Gateway are coming soon. If you want early access, sign up above and we'll reach out when each one is ready.

Will existing Neon projects need to change?

No. Your existing databases, branches, and connections keep working exactly as they do now. The new primitives are additive. Adopt the ones you need, ignore the ones you don't.

In the last year, agents have strained the limits of cloud infrastructure with new usage patterns: higher throughput of control-plane operations, more demand for on-demand infrastructure, and capacity crunch. The resulting spate of failures and incidents amongst cloud services has taught us lessons that inform our reliability roadmap...

We've managed to give customers up to a 5x performance increase on write-heavy workloads by disabling full-page writes, a Postgres durability safety feature that Neon's own storage engine makes redundant.

David Wein, Vlad Lazar

May 07, 2026

Note: This is a cross-post of an engineering blog post that was originally published on Databricks. Neon and Databricks Lakebase both run on the same technology, and this engineering optimization benefits customers of both platforms.

In Neon's lakebase architecture, compute and storage are separated by design. While this separation was originally built for operational flexibility, including scaling, branching, and instant recovery, it also unlocks a massive performance frontier.

By decoupling these layers, we can offload work from your Postgres compute to our distributed storage in ways that are structurally impossible in traditional, monolithic Postgres deployments. In this post, we will explore how we exploited this architectural advantage to eliminate a decade-old Postgres bottleneck, improving Postgres write throughput by up to 5x while reducing read tail latencies by up to 2x and WAL traffic by 94%.

The hidden cost of traditional Postgres durability

To understand how we achieved a 5x improvement in managed Postgres performance, we have to look at how traditional Postgres handles durability.

In Postgres, every database change is first saved to a sequential log (the Write-Ahead Log, or WAL) to ensure data isn't lost in a crash. To keep crash recovery times fast, Postgres periodically performs a background cleanup event called a "checkpoint." Unlike a snapshot, a checkpoint is simply a milestone marker in the log. During a checkpoint, Postgres takes all the modified data currently in memory (managed in 8KB chunks called "pages") and flushes it to the main disk, up to a specific point in the log. If a crash happens, Postgres restores your data by starting at that checkpoint milestone and replaying the recent WAL records on top of the on-disk data.

However, there's a risk: if the server crashes exactly while saving an 8KB page to disk, the page might only get partially written, resulting in a corrupted "torn page." If Postgres tries to replay a tiny log update over a torn page, the data is permanently ruined. To fix this, Postgres has to ensure it never relies on a corrupted disk for recovery.

It does this using a "Full Page Write" (FPW). The very first time a page is modified after a checkpoint milestone, Postgres doesn't just log the tiny change; it copies the entire 8KB page into the WAL. If a crash happens and the disk page is torn, Postgres ignores the ruined disk, grabs the pristine 8KB backup from the WAL, and uses that as the perfect starting point to replay the rest of the logs. While this guarantees absolute safety, it is expensive: on write-heavy applications, logging entire 8KB pages can inflate log volume by up to 15x, often becoming the system's biggest performance bottleneck.
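A toy model makes the cost visible. The sketch below counts WAL bytes for a sequence of page writes with and without full page writes; it is a deliberate simplification (real WAL records carry headers, and full page images can be compressed), and walBytes and WalEvent are hypothetical names.

```typescript
const PAGE_SIZE_BYTES = 8192; // Postgres default page size

type WalEvent =
  | { kind: "write"; page: number; deltaBytes: number }
  | { kind: "checkpoint" };

// Toy model: with full page writes on, the first write to a page after a
// checkpoint logs the whole page; every other write logs only its delta.
function walBytes(events: WalEvent[], fullPageWrites: boolean): number {
  let touchedSinceCheckpoint: Record<number, boolean> = {};
  let total = 0;
  for (const event of events) {
    if (event.kind === "checkpoint") {
      touchedSinceCheckpoint = {}; // every page counts as "first touch" again
      continue;
    }
    const firstTouch = !touchedSinceCheckpoint[event.page];
    touchedSinceCheckpoint[event.page] = true;
    total += fullPageWrites && firstTouch ? PAGE_SIZE_BYTES : event.deltaBytes;
  }
  return total;
}

const workload: WalEvent[] = [
  { kind: "write", page: 1, deltaBytes: 100 },
  { kind: "write", page: 2, deltaBytes: 100 },
  { kind: "write", page: 1, deltaBytes: 100 },
  { kind: "checkpoint" },
  { kind: "write", page: 1, deltaBytes: 100 },
];

console.log(walBytes(workload, true)); // 24676
console.log(walBytes(workload, false)); // 400
```

Three of the four writes are a first touch after a checkpoint, so with full page writes enabled almost all of the logged volume is page images rather than actual changes.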

Neon storage eliminates the risk of torn pages

In Neon your compute is stateless. It does not rely on a local data directory. Instead, it streams WAL to a Paxos-based quorum of safekeepers.

Because there is no local-disk page to tear, the failure mode FPW was designed to prevent simply does not exist. However, naively turning off FPW creates a secondary problem: read performance. Without those periodic full page images in the log, the storage layer would have to replay an arbitrarily long chain of small deltas to reconstruct a page for a read request. A replay that was once bounded by the checkpoint interval becomes an unbounded chain, leading to a spike in read latency and resource consumption.

Image generation pushdown to distributed storage

We solved this by moving the intelligence from the compute node to the storage layer. We call this image generation pushdown.

When Postgres compute requests a page from storage, the pageserver (a component of Neon's distributed storage system) reconstructs it by finding the most recent materialized image of that page and replaying any WAL deltas on top. The full page images that the compute used to embed in WAL doubled as periodic reset points in that delta chain, naturally keeping the chain reasonably bounded and reads fast. For a deeper treatment of this mechanism, see Deep dive into Neon storage engine.

With full page writes disabled, those reset points disappear. Without additional intelligence in the distributed storage system, a frequently updated page could accumulate a long chain of small deltas with no intervening image. The pageserver would then have to replay the entire chain to serve a read, increasing read latency and resource consumption.

To avoid this problem we pushed down the image-generation responsibility from the compute's WAL stream into the storage layer, preserving the bounded read behavior of storage while still eliminating the WAL overhead on the compute. The pageserver now generates full page images when a page has accumulated more delta records than a configured threshold without an intervening image. This is a naturally better approach because the decision to generate a new image is based on the actual number of changes to a page rather than the unrelated Postgres checkpoint process.

Here's why this is significantly better for performance:

  1. Network efficiency: The compute sends only the compact deltas, which are the actual changes, leading to a 94% reduction in traffic in our benchmarks.
  2. Scalability: Work is moved from the single Postgres writer to the distributed, independently scalable storage layer. Image generation for a project branch is now shared across multiple pageservers in the background.
  3. Optimal reads: When images are generated is now based on actual changes to a page rather than the unrelated Postgres checkpoint process.
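The threshold rule described above can be sketched with a toy page store. This is an illustration only, not the pageserver's code: pages are modeled as single numbers, deltas as increments, and ToyPageStore and its method names are hypothetical.

```typescript
interface PageHistory {
  image: number; // last materialized page image
  deltas: number[]; // delta records received since that image
}

// Toy storage layer: materializes a new image once a page has accumulated
// more deltas than the configured threshold without an intervening image.
class ToyPageStore {
  private pages: Record<number, PageHistory> = {};
  private threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  ingest(pageId: number, delta: number): void {
    if (!this.pages[pageId]) {
      this.pages[pageId] = { image: 0, deltas: [] };
    }
    const page = this.pages[pageId];
    page.deltas.push(delta);
    if (page.deltas.length > this.threshold) {
      page.image = this.replay(page); // image generation happens in storage
      page.deltas = [];
    }
  }

  // A read finds the latest image and replays the remaining deltas on top.
  read(pageId: number): number {
    const page = this.pages[pageId];
    return page ? this.replay(page) : 0;
  }

  chainLength(pageId: number): number {
    const page = this.pages[pageId];
    return page ? page.deltas.length : 0;
  }

  private replay(page: PageHistory): number {
    return page.deltas.reduce((value, delta) => value + delta, page.image);
  }
}

const store = new ToyPageStore(3);
for (let i = 0; i < 5; i++) {
  store.ingest(7, 1);
}
console.log(store.read(7)); // 5
console.log(store.chainLength(7)); // 1
```

However many writes a page receives, a read in this model never replays more than threshold deltas, which is the bounded-read property the storage layer preserves.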

Quantifying the impact: from lab to production

We benchmarked this optimization using HammerDB TPROC-C (a TPC-C derived OLTP benchmark) and validated the results across real-world production workloads.

1. Serverless compute scaling

Throughput is measured in new orders per minute (NOPM). The gains scale dramatically with the size of the compute instance:

Compute size | Before (NOPM) | After (NOPM) | Throughput gain
4-vCPU       | 78,876        | 94,891       | 20%
16-vCPU      | 95,832        | 269,189      | 2.8x
32-vCPU      | 95,686        | 439,300      | 4.5x+

On a 32-vCPU compute, throughput rose to more than 4.5x the baseline, an increase of roughly 360%.

With full page images generated on compute, each transaction generates 58 KB of WAL on average. With image generation pushed down, that drops to under 4 KB, a 94% reduction. The throughput improvement follows directly: less WAL means less contention on the write path, less network bandwidth consumed, and less work for the storage layer to ingest.

By removing Postgres's FPW bottleneck, we allowed throughput to scale linearly with compute resources. This is something monolithic Postgres struggles to do under heavy write load.
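The ratios quoted in this section can be rechecked from the raw numbers. The snippet below is plain arithmetic on the benchmark figures given above; the 4 KB value is the stated upper bound for WAL per transaction, so the computed reduction is a lower bound.

```typescript
const benchmark = [
  { size: "4-vCPU", before: 78876, after: 94891 },
  { size: "16-vCPU", before: 95832, after: 269189 },
  { size: "32-vCPU", before: 95686, after: 439300 },
];

// Throughput after the change, as a multiple of the baseline.
function ratio(before: number, after: number): number {
  return after / before;
}

// Percentage reduction from before to after.
function reductionPercent(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

for (const row of benchmark) {
  console.log(row.size + ": " + ratio(row.before, row.after).toFixed(2) + "x");
}
// 4-vCPU: 1.20x
// 16-vCPU: 2.81x
// 32-vCPU: 4.59x

console.log(reductionPercent(58, 4).toFixed(1) + "%"); // 93.1%
```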

2. Real-world production validation

In a production environment for a high-profile 56 vCPU project, enabling image pushdown reduced steady-state WAL generation from 30 MB/s to just 1 MB/s.

[Chart: Prod customer WAL rate (lower is better)]

This decrease in volume correlated directly to increased transaction throughput during daily peaks.

This did not just help writes. By optimizing the delta chains, the number of WAL records that must be applied per read dropped significantly. We saw p99 read latencies drop by 30% to 50% and p50 latencies drop by approximately 30%.

[Chart: Prod customer throughput (higher is better)]

Zooming out to the regional level, after enablement we saw the total amount of WAL generated by computes drop by up to 4x. P99 latency of reads from the storage engine improved by up to 3x and became much more stable.

[Chart: Regional WAL ingest rate (lower is better)]

[Chart: Regional storage p99 page retrieval latency (lower is better)]

3. Synced tables

For data-intensive Synced Tables (a special feature of Databricks where analytics tables are automatically synced to Postgres), the impact was immediate. One customer saw ingestion throughput jump from 17k rows per second to 62k rows per second, an increase of more than 3x, simply by enabling image pushdown.

[Chart: Prod customer sync table (higher is better)]

FPWs seamlessly turned off for all databases

Since late March, we have rolled this out across our entire fleet. It is now active for all Neon databases globally.

The change was applied to running computes via our control plane and storage system, which coordinated the transition automatically. This was achieved using the existing Postgres XLOG_FPW_CHANGE WAL record mechanism, meaning no restarts or interruptions were required for our customers.

What's Next

Neon's lakebase architecture was built for flexibility, but it was designed for performance. Pushing down full page writes is part of a systematic effort to harvest the benefits of storage and compute separation.

Just as we introduced cache prewarming for zero-downtime patching, we are continuing to move heavy-lifting tasks away from your transactions and into our scalable background storage stack. The Postgres write tax is officially a thing of the past.

Rest easy knowing you'll get alerted if your spend hits a certain level

Your team ships a new feature, traffic spikes, and autoscaling does its job. Great — until the bill arrives and it's three times what anyone expected. By then it's too late to do anything about it.

Most cloud providers handle this the same way: you find out what you spent after you've already spent it. Monitoring tools can help, but they live outside your database console, require separate setup, and still only tell you what happened — they don't give you a way to act on it.

We think cost controls should be built in, not bolted on. That's why we're introducing spending limits for Neon organizations.

What are spending limits?

A spending limit is a monthly dollar threshold you set for your organization. When your spend approaches or reaches that threshold, Neon takes action — today that means email alerts, and soon it will mean the option to automatically suspend project computes.

You set it once, and it works in the background. No external monitoring to configure, no third-party integrations to maintain.

How it works

Setting up a spending limit takes about 30 seconds:

  1. Navigate to your organization's Billing page in the Neon console.
  2. Find the Spending limit card and click Enable.
  3. Enter a monthly dollar amount.
  4. Choose what happens when the limit is reached — Send email alerts is available now, with Suspend projects coming soon.
  5. Click Enable, and you're done.

Once enabled, your Billing page shows a progress bar with your current spend relative to the limit. Neon checks your organization's spend every 15 minutes.

Approaching and exceeding the limit

When your organization reaches 80% of its spending limit, org admins receive an email alert. A second alert fires at 100%. In addition, a banner appears across the Neon console so that everyone on the team has visibility — not just whoever set up the limit.

The banner includes a direct link to adjust your limit, so you can react immediately without navigating through the billing page.
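
To make the two alert levels concrete, here is a minimal sketch of how a periodic check could decide which alerts to send. It is not Neon's implementation: the `SpendingLimit` class and its fields are hypothetical, and sending each alert only once per billing cycle is an assumption. Only the 80% and 100% thresholds come from the description above.

```python
from dataclasses import dataclass, field

# Thresholds from the post: one alert at 80% of the limit, another at 100%.
ALERT_THRESHOLDS = (0.80, 1.00)


@dataclass
class SpendingLimit:
    """Hypothetical model of one organization's monthly spending limit."""

    monthly_limit_usd: float
    # Assumption: each threshold alerts at most once per billing cycle.
    alerts_sent: set = field(default_factory=set)

    def check(self, current_spend_usd: float) -> list:
        """Run one periodic check and return the thresholds that newly fired."""
        ratio = current_spend_usd / self.monthly_limit_usd
        fired = [
            threshold
            for threshold in ALERT_THRESHOLDS
            if ratio >= threshold and threshold not in self.alerts_sent
        ]
        self.alerts_sent.update(fired)
        return fired
```

With a $100 limit, a check at $80 would fire the first alert, a later check at $85 would fire nothing, and a check at $120 would fire the second alert.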

Editing or disabling

Org admins can edit the dollar amount or disable the spending limit entirely at any time from the same Spending limit card. Changes take effect on the next 15-minute check cycle.

What's coming next

Today, spending limits notify you. Soon, they'll be able to enforce.

The Suspend projects option is already visible in the setup dialog with a "Coming soon" badge. When it ships, reaching your spending limit will automatically pause compute for all projects in the organization — a hard guardrail that prevents runaway costs without requiring anyone to be online to react. Computes will resume automatically when the limit is raised or a new billing cycle begins.

This gives teams two levels of control: alerts for awareness, suspension for enforcement. Use one or both depending on how tightly you need to manage spend.

Get started

Spending limits are available today for all organizations on paid Neon plans. Head to your Billing page to set one up.

Have feedback or questions? Let us know in Discord or check the spending limits documentation for more details.

Lessons from making Neon's docs agent-readable: MDX-to-Markdown pipelines, content negotiation, llms.txt structure, and a scan of 250+ doc sites.

Philip Olson–Documentation Engineer

Apr 23, 2026

A year ago, if you asked an agent about Neon, you got whatever it half-remembered from training. Now it goes looking and reads what it finds. Our docs were written for humans who scroll, not machines that fetch.

We've been fixing this in pieces, not all at once. This post is what worked, what didn't, and what we're still figuring out. Maybe it saves you a few curl commands.

The setup

Agents can read HTML fine. Crawlers have been at it for decades and modern agents handle it well. We just think we can do better. Our pages are built from dozens of rendered React components (<Admonition>, <CodeTabs>, <DetailIconCards>, <Steps>), which expand into nested <div>s, class names, and event handlers in the final HTML. The actual docs are buried in there somewhere.

You might think: just serve your source MDX from GitHub. We once did, and it works. MDX is Markdown with React components mixed in. Our MDX uses 30+ custom ones, and some, like <SharedContent>, inline text from separate files at render time. An agent reading the raw MDX just sees the tag.

You will correctly say: convert them. We do now, after plenty of yak shaving.

Phase 1: hand-maintained text files

Our first approach: ask Claude for "one of them cool llms.txt things that all the kids are talking about." It produced a public/llms/ directory, one .txt file per doc page, and an enormous llms.txt index listing them all. Keeping them current was a handful of Python scripts, run by hand, no CI.

It worked. The thinking at the time was "feed the models" not "serve the agents" (the spec itself leans that way). Live fetching was new and rare. Predictably, the files drifted from the source, went missing, or went stale for weeks at a time. The implementation was an afterthought because the use case still felt like one.

The lesson: if keeping two copies in sync is a manual job, they will drift. Clearer now than it was at the time.

Phase 2: teach the site to recognize agents

What if the site detected agent requests and served something cleaner than HTML? We built middleware that checks the User-Agent (ChatGPT, Claude, Cursor, Copilot, and others) and the Accept header. When either matches, we serve Markdown instead.

What we actually served was raw MDX from GitHub's API with a text/markdown content type. Technically Markdown-ish, practically Markdown with a pile of React components. We hit GitHub rate limits within hours, switched to pre-built local files, still MDX. Detection was solved, content was not.
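
The detection half is small enough to sketch. This is an illustration of the idea rather than our actual middleware: the function name `wants_markdown` and the exact pattern list are assumptions, and a real list would be longer.

```python
import re

# Hypothetical pattern list covering the agents named above; extend as needed.
AGENT_USER_AGENT = re.compile(r"chatgpt|claude|cursor|copilot", re.IGNORECASE)


def wants_markdown(headers: dict) -> bool:
    """Return True when either the Accept header or the User-Agent matches."""
    # HTTP header names are case-insensitive, so normalize the keys first.
    normalized = {name.lower(): value for name, value in headers.items()}
    accept = normalized.get("accept", "").lower()
    user_agent = normalized.get("user-agent", "")
    return "text/markdown" in accept or bool(AGENT_USER_AGENT.search(user_agent))
```

A browser sending `Accept: text/html` with a Mozilla User-Agent gets HTML; anything that asks for `text/markdown` or identifies as one of the listed agents gets Markdown.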

Phase 3: converting MDX to Markdown

I (okay, Claude) wrote a Node.js post-build script that converts MDX to Markdown and writes it to public/md/, which we serve via URL rewrites.

For example, <CodeTabs labels={["Node.js", "Python"]}> becomes labeled code blocks. <SharedContent> tags inline the referenced text directly. About 30 components handled, all from one file.

The processor builds ~1,400 files in a few seconds. Doc authors edit MDX as usual. No manual sync, no drift, no thought.
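
To give a feel for what one of those conversions involves, here is a rough sketch of the <CodeTabs> case. The real processor is a Node.js script; this version is in Python, uses regular expressions instead of a proper MDX parser, and the "Label:" output format is an assumption. `FENCE_MARK` is built at runtime only so the example can be shown inside a fenced block.

```python
import json
import re

FENCE_MARK = "`" * 3  # three backticks, built here to avoid nesting fences

# Matches <CodeTabs labels={[...]}> ... </CodeTabs> and captures labels and body.
CODE_TABS = re.compile(
    r"<CodeTabs labels=\{(\[.*?\])\}>\s*(.*?)\s*</CodeTabs>", re.DOTALL
)
# Matches one fenced code block, including its opening and closing fence.
FENCED_BLOCK = re.compile(FENCE_MARK + r".*?" + FENCE_MARK, re.DOTALL)


def convert_code_tabs(mdx: str) -> str:
    """Replace each <CodeTabs> wrapper with plain, labeled code blocks."""

    def replace(match: re.Match) -> str:
        labels = json.loads(match.group(1))  # assumes double-quoted labels
        blocks = FENCED_BLOCK.findall(match.group(2))
        labeled = [f"{label}:\n\n{block}" for label, block in zip(labels, blocks)]
        return "\n\n".join(labeled)

    return CODE_TABS.sub(replace, mdx)
```

Text outside the component passes through untouched, which is what lets a converter like this run over every page without special cases.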

Context matters too

Clean Markdown isn't enough. Agents need to know where they are and what to read next. So we wrap each page with a breadcrumb at the top and related docs at the bottom:

> This page location: Connect to Neon > Connection pooling
> Full Neon documentation index: https://neon.com/docs/llms.txt

...

## Related docs (Connect to Neon)
- [Connect to Neon](https://neon.com/docs/connect/connect-intro)
- [Choosing your connection method](https://neon.com/docs/connect/choose-connection)

...

Without it, an agent fetches one page and doesn't know what else is nearby.
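
A wrapper like that is only a few lines. The sketch below reproduces the format shown above; the function name and argument shapes are hypothetical, and using the first breadcrumb entry as the section name is an assumption.

```python
INDEX_URL = "https://neon.com/docs/llms.txt"  # the index URL from the example above


def wrap_with_context(body: str, breadcrumb: list, related: list) -> str:
    """Add a location header and a related-docs footer to one Markdown page.

    breadcrumb: section names from the top level down to this page.
    related: (title, url) pairs for nearby pages.
    """
    header = (
        f"> This page location: {' > '.join(breadcrumb)}\n"
        f"> Full Neon documentation index: {INDEX_URL}"
    )
    links = "\n".join(f"- [{title}]({url})" for title, url in related)
    # Assumption: the first breadcrumb entry names the section.
    footer = f"## Related docs ({breadcrumb[0]})\n{links}"
    return f"{header}\n\n{body.strip()}\n\n{footer}\n"
```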

What other sites are doing

Nikita (Neon's fearless leader) has a habit of pointing people back to first principles. It's why we tend to build small tools instead of guessing, even when the tool's whole point is to see how others are doing it. Ours, a scanner, probes doc sites and measures how they serve content to agents: same URL as HTML, with .md appended, Accept: text/markdown, discovery headers, plus variations. Findings across over 250 sites, mostly tech docs such as Vercel, Stripe, Mintlify, Sentry, and Google:

  • 53% serve Markdown by appending .md to the URL.
  • 41% honor content negotiation via Accept: text/markdown. The ones that do also tend to have llms.txt, discovery headers, and structured indexes. They've thought about agents. About 30% also accept text/plain.
  • llms.txt is common but placement varies. 93% of polled sites have one, and 58% also publish llms-full.txt with concatenated doc content. The standard says to place llms.txt at the root. In practice, sites put it at /docs/llms.txt, at the root, or both. Some have different content at each path, and some use sub-indexes (child llms.txt files within llms.txt).
  • 404 handling is mostly not content-type aware. Only 9% return Markdown for a 404 when Markdown was requested. The rest return HTML, and a handful return empty responses, even when the agent clearly asked for Markdown via .md or Accept: text/markdown. Of those 9%, most sites return 200 instead of 404 (we chose 404).
  • Discovery hints are rarely used, and the conventions aren't settled. Only 9% include a <link rel="alternate" type="text/markdown"> tag in the HTML head, a convention that emerged organically (ours did). The X-LLMs-Txt and Link: rel="llms-txt" headers Mintlify proposed have adoption almost entirely driven by Mintlify itself.
  • Headers are mixed and the impact is unclear. Only 3% set Vary: Accept on HTML (6% on Markdown). 27% set noindex on Markdown. We're still figuring out which of these actually help versus which are habit.

Doc-specific platforms like Mintlify, GitBook, and Fern score near 100% on most of these, because agent readiness is the point. Open-source frameworks are further behind and could use agent advocates. Tooling exists in the community but often sits unmaintained.
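
For the curious, the core of a scanner like this is just a list of request variants per page plus a rule for what counts as a hit. The sketch below is not our scanner's code: the function names are made up, it leaves out the actual HTTP calls, and treating only `text/markdown` as a hit is an assumption (some sites answer with `text/plain`).

```python
from urllib.parse import urlsplit, urlunsplit


def build_probes(doc_url: str) -> list:
    """Return the request variants to try for one documentation page."""
    parts = urlsplit(doc_url)
    md_path = parts.path.rstrip("/") + ".md"
    md_url = urlunsplit((parts.scheme, parts.netloc, md_path, parts.query, ""))
    return [
        {"name": "html", "url": doc_url, "headers": {"Accept": "text/html"}},
        {"name": "md_suffix", "url": md_url, "headers": {}},
        {"name": "accept_markdown", "url": doc_url,
         "headers": {"Accept": "text/markdown"}},
    ]


def served_markdown(status: int, content_type: str) -> bool:
    """Count a 200 response with a text/markdown media type as a hit."""
    media_type = content_type.split(";")[0].strip().lower()
    return status == 200 and media_type == "text/markdown"
```

Feeding each probe to any HTTP client and passing the status and Content-Type to `served_markdown` is enough to reproduce the first two percentages in the list for a given set of sites.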

A few more lessons

404s should be helpful and aware, not empty. Our 404s match the request: HTML for browsers, Markdown for agents, the latter returning links to the full index, the complete docs bundle, and the API reference. Idea stolen from a Vercel tweet and implemented immediately.
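
As a sketch of what "match the request" means in code: the function below always returns a 404 status but picks the body format from what was asked for. The function name, the wording of the body, and the shape of `help_links` are all hypothetical.

```python
import html


def not_found_response(path: str, markdown_requested: bool, help_links: list):
    """Build a 404 whose body matches the requested format.

    help_links: (title, url) pairs, for example the full docs index.
    Returns a (status, content_type, body) tuple.
    """
    if markdown_requested:
        lines = [f"# 404: {path} not found", "", "Try one of these instead:", ""]
        lines += [f"- [{title}]({url})" for title, url in help_links]
        return 404, "text/markdown; charset=utf-8", "\n".join(lines) + "\n"
    # Escape the path so it cannot inject markup into the HTML body.
    body = f"<h1>404</h1><p>{html.escape(path)} was not found.</p>"
    return 404, "text/html; charset=utf-8", body
```

Keeping the status at 404 in both branches is the choice described above; many of the sites we scanned return 200 with a Markdown body instead.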

Discovery has to be automatic, and responding to agents has to be too. Agents don't know to look for llms.txt or that appending .md works. Set discovery headers on every HTML response so they find out, and honor Accept: text/markdown when they do ask. Like children, they often ignore the reminders, but we do our best as parents.
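
Here is one way the discovery headers could be assembled for an HTML response. Treat it as a sketch under assumptions: the helper name is hypothetical, the exact set of headers we send is not spelled out in this post, and the `X-LLMs-Txt` value format in particular is a guess based on the conventions mentioned earlier.

```python
def discovery_headers(page_url: str, llms_txt_url: str) -> dict:
    """Headers that point an agent from an HTML page to Markdown and the index."""
    markdown_url = page_url.rstrip("/") + ".md"
    link_values = [
        # Header equivalent of <link rel="alternate" type="text/markdown">.
        f'<{markdown_url}>; rel="alternate"; type="text/markdown"',
        # The Link: rel="llms-txt" convention mentioned above.
        f'<{llms_txt_url}>; rel="llms-txt"',
    ]
    return {
        "Link": ", ".join(link_values),
        "X-LLMs-Txt": llms_txt_url,  # assumed value format
        "Vary": "Accept",  # the same URL can return HTML or Markdown
    }
```

Setting `Vary: Accept` matters once the same URL serves two formats, because a cache keyed only on the URL could otherwise hand Markdown to a browser or HTML to an agent.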

The index needs structure, not just a list. Our first llms.txt was a flat list of over 1,000 URLs. Way too much to parse before deciding what to read. We restructured it with sections and descriptions, sub-indexes for large areas, a "Common Queries" section at the top (pricing, connection methods and troubleshooting, API reference), and collapsed routes for large but useful content (changelog, Postgres tutorials). The primary index is now ~200 entries with sub-indexes for the rest.

Agents use HTTP clients, not browsers. Looking at User-Agent strings, we saw axios, got, node-fetch as often as named agents. Claude Code uses axios, Cursor uses got. The agent identity is in the tool, not always the header. We added those patterns to the detection list. A false positive (Markdown to a human) is harmless; a false negative (HTML to an agent) defeats the purpose. A real question: is changing content based on who's asking a form of cloaking?

What the system looks like now

Four layers:

  • Build time. The MDX processor converts source docs to Markdown. The index generator builds llms.txt, sub-indexes, and llms-full.txt (all docs concatenated).
  • URL rewrites. Appending .md to any doc URL serves its Markdown version from public/md/. Non-doc pages will follow.
  • Middleware. Detects agents via User-Agent and Accept headers. Serves Markdown transparently. Adds discovery headers to HTML responses.
  • Content. Every doc page gets navigation context. The index is hierarchical. 404s are helpful and content-type aware.

What we'd do differently

One URL, two ways to ask for Markdown. We built a parallel /llms/ namespace first. Eventually we moved to serving Markdown from the canonical URL via a .md suffix or an Accept: text/markdown header. That should have been the starting point.

Invest in analytics earlier. We added agent traffic tracking late. Having it from the start would have shown which pages agents request, which ones they 404 on, and how they navigate. That data would have shaped our system sooner.

Design the index first. The flat file list was an afterthought. Structuring it with sections, descriptions, and sub-indexes earlier would have made it more useful.

Build the scanner first. Studying other sites first would have saved us from reinventing patterns and surfaced cracks we didn't think of until later.

None of this was planned from the start. It came together one small change at a time.

What's next

Humans reach docs through agents, not just browsers. That's the new audience and it doesn't execute JavaScript or follow visual navigation. Agents want plain text, structured metadata, and machine-readable discovery. The tools aren't exotic: a remark pipeline, some middleware, a few HTTP headers, a config file. The hard part is recognizing that and choosing to serve them.

An agent can implement most of this for you. What it can't do is write good content without review.

Community tooling is catching up. The afdocs scorecard flagged a coverage issue in our llms.txt that we were briefly convinced wasn't our problem, but it was. The associated agent doc spec is also growing, turning ad-hoc conventions into something documented. The tools are new, the category is new, and everyone is figuring it out together.

On our list:

  • Focus on accuracy. Continue testing whether an agent can complete tasks using a given doc page, similar to agent skills testing. Goal: fewer mistake-then-fix cycles.
  • Offer interfaces built for agents. Like search APIs, and ways for them to send feedback when we get something wrong. Markdown is a human format agents happen to parse well, and we can do better than that.
  • Think more about agent skills. There's something wrong with committing .claude folders into every repo. Treating them like devDependencies feels saner, and we're watching how this evolves.
  • Continue integrating tools like afdocs. Discuss with maintainers and submit PRs to include more (optional) checks, such as 404 handling and headers.
  • But most importantly, what every doc site has tried to do since the dawn of time: write good, reliable content. Treat docs like code, like tests, like the source of truth.

None of this is magic. Just small, honest work that only matters if the content is worth reading.

Thanks

Thanks to Neon and Databricks for letting engineers experiment (and for the tokens), and to my docs-team colleagues Dan and Barry for keeping the real docs moving while I poked at this.

Give your Codex agent Neon superpowers

Andy Hattemer–Member of Product Staff

Apr 16, 2026

An official Neon plugin is now available in the OpenAI Codex marketplace. It connects Codex directly to your Neon databases through MCP, so you can provision and manage Postgres databases without leaving your workflow.

Once installed, Codex can interact with your Neon account, not just read static guidance about it. You can ask it to create a new project, spin up a branch for a feature, run a migration, validate a connection string, or query your schema. It understands Neon-specific concepts like branching and autoscaling, so you get steps that are actually correct for how Neon works.

What you can do with it

The plugin bundles three components:

  • Neon Postgres app — gives Codex MCP-backed tools to create and manage projects, branches, and databases, run SQL queries, and validate connections.
  • Neon Postgres skill — guides Codex through Neon-specific workflows: connection patterns, ORM setup, branching strategies, autoscaling, and Neon Auth.
  • Neon Postgres Egress Optimizer skill — helps diagnose and reduce data transfer costs when egress is higher than expected.

A few things that become straightforward once the plugin is connected: setting up a new Serverless Postgres database and getting a working connection string for your framework, creating an isolated branch before running a migration, or asking Codex to walk through reducing egress without digging through docs manually.

How to add the plugin

To get started, open the plugins menu in Codex, search for Neon, and click install. If you prefer the CLI, run codex, then /plugins to find and add it.

Once connected, you can manage your Neon databases directly from Codex. Ask it to pull your schema, insert rows, create projects, create branches, or run queries. The results show up right in the chat window.

Ship faster with Codex

Database provisioning, branching, migrations — these have always been necessary but rarely the interesting part of building. Giving Codex the tools to handle them closes the loop: the agent can now take a task from code to running database without handing off to you for the operational steps in between.

Try it today: open or download Codex and install the Neon plugin!

The first of several features that make compute restarts invisible.

Note: This is a cross-post of an engineering blog post that was originally published on Databricks. Neon and Databricks Lakebase both run on the same technology, and this engineering optimization benefits customers of both platforms.

Ensuring customer databases are always available is one of the most important things we do in Neon and Lakebase. We've designed the system with redundancy at every level, automatically failing over and recovering your database in the event of hardware or software failures.

In a large-scale system, such unplanned failures are a statistical expectation, but for an individual database they're not that frequent. There, planned maintenance tends to cause more workload disruption. After all, a typical database is patched more frequently than it experiences hardware failure.

Today, nearly every database provider operates with maintenance windows: scheduled periods where your database severs all active connections and gets updated and restarted in a process that can take anywhere from a few seconds to minutes. While Neon lets you schedule updates at a time that's optimal for you, it's still a brief interruption when it happens.

We think we can do better. This blog post is the first in a series on how we're leveraging the lakebase architecture with separation of compute and storage to eliminate the impact of planned maintenance entirely. Our goal: make version updates and security patches completely unnoticeable.

In this post, we'll cover prewarming: a technique that prevents any performance degradation that follows a database restart. In future posts, we'll discuss improvements to the failover process itself and additional optimizations that bring us closer to true zero-downtime patching.

The Problem with Cold Restarts

The challenge with restarting Postgres is that in-memory caches (specifically the buffer cache and local file cache) are lost. Even though the database is back online very quickly (1 second at P99), the workload may experience a slowdown in the first minutes after restart; we saw a ~70% reduction in pgbench TPS. This is due to a low cache hit ratio while data is read back from storage and the cache warms up. While this might seem like only a performance problem, it can be an availability issue if the slowdown is severe enough that the database cannot keep up with the workload and timeouts occur.

Techniques to address this exist in Postgres: pg_prewarm can be used to warm up buffer caches. However, this runs after a restart when the workload is already impacted. Streaming replication can be used to set up a replica, which can be prewarmed before failing over to it (promoting it to primary). However, this requires creating a full replica and carefully orchestrating the prewarming before failover.

Prewarming on Neon's lakebase Architecture

In the lakebase architecture, we combine stateless, elastic compute nodes with disaggregated, shared storage. The compute nodes employ local caches to deliver maximum performance without sacrificing serverless properties. While the cache faces the same cold-start issues outlined above, we have more options with the lakebase architecture.

Since Neon's Postgres compute replicas are stateless, we can spin them up and down on demand. We utilize this and combine it with automatic prewarming on planned restarts to minimize the performance impact on the workload. This is how it works:

  1. A new version of Neon's Postgres compute image becomes available. You receive a notification and can schedule the restart for a time that works for you.
  2. Shortly before the scheduled time, our control plane spins up a new Postgres compute in the background. You don't see it, and you're not billed for it. The current primary's workload is unaffected.
  3. A list of pages in the current primary's cache is sent to the new compute. The new compute loads those pages into cache from our shared storage tier without impacting the primary.
  4. The new compute subscribes to the WAL (write-ahead log) to keep its cache up to date. For efficiency, unlike a normal Postgres replica, it can ignore all WAL records that do not affect its cache. It gets the WAL from our Safekeepers, putting no additional load on the primary compute.
  5. When prewarming is complete, we quickly shut down the old primary, promote the new compute to primary, and switch it in. Promotion uses the standard pg_promote from OSS Postgres and does not restart the database server.
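
Steps 3 and 4 can be illustrated with a toy model. This is not Neon's implementation: the class, the dictionary standing in for shared storage, and WAL records modeled as (page_id, new_contents) pairs are all simplifications made up for this sketch. Real WAL records describe changes rather than whole pages.

```python
class PrewarmingCompute:
    """Toy stand-in for the new compute that is being prewarmed."""

    def __init__(self, storage: dict):
        self.storage = storage  # page_id -> contents, standing in for shared storage
        self.cache = {}         # local cache being warmed

    def prewarm(self, page_ids) -> None:
        """Step 3: load the pages the primary has cached, from shared storage."""
        for page_id in page_ids:
            self.cache[page_id] = self.storage[page_id]

    def apply_wal(self, record) -> bool:
        """Step 4: apply a WAL record only if it touches a cached page."""
        page_id, new_contents = record
        if page_id not in self.cache:
            return False  # does not affect the cache, so it can be ignored
        self.cache[page_id] = new_contents
        return True
```

The point of the model is the early return in `apply_wal`: because the new compute only has to keep its cache current, it can skip every record for a page it has not loaded, which a normal Postgres replica cannot do.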

With Neon's lakebase architecture, you get this at no additional cost, without paying for additional replicas. All planned restarts of read/write endpoints in all regions are now performed this way without you having to do anything. Soon we'll be extending it to read-only endpoints as well.

Results

To measure the impact of cold caches, we ran a 10 GB pgbench workload (scale factor 670) on a database while restarting it, first with prewarming enabled, then without prewarming. The first chart shows a read-only workload (pgbench "select only"), while the second shows a read-write workload (pgbench "simple update").

In both cases, we see that throughput recovers nearly instantly with prewarming. Without prewarming, recovery is much slower while the cold cache is warming up. The difference is starkest for the read-only workload because prewarming improves the cache hit ratio, which helps reads proportionally more than writes.

For most of 2025, AI coding agents got good at a specific thing: writing code. Give an agent a prompt, and it could scaffold an app, wire up an API, write migrations. But when the code was done, the agent stopped. Spinning up a real database, creating an account, getting credentials into the environment… that […]

“I’m genuinely surprised by how well it handles that scale. You can create tons of databases and they’re available immediately. You can branch out immediately. All of those things make it really nice for agent-managed infra.” Iman Radjavi, Co-founder, Specific.dev What Specific builds Specific (YC F25) is a cloud platform designed for coding agents. With […]

There are a few different reasons to hit the brakes on a Postgres query. Maybe it’s taking too long to finish. Maybe you realised you forgot to create an index that will make it orders of magnitude quicker. Maybe there’s some reason the results are no longer needed. Or maybe you, or your LLM buddy, […]

“The biggest strength of Neon is how it decouples storage and compute and makes them independently scalable. When an app isn’t being used, the compute node can be put in idle mode at extremely low cost, which lets us handle a wide range of scale and complexity without compromise.” (Nilesh Trivedi, co-founder and CTO at […]

From the start, the team at Encore has been focused on solving a simple problem: shipping production infrastructure shouldn’t require a dedicated platform engineering team. They set out to make deploying real applications feel simple without abstracting away control; in Encore, devs can define infrastructure directly in Go or TypeScript, and the platform turns that […]

Every AI lab is shipping research agents. OpenAI’s Deep Research, Perplexity, and Gemini’s research mode. These products are not simple RAG pipelines. Recent papers like DeepResearcher and Step-DeepResearch formalize what makes them work: a recursive loop of planning, searching, learning, and reflecting, where the agent decides when to go deeper and when to stop. The […]

Cursor just launched plugins, making it easier than ever to give Cursor structured access to external tools and infrastructure. Neon is part of the initial launch set: you can install the Neon plugin today from the Cursor Marketplace to give Cursor live access to your Neon organization along with the knowledge it needs to be […]

A few weeks ago, Vercel released add-skill (now npx skills), a CLI for installing agent skills across different coding agents and editors like Claude Code, Cursor, and VS Code. It solves a very real problem: each tool looks for agent skills in a different place, which makes setup repetitive and documentation painful to maintain. The […]

“We were getting ready to hire dedicated engineers just to manage and scale Zite Database. With Neon, we didn’t need to do that – we were able to give every end user their own database, including on the free plan” (Dominic Whyte, Co-founder at Zite) Zite is an AI-native app builder for the kind of […]
