One byte past the end: a UTF-8 over-read in nginx's charset filter
Audit passes over nginx mostly turn up noise. Every so often something
real falls out. This one was hard to believe at first — the bug is old,
the code is short, and the conditions to hit it are narrow, which is
probably why nobody had tripped over it before. But it's a real
out-of-bounds read, and with the right config a worker process ends up
reading a byte it has no business touching.
The fix landed in mainline as commit `f3cc87321`, "Charset: fix buffer
over-read in recode_from_utf8()." What follows is how the bug works and
how it was found.
What the code is doing
If you point `source_charset` at utf-8 and `charset` at something
single-byte, nginx transcodes response bodies on the fly. The function
doing the work is `ngx_http_charset_recode_from_utf8` in
`src/http/modules/ngx_http_charset_filter_module.c`. It walks the input
buffer a byte at a time, passes ASCII straight through, and when it
hits a byte ≥ 0x80 it calls `ngx_utf8_decode` to pull a full code
point, then looks that code point up in a page table built from the
user's `charset_map`.
The annoying case is when a multi-byte UTF-8 character is split across
buffers. E2 98 83 is a snowman (U+2603). If you get E2 in one buffer
and 98 83 in the next, the filter has to remember the E2. It stashes
partial bytes in `ctx->saved[4]` with `ctx->saved_len` tracking how
many bytes are in there. Next time the filter is called with saved
bytes present, it takes a separate code path — the continuation path —
to finish out the pending character before resuming normal work.
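The general shape of that reassembly — stash, wait, finish — can be sketched in a few lines of Python (a toy model, not nginx's code; `utf8_expected_len` and `feed` are invented names):

```python
# Toy model of cross-buffer UTF-8 reassembly: stash the leading bytes
# of a split character, finish it when more input arrives.
NGX_UTF_LEN = 4   # mirrors nginx's 4-byte maximum sequence length

def utf8_expected_len(lead):
    """Total bytes a UTF-8 sequence needs, judged from its lead byte."""
    if lead < 0x80:
        return 1
    if lead >= 0xf0:
        return 4
    if lead >= 0xe0:
        return 3
    return 2

def feed(saved, chunk):
    """Append new bytes to the pending character; return (char, saved)."""
    saved = saved + chunk
    need = utf8_expected_len(saved[0])
    if len(saved) < need:
        return None, saved        # still incomplete, keep waiting
    return saved[:need].decode("utf-8"), b""

# U+2603 (snowman) arriving one byte per buffer:
saved = b""
for buf in (b"\xe2", b"\x98", b"\x83"):
    ch, saved = feed(saved, buf)
print(ch)  # -> ☃
```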
That continuation path is where the bugs live. Two of them, both
revolving around the same loop variable.
The pre-fix continuation block
Here's what the code looked like before the patch, stripped of
scaffolding:
```c
    p = src;

    for (i = ctx->saved_len; i < NGX_UTF_LEN; i++) {
        ctx->saved[i] = *p++;

        if (p == buf->last) {
            break;
        }
    }

    saved = ctx->saved;
    n = ngx_utf8_decode(&saved, i);

    ...

    } else if (n == 0xfffffffe) {
        /* incomplete UTF-8 symbol */

        if (i < NGX_UTF_LEN) {
            ...
            ngx_memcpy(&ctx->saved[ctx->saved_len], src, i);
            ctx->saved_len += i;

            return out;
        }
    }
```

Read the loop carefully. After writing a byte into `ctx->saved[i]`,
it checks whether `p` has walked off the end of the input buffer, and
breaks if so. When it breaks, `i` is the **index of the byte just
written** — not the count of bytes added. If `saved_len` was 2 coming
in, and the new buffer had exactly one byte in it, the loop writes
that byte at `ctx->saved[2]` and breaks with `i == 2`.
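A Python re-enactment of the loop's exit state, using the example values from the text (illustrative, not the C):

```python
# Re-enactment of the pre-fix loop: saved_len bytes are already staged,
# and the new input buffer holds exactly one byte.
NGX_UTF_LEN = 4

saved = bytearray(NGX_UTF_LEN)
saved[0:2] = b"\xe2\x98"       # two bytes of U+2603 already saved
saved_len = 2
buf = b"\x83"                  # the final byte arrives alone

p = 0
for i in range(saved_len, NGX_UTF_LEN):
    saved[i] = buf[p]
    p += 1
    if p == len(buf):          # the `p == buf->last` check
        break

bytes_staged = i - saved_len + 1   # the count the code *should* track
print(i, bytes_staged)             # -> 2 1: index and count differ by one
```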
Now look at what happens to `i` after that: it goes straight into
`ngx_utf8_decode(&saved, i)`. The second argument to
`ngx_utf8_decode` is a byte count — the function uses it to check
whether enough bytes are available after the leading byte
(`if (n - 1 < len) return 0xfffffffe;`). Passing the index instead of
the count means we tell the decoder we have one fewer byte than we
actually do. For our E2 98 83 example we tell it `n == 2` when
three bytes are sitting in the buffer. It looks at the leading E2,
sees it needs two more bytes, concludes from the too-small `n` that
they aren't there yet, and returns 0xfffffffe — incomplete.
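The length gate can be modeled in a few lines of Python — only the check quoted above, not the real decoder, which also validates the continuation bytes themselves:

```python
# Model of ngx_utf8_decode's length gate only: given the lead byte and
# a claimed byte count n, decide complete vs incomplete.
INCOMPLETE = 0xfffffffe

def decode_len_check(first_byte, n):
    if first_byte >= 0xf0:
        trailing = 3           # 4-byte sequence: 3 continuation bytes
    elif first_byte >= 0xe0:
        trailing = 2           # 3-byte sequence (our E2 case)
    else:
        trailing = 1
    if n - 1 < trailing:       # the quoted guard
        return INCOMPLETE
    return 0                   # enough bytes; decoding would proceed

# Three bytes of U+2603 are staged, but the *index* is passed as the count:
print(hex(decode_len_check(0xe2, 2)))  # -> 0xfffffffe (wrongly incomplete)
print(hex(decode_len_check(0xe2, 3)))  # -> 0x0 (would decode fine)
```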
Except it isn't incomplete. The character is *right there*. The
decoder was just lied to about how many bytes were available.
The function then takes the "incomplete" branch, and hits bug
number two:
```c
    ngx_memcpy(&ctx->saved[ctx->saved_len], src, i);
```
`i` again. This memcpy is trying to stage the new bytes from the
current input buffer into `ctx->saved`. The correct number of new
bytes is `i - ctx->saved_len + 1` (since `i` is an index and the
loop started at `ctx->saved_len`). The code passes `i`, which is
strictly more. For `saved_len == 2, i == 2`, that's 2 bytes copied
from a pointer with exactly 1 byte of valid input behind it.
That's the out-of-bounds read. One byte past `buf->last`. Whatever
the pool allocator left there — another request's data, a pool header,
a guard byte, whatever — gets copied into `ctx->saved[3]`.
`ctx->saved_len += i` then takes `saved_len` to 4, which for a 3-byte
character is structurally impossible and is the cleanest fingerprint
of the bug.
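The arithmetic, with the example values plugged in (plain Python, nothing nginx-specific):

```python
# Staging arithmetic on the "incomplete" branch, example values from
# the text: saved_len == 2 coming in, one new byte in the input buffer.
saved_len = 2
i = 2                 # loop broke on the *index* of the last byte written
valid_input = 1       # bytes actually present in this buffer

correct_copy = i - saved_len + 1   # 1: the one byte that really arrived
buggy_copy = i                     # 2: what the code passes to ngx_memcpy

over_read = buggy_copy - valid_input   # bytes read past buf->last
new_saved_len = saved_len + buggy_copy # the impossible saved_len
print(over_read, new_saved_len)        # -> 1 4
```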
Why this has been sitting there
The honest answer is that nobody has really been looking at this
function, and the reason is that the bug has trouble firing in
production.
You need:
1. A `source_charset utf-8` / `charset <single-byte>` pair with a
real `charset_map ... utf-8 { ... }`.
2. Actual non-ASCII UTF-8 bytes in the response body.
3. That UTF-8 character delivered to the filter across **three or
more** separate input buffers. Two won't do it — the loop writes
everything in one go and `i` ends up at `NGX_UTF_LEN - 1`, which
skips the `i < NGX_UTF_LEN` branch entirely. You need a 3-byte
(or 4-byte) character arriving one byte at a time.
4. Something in the request path that prevents coalescing. In
practice this means `proxy_buffering off` with an upstream that
trickles bytes.
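Put together, a trigger config looks roughly like this — a hypothetical sketch, with made-up ports, names, and map entry rather than the reproducer's exact file:

```nginx
# Illustrative only -- the port, charset name and map entry are made up.
http {
    charset_map test-charset utf-8 {
        C0 D090;   # any entry works; the page table just has to exist
    }

    server {
        listen 8080;

        location / {
            proxy_pass http://127.0.0.1:9000;  # the drip-feed upstream
            proxy_buffering off;               # no coalescing of buffers
            source_charset utf-8;
            charset test-charset;
        }
    }
}
```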
Most deployments that use the charset filter at all are using it with
cached or fully-buffered responses, so the entire body arrives as one
buffer and the saved_len path is never entered. The deployments most
likely to hit it are the ones doing `proxy_buffering off` streaming
from a slow origin — SSE, long-poll, any chunked-transfer setup where
the upstream is on the other side of a saturated link. That's not a
large population, but it is non-empty and it's growing.
So the window is narrow, but it's not closed, and "this config is
uncommon" is not really a mitigation.
What an attacker actually gets
Worth being careful here, because the instinct with "out-of-bounds read
in a widely-deployed web server" is to reach straight for the CVE badge
and start talking about Heartbleed. This is not that.
What you get is:
- **Exactly one byte** of over-read per triggering sequence.
- The byte read is whatever sits one past `buf->last` in whatever
pool the filter's input buffer came from. That's usually
request-scoped memory — the same connection's next allocation, or
pool header metadata. It doesn't look like a general route to
cross-request memory, because nginx pools are per-request.
- The byte doesn't get sent over the wire directly. It gets copied
into `ctx->saved[3]`. From there it may or may not influence the
output, depending on what path the rest of the function takes.
In the buggy version, `ctx->saved_len` ends up at 4 after this, and
later in the function there's a computation that takes `src` back to
`buf->pos + ctx->saved_len - NGX_UTF_LEN`. With `saved_len == 4` and
`NGX_UTF_LEN == 4`, that's `buf->pos` exactly — which is the
*correct* destination for the non-buggy case where `saved_len` is
really 3. So the downstream recode loop starts one byte too early
and emits one extra byte of output compared to the all-at-once path.
That extra byte is the corruption. Whether it's the over-read byte
or not depends on the charset table.
So the realistic impact is: a one-byte memory-safety hole that could
leak a byte of pool-adjacent memory into an output stream if the
configuration and timing align. Plus the usual ambient risk of any
out-of-bounds read — tripping ASan in fuzzing, surviving until
someone else finds a way to amplify it, reminding us that the next
subtle bug in the same function might have less polite consequences.
Probably doesn't need a CVE. Does need a fix, which is why one went
upstream.
Building a reproducer
Two things had to be proven before this was worth believing:
1. That the over-read actually happens on a real build.
2. That the observable output differs between the correct and
buggy code paths — i.e. that the bug escapes the function and is
not just an internal mis-accounting that the rest of the code
happens to paper over.
The reproducer is a small Python script — `t/charset_utf8_overread.py`
on the fix branch — that spins up:
- A fake upstream that, for `GET /slow`, sends the response body one
byte every 150ms with `TCP_NODELAY` set. That's the magic
ingredient: each byte becomes a separate recv() in the worker, and
therefore a separate input buffer in the filter chain.
- An nginx config with `proxy_buffering off`, `source_charset utf-8`,
and a `charset_map test-charset utf-8 { C0 D090; }`. The entry
itself doesn't matter; what matters is that `utf-8` is the
**second** argument to `charset_map`, which is what makes nginx
build the page-table structure the from_utf8 code path expects.
(Put utf-8 first and the table is laid out as a flat 256-byte array;
the from_utf8 path then dereferences it as pointers and the worker
crashes on startup before the real bug even runs. That dead end ate
an embarrassing amount of time before the config parser made it
obvious.)
- A response body of `AAAA\xe2\x98\x83BBBB` — plain ASCII, a snowman,
more plain ASCII. The snowman has no map entry, so on the correct
path it's emitted as the HTML numeric entity `&#9731;`.
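That entity is easy to sanity-check — one line of plain Python, nothing nginx-specific, derives it from the code point:

```python
# The numeric character reference for U+2603, as the correct path
# should emit it for a code point with no charset_map entry.
snowman = "\N{SNOWMAN}"
entity = "&#%d;" % ord(snowman)
print(entity)  # -> &#9731;
```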
The script makes two requests: `/fast` (body in a single sendall)
and `/slow` (the drip-feed). On a correct build both return identical
bodies. On the buggy build they don't: `/slow` comes back with an
extra byte, and if nginx was built with `--with-debug`, the error log
contains a `http charset utf saved: 4` line that is the signature of
bug #1. `saved_len == 4` for a 3-byte character is not reachable by
correct code.
The reproducer is also linked as a gist on the PR. It runs in under
three seconds and prints a clear pass/fail.
The fix
The diff is small enough to fit on a postcard:
```c
    len = ngx_min(NGX_UTF_LEN - ctx->saved_len, (size_t) (buf->last - src));

    ngx_memcpy(&ctx->saved[ctx->saved_len], src, len);
    len += ctx->saved_len;

    saved = ctx->saved;
    n = ngx_utf8_decode(&saved, len);

    ...

    } else if (n == 0xfffffffe) {
        /* incomplete UTF-8 symbol */

        if (len < NGX_UTF_LEN) {
            ...
            ctx->saved_len = len;
            return out;
        }
    }
```
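Run the earlier three-buffer scenario through the fixed arithmetic — here as a Python stand-in for the C above — and the accounting comes out right:

```python
# The fixed staging logic, re-run on the earlier scenario:
# two bytes saved, one byte in the new input buffer.
NGX_UTF_LEN = 4

saved = bytearray(b"\xe2\x98\x00\x00")
saved_len = 2
buf = b"\x83"

# len = ngx_min(NGX_UTF_LEN - ctx->saved_len, buf->last - src)
n_new = min(NGX_UTF_LEN - saved_len, len(buf))
saved[saved_len:saved_len + n_new] = buf[:n_new]
length = n_new + saved_len     # a count from the start, never an index

print(length, bytes(saved[:length]))   # -> 3 b'\xe2\x98\x83'
```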
Three things changed.
The byte-at-a-time loop is gone. In its place, `ngx_min` works out
how many new bytes are available and how many we have room for, and
a single `ngx_memcpy` moves them. The compiler can fold that into a
register move for the small sizes we're dealing with; the old loop
couldn't. In a cold path that doesn't matter much, but it does no
harm.
`len` is now a byte count from the start, and it's the only variable
doing that job. No more confusion between "last index written" and
"bytes staged." The decoder gets `len`, the "is this complete" check
compares `len` against `NGX_UTF_LEN`, and the commit writes
`ctx->saved_len = len`. One variable, one meaning.
The second `ngx_memcpy` — the one that was staging new bytes on
the incomplete branch — is gone. It was always redundant; the
bytes were already in `ctx->saved` from the top-of-block copy. The
bug made it look necessary because the bug made the first copy
lie about how many bytes were there.
Net: about twenty lines out, fourteen in, one local variable
deleted, one memcpy removed.
Review feedback worth remembering
The first version of the patch used an explicit pointer cursor — a
`u_char *q` walking `ctx->saved` — and two memcpy calls. Correct, but
it looked like someone had sanded the edges off the original loop
rather than replaced it.
Sergey (pluknet) pushed back on exactly that in review. Two points
worth keeping:
First, an explicit byte-copy loop does not fold into memcpy cleanly.
Even the re-written loop cost roughly 1.5k instructions in the
non-inlined function compared to the `ngx_min` + `memcpy` form. It's
a cold function and nobody cares in a microbenchmark, but the
principle generalises — byte-at-a-time loops are almost always wrong
in this codebase, and "almost always" eventually collides with a hot
path.
Second, if the function already has a `size_t` local that means the
same thing as the pointer arithmetic being introduced, use the
`size_t`. Inventing a `q` just to write `q - ctx->saved` later is
duplicate state, and in this function it's the kind of duplicate
state that caused the original bug.
Both are obvious in retrospect. They are less obvious when the
first instinct in front of a bug is "make the loop correct" rather
than "delete the loop."
Takeaways
Two things, one technical and one about how to look.
The technical one: off-by-ones hide best when the variable has two
valid readings. Here, `i` was both "index just written to" and
"number of bytes staged." The two readings differ by exactly one.
For most inputs the paths where they'd diverge are never reached;
for a few narrow configs they are reached, and the divergence is
the bug. When a variable in a function seems to play two roles,
that's worth about fifteen minutes.
The other one: non-buffered streaming is where quiet bugs wake up.
Buffering coalesces input and hides buffer-boundary assumptions.
`proxy_buffering off` strips that away, and a fair amount of code in
web servers — transcoders, gzip, chunked decoders, TLS record
handlers — was written assuming the coalescing would be there. Any
code that reassembles something across buffer boundaries deserves a
pass against a drip-feed upstream at least once. Ten minutes of
Python, three seconds of test, and sometimes it turns up something
worth writing up.
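A drip-feed harness costs almost nothing to write. A minimal sketch — using a socketpair as a stand-in for the TCP upstream, with invented names — that delivers a body one byte per write:

```python
import socket
import threading
import time

BODY = b"AAAA\xe2\x98\x83BBBB"   # ASCII, a split-prone snowman, ASCII

def drip_feed(conn, body, delay=0.01):
    """Send the body one byte per write with a pause between writes, so
    each byte tends to land as its own recv() on the other side (on a
    real TCP socket you'd also set TCP_NODELAY to defeat batching)."""
    for i in range(len(body)):
        conn.sendall(body[i:i + 1])
        time.sleep(delay)
    conn.close()

server, client = socket.socketpair()
t = threading.Thread(target=drip_feed, args=(server, BODY))
t.start()

chunks = []
while True:
    data = client.recv(4096)
    if not data:
        break
    chunks.append(data)
t.join()

received = b"".join(chunks)
print(received == BODY)   # -> True: reassembly must be order-preserving
```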
*Bug: buffer over-read in `ngx_http_charset_recode_from_utf8()`.*
*File: `src/http/modules/ngx_http_charset_filter_module.c`.*
*Fix: commit `f3cc87321` upstream. PR nginx/nginx#1261.*
*Reproducer: `t/charset_utf8_overread.py`.*
*F5 advisory: K000161028.*
Labels: charset filter, CVE-2026-42934, nginx
