Middleware is an Onion
A Starlette middleware is an ASGI app wrapping an ASGI app. A response is an ASGI app too. BaseHTTPMiddleware bridges them with a task group and a memory stream — convenient, and the exact place a streaming disconnect goes to die.
Part 3 called the request stack "four layers, each adding opinions." This is the post that opens the layers and reads them. By the end, the leak from the field report isn't a surprise — it's the only thing that could have happened.
Everything here is built on two facts. From Part 3: an ASGI app is a callable async def app(scope, receive, send), and HTTP is three message types flowing over receive and send. From Part 4: async doesn't mean "yields," anyio cancels lexical scopes by deferring delivery to the next checkpoint, and a task group's cost is paid at the join. Keep both within reach; this post spends them continuously.
§1 An app wrapping an app
The ASGI signature is recursive in the most useful way: anything with the shape (scope, receive, send) can stand in for anything else with that shape. A middleware is just an app that holds a reference to the next app and calls it. The minimal pure-ASGI middleware is six lines and adds nothing:

    class Passthrough:
        def __init__(self, app):
            self.app = app  # the next layer in

        async def __call__(self, scope, receive, send):
            await self.app(scope, receive, send)  # hand the same three down

To do something, a middleware wraps one of the three before passing it down. Wrap send to observe or rewrite responses; wrap receive to observe or inject requests; read scope to route. Here is a real one — request timing, in pure ASGI:

    import time

    # record_status and observe_duration are metrics hooks assumed to be defined elsewhere.

    class TimingMiddleware:
        def __init__(self, app):
            self.app = app

        async def __call__(self, scope, receive, send):
            if scope["type"] != "http":
                await self.app(scope, receive, send)
                return
            start = time.monotonic()

            async def send_wrapper(message):
                if message["type"] == "http.response.start":
                    record_status(message["status"])
                await send(message)  # straight to the wire

            try:
                await self.app(scope, receive, send_wrapper)
            finally:
                observe_duration(time.monotonic() - start)

Pure ASGI: send_wrapper forwards each message straight to the real send. No buffering, no second task. The bytes the app emits go to the wire untouched.
Notice what this does not do: it does not collect the response, does not spawn a task, does not interpose a queue. Each message the downstream app sends passes through send_wrapper and out to the real send in the same call, on the same task. When the client vanishes and the real send raises OSError, it raises straight up through self.app — the app finds out immediately. Hold that thought; it's the whole contrast with §3.
Each middleware is an ASGI app holding the next. scope/receive/send thread inward to the handler; response messages thread back out. A pure-ASGI layer passes them through on the same task.
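A minimal sketch of that propagation, using only the standard library. The fake send, the endless streaming_app, and the state key it writes are hypothetical stand-ins, not Starlette or uvicorn code; the sketch assumes a server that raises OSError from send once the client is gone.

```python
import asyncio


class Passthrough:
    """Pure-ASGI layer: forwards scope, receive and send unchanged."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        await self.app(scope, receive, send)


def make_flaky_send(fail_after):
    """Hypothetical server-side send: raises OSError once the client is 'gone'."""
    sent = []

    async def send(message):
        if len(sent) >= fail_after:
            raise OSError("client disconnected")
        sent.append(message)

    return send, sent


async def streaming_app(scope, receive, send):
    """Hypothetical endless stream that records why it stopped."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    try:
        while True:
            await send({"type": "http.response.body", "body": b"tick", "more_body": True})
    except OSError:
        scope["state"]["stopped_by"] = "OSError"  # the app itself saw the disconnect


async def receive():
    return {"type": "http.request", "body": b"", "more_body": False}


async def main():
    scope = {"type": "http", "state": {}}
    send, sent = make_flaky_send(fail_after=3)
    # Two pass-through layers sit between the server's send and the app.
    await Passthrough(Passthrough(streaming_app))(scope, receive, send)
    return scope["state"]["stopped_by"], len(sent)


print(asyncio.run(main()))
```

The exception crosses both layers without either of them doing anything about it, because there is only one task and one call stack to unwind.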
§2 A response is an app too
The recursion goes one layer further than most people notice. When your handler returns a Response, the framework doesn't read its bytes. It calls it — because a Starlette Response is itself an ASGI app. Real source:

    # starlette/responses.py: Response.__call__
    async def __call__(self, scope, receive, send) -> None:
        await send({"type": "http.response.start", "status": self.status_code, ...})
        await send({"type": "http.response.body", "body": self.body})

So the "return a response" model and the "be an ASGI app" model are the same model. A route handler is an app whose body produces a Response, which is an app that emits the messages. This is why the SSE endpoints in the field report can hand back either a StreamingResponse subclass or a bare async generator — both bottom out in something that gets __call__ed with (scope, receive, send). And it's why a response can run its own disconnect listener on receive: it has one, because it's an app.
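To make the equivalence concrete, here is a small standard-library sketch. TinyResponse, handler, and handler_app are hypothetical names invented for illustration; they mirror the shape of the idea, not Starlette's actual classes.

```python
import asyncio


class TinyResponse:
    """Hypothetical stand-in for a framework response: it is itself an ASGI app."""

    def __init__(self, text, status_code=200):
        self.body = text.encode("utf-8")
        self.status_code = status_code

    async def __call__(self, scope, receive, send):
        await send({
            "type": "http.response.start",
            "status": self.status_code,
            "headers": [(b"content-length", str(len(self.body)).encode())],
        })
        await send({"type": "http.response.body", "body": self.body})


def handler(scope):
    """A 'return a response' style handler."""
    return TinyResponse(f"hello {scope['path']}")


async def handler_app(scope, receive, send):
    """The framework's side: build the response, then call it as an app."""
    response = handler(scope)
    await response(scope, receive, send)


async def run(app, path):
    """Drive any ASGI app once and collect what it sends."""
    messages = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        messages.append(message)

    await app({"type": "http", "path": path}, receive, send)
    return messages


messages = asyncio.run(run(handler_app, "/sse"))
print([m["type"] for m in messages])
```

The run helper cannot tell whether it was handed a handler wrapper or a bare response; TinyResponse("x") passed directly would work just as well, which is the point.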
Which raises the question the rest of the post answers: if both middleware and responses are apps that get the real receive and send, where does the clean pass-through of §1 break? Answer: the moment a middleware wants to treat the response as a Request/Response pair instead of as a stream of messages. That's the convenience BaseHTTPMiddleware sells.
§3 The convenience that costs
BaseHTTPMiddleware exists because pure ASGI is tedious for the common case. You usually want to write async def dispatch(request, call_next): response = await call_next(request); ...; return response — a function that takes a Request object and returns a Response object, like a tiny handler. But the ASGI app underneath doesn't deal in objects; it deals in message callbacks. To offer the object API, BaseHTTPMiddleware has to adapt the callback world into the object world. That adapter is the whole problem.
The adapter runs the downstream app in a separate task, captures the messages it sends into a memory object stream (Part 4 §7), and rebuilds them into a Response on the other side. Real source, condensed to the spine:

    # starlette/middleware/base.py: BaseHTTPMiddleware.__call__ (spine)
    send_stream, recv_stream = anyio.create_memory_object_stream()  # buffer size 0

    async def coro():
        with send_stream:
            await self.app(scope, receive_or_disconnect, send_no_error)  # app's send → stream

    async with anyio.create_task_group() as task_group:
        task_group.start_soon(coro)  # downstream app runs HERE
        message = await recv_stream.receive()  # middleware reads it back out
        ...
        response = _StreamingResponse(status=message["status"], content=body_stream())
        await response(scope, wrapped_receive, send)  # re-emitted to the REAL send

The downstream app no longer talks to the wire. Its send writes into a memory stream; a second response object, driven by the middleware, reads that stream and writes to the real send.
Trace one response chunk. The handler calls its send — which is send_no_error — which does await send_stream.send(message). The memory stream has buffer size zero (Part 4 §7: a rendezvous), so that send blocks until the middleware's body_stream calls recv_stream.receive() and takes it. The middleware, wrapped in its own _StreamingResponse, then writes the chunk to the real send. Every byte the app produces crosses a task boundary and a rendezvous before it reaches uvicorn.
The downstream app runs in a child task; its send writes into a zero-buffer memory stream. The middleware (host task) reads the stream and re-emits through a second response to the real send. Two tasks, one rendezvous, between the app and the wire.
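The two-task shape can be sketched with the standard library alone. This is a toy model, not Starlette's implementation: asyncio has no zero-buffer rendezvous stream, so a one-slot asyncio.Queue stands in for the memory stream, and bridge, downstream_app, and the DONE sentinel are hypothetical names.

```python
import asyncio

DONE = object()  # sentinel: the downstream app has finished sending


async def downstream_app(scope, receive, send):
    """Hypothetical handler: one start message, two body chunks."""
    scope["app_task"] = asyncio.current_task()
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"a", "more_body": True})
    await send({"type": "http.response.body", "body": b"b", "more_body": False})


async def bridge(app, scope, receive, real_send):
    """Run app in a child task; re-emit its messages from the host task."""
    stream = asyncio.Queue(maxsize=1)  # assumption: stdlib stand-in for the memory stream

    async def run_app():
        try:
            await app(scope, receive, stream.put)  # the app's send is the stream
        finally:
            await stream.put(DONE)

    child = asyncio.create_task(run_app())
    while (message := await stream.get()) is not DONE:
        await real_send(message)  # only the host task touches the wire
    await child


async def main():
    wire, scope = [], {"type": "http"}

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def real_send(message):
        scope["wire_task"] = asyncio.current_task()
        wire.append(message.get("body", b""))

    await bridge(downstream_app, scope, receive, real_send)
    # Second value: did the app and the wire write share a task?
    return b"".join(wire), scope["app_task"] is scope["wire_task"]


print(asyncio.run(main()))
```

The bytes arrive intact, but the task that produced them is not the task that wrote them, so an exception raised by real_send has no call stack leading back into downstream_app.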
This works. It is also, for a streaming response, a machine for losing disconnects — and to see why, we read the one function that wraps receive.
§4 receive_or_disconnect, decoded
The downstream app's receive is not the real one either. It's receive_or_disconnect, and with Part 4 in hand it reads cleanly now. Real source:

    # starlette/middleware/base.py
    async def receive_or_disconnect():
        if response_sent.is_set():
            return {"type": "http.disconnect"}

        async with anyio.create_task_group() as task_group:

            async def wrap(func):
                result = await func()
                task_group.cancel_scope.cancel()  # first one home cancels the other
                return result

            task_group.start_soon(wrap, response_sent.wait)  # sibling A: "response done?"
            message = await wrap(wrapped_receive)  # sibling B: real receive

        return message

It races two waiters in a task group. Sibling A waits on response_sent; sibling B waits on the real receive. Whichever returns first calls cancel_scope.cancel() to tear the other down. The intent is reasonable: stop waiting for input the moment the response is finished.
But recall Part 4 §6: a task group cannot exit until its children join, and joining a live sibling forces __aexit__ to await. So receive_or_disconnect always suspends at least once before returning — there is always a sibling to wind down — even when wrapped_receive had a message ready and returned synchronously. That forced suspension is the checkpoint that destroys the pre-armed peek from Part 4 §5.
Walk it through. Upstream, some code calls request.is_disconnected() — the pre-armed cancel scope, betting that await self._receive() returns without suspending. But self._receive() is this receive_or_disconnect. It spawns a sibling, and its task-group join suspends. The pre-armed cancellation, deferred by call_soon (Part 4 §4), lands precisely on that join. is_disconnected() catches the cancellation and returns False — every time, even with http.disconnect sitting in the real queue. The peek is structurally blind behind this middleware.
Wrapping receive in a task group turns a synchronous read into a guaranteed suspension. The cheap peek upstream never had a chance.
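The forced suspension can be observed directly. The sketch below is a plain-asyncio model, not anyio: a task plus an explicit join stands in for the task group, and a call_soon callback stands in for the deferred cancellation, firing only if the awaited read actually yields to the event loop. All function names are hypothetical.

```python
import asyncio


async def ready_receive():
    """Hypothetical receive with a message already queued: returns without suspending."""
    return {"type": "http.disconnect"}


async def racing_receive():
    """Same read, but raced against a sibling that must be joined afterwards."""
    never_set = asyncio.Event()
    sibling = asyncio.create_task(never_set.wait())  # stands in for response_sent.wait
    message = await ready_receive()  # still returns immediately
    sibling.cancel()  # first one home cancels the other
    await asyncio.gather(sibling, return_exceptions=True)  # the join: a real suspension
    return message


async def yielded_to_loop(receive):
    """Report whether awaiting receive() let the event loop run other callbacks."""
    fired = []
    handle = asyncio.get_running_loop().call_soon(fired.append, True)
    message = await receive()
    handle.cancel()  # no-op if the callback already ran
    return message["type"], bool(fired)


async def main():
    direct = await yielded_to_loop(ready_receive)
    raced = await yielded_to_loop(racing_receive)
    return direct, raced


print(asyncio.run(main()))
```

Both calls return the same message. Only the raced one gives the loop a turn before returning, and that turn is where a pre-armed cancellation would land.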
§5 Where the onion leaks
Now combine §3 and §4. In a streaming response behind BaseHTTPMiddleware, the disconnect signal has to travel a path the bridge has quietly cut in two places.
On the send side. When the client disconnects mid-stream, the real send raises OSError (the ASGI ≥ 2.4 convention — §6). But the real send is now being driven by the middleware's own _StreamingResponse on the host task, not by the downstream app. The OSError tears down the middleware's response. The downstream app — the actual SSE handler — is in a different task, blocked on send_stream.send(chunk) waiting for a rendezvous that will now never come, or looping on its own queue. It learns of the disconnect only indirectly: its next send_no_error sees a BrokenResourceError and silently returns (real source, §3), or its next receive_or_disconnect returns http.disconnect because response_sent got set. An idle SSE handler that is doing neither — just polling its message queue on a one-second timeout — sits there indefinitely.
On the receive side. The handler's defensive move would be to poll is_disconnected(). §4 just showed that's structurally blind behind this middleware. So the one mechanism that could have rescued an idle handler is disabled by the same construction that buffers the response.
That is the field report's leak, stated from the middleware's side rather than the handler's. The handler doesn't notice the client is gone, because every channel by which it could have noticed runs through a bridge that either swallows the signal or defers it past the point the handler is watching.
BaseHTTPMiddleware + a streaming response = the app's send and receive are both proxies that absorb the disconnect. The app keeps running because, from where it sits, nothing happened.
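A toy model of the send-side half, standard library only. It deliberately leaves out Starlette's task group and response_sent bookkeeping, so it should be read as an illustration of what a two-task split does to error visibility, not as a reproduction of the middleware. The handler, the bridge, and the one-slot queue standing in for the memory stream are all hypothetical.

```python
import asyncio


async def idle_sse_handler(scope, receive, send):
    """Hypothetical idle SSE handler: sends a greeting, then polls its own queue."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b": hi\n\n", "more_body": True})
    events = asyncio.Queue()  # nothing is ever published in this sketch
    while True:
        try:
            event = await asyncio.wait_for(events.get(), timeout=0.01)
        except asyncio.TimeoutError:
            scope["polls"] += 1  # alive and polling, with no reason to stop
            continue
        await send({"type": "http.response.body", "body": event, "more_body": True})


async def bridged_request(app, scope):
    """Toy bridge: app in a child task, host task re-emits to a dead client."""
    stream = asyncio.Queue(maxsize=1)  # assumption: stand-in for the memory stream

    async def receive():
        await asyncio.Event().wait()  # never delivers anything in this sketch

    async def real_send(message):
        if message["type"] == "http.response.body":
            raise OSError("client went away")  # the server's disconnect signal

    child = asyncio.create_task(app(scope, receive, stream.put))
    try:
        while True:
            await real_send(await stream.get())
    except OSError:
        pass  # the host-side response ends here
    return child  # the child task was never told


async def main():
    scope = {"type": "http", "polls": 0}
    child = await bridged_request(idle_sse_handler, scope)
    polls_at_disconnect = scope["polls"]
    for _ in range(200):  # bounded wait for a few polls after the disconnect
        if scope["polls"] >= polls_at_disconnect + 3:
            break
        await asyncio.sleep(0.01)
    still_running = not child.done()
    kept_polling = scope["polls"] >= polls_at_disconnect + 3
    child.cancel()  # cleanup for the sketch only
    await asyncio.gather(child, return_exceptions=True)
    return still_running, kept_polling


print(asyncio.run(main()))
```

The OSError is raised and handled entirely on the host side. The handler never calls send again, so it has nothing to fail on, and it keeps polling until something outside cancels it.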
§6 The spec_version escape hatch
How does a disconnect become an OSError on send in the first place? It's a versioned contract between the server and Starlette. StreamingResponse.__call__ branches on it — real source:

    # starlette/responses.py: StreamingResponse.__call__
    spec_version = tuple(map(int, scope.get("asgi", {}).get("spec_version", "2.0").split(".")))

    if spec_version >= (2, 4):
        try:
            await self.stream_response(send)
        except OSError:
            raise ClientDisconnect()
    else:
        async with anyio.create_task_group() as task_group:

            async def wrap(func):
                await func()
                task_group.cancel_scope.cancel()

            task_group.start_soon(wrap, partial(self.listen_for_disconnect, receive))
            await wrap(partial(self.stream_response, send))

ASGI 2.4 (March 2022) added the rule that the server raises OSError from send() when the transport is gone. If the server advertises spec_version ≥ 2.4, Starlette trusts that and just streams, catching OSError as a disconnect. If not, it falls back to the old way: spawn a listen_for_disconnect task that reads receive in parallel — the same task-group-race shape as receive_or_disconnect.
This branch is why the field report's handler-level watcher is safe in that stack: uvicorn 0.30.6 advertises 2.4, so the response takes the top path and never spawns a competing reader on receive — leaving the watcher as the sole consumer. On an older server, the bottom branch wakes up and the two would race. The safety is a property of the version number, not the code. The same fact, read from here, also tells you that fixing the middleware doesn't remove the need for someone to consume receive: the response detects disconnect through send, not receive, so an idle generator still needs its own answer.
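The version check itself is small enough to isolate. The helper names below are hypothetical, and the three scopes are made-up examples; the parsing expression follows the Starlette source quoted above.

```python
def parse_spec_version(scope):
    """Parse the server's advertised ASGI spec version into a comparable tuple."""
    raw = scope.get("asgi", {}).get("spec_version", "2.0")
    return tuple(int(part) for part in raw.split("."))


def trusts_send_errors(scope):
    """True when the server promises OSError from send() after a disconnect."""
    return parse_spec_version(scope) >= (2, 4)


modern = {"type": "http", "asgi": {"version": "3.0", "spec_version": "2.4"}}
legacy = {"type": "http", "asgi": {"version": "3.0", "spec_version": "2.3"}}
silent = {"type": "http"}  # no "asgi" key at all: treated as 2.0

print(trusts_send_errors(modern), trusts_send_errors(legacy), trusts_send_errors(silent))

# Tuples compare numerically, where the raw strings would not
# ("2.10" is a hypothetical future version used only to show the difference).
print(parse_spec_version({"asgi": {"spec_version": "2.10"}}) >= (2, 4), "2.10" >= "2.4")
```

Logging trusts_send_errors(scope) once at startup is a cheap way to learn which branch a given deployment will take before relying on it.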
§7 Pure ASGI, and when to pay
The cure Starlette's maintainers endorse for anything streaming or disconnect-sensitive is to drop BaseHTTPMiddleware and write the layer as the §1 pure-ASGI form: wrap send/receive, pass them down on the same task, no buffer, no second task. The disconnect OSError then propagates straight up through your self.app the way §1 promised. You lose the ergonomic Request/Response object API; you gain transparency and, per the maintainers' benchmarks, a meaningful throughput win.
| | BaseHTTPMiddleware | Pure ASGI |
|---|---|---|
| API | Request → Response objects | raw (scope, receive, send) |
| downstream app runs | in a child task | on the same task |
| response bytes | through a rendezvous stream | straight to the wire |
| disconnect OSError | caught by the bridge's response | propagates up to the app |
| is_disconnected() upstream | structurally returns False | works |
| cost | convenience now, leaks under streaming | more code per layer |
The rule of thumb: BaseHTTPMiddleware is fine for layers that read scope and maybe rewrite a header on a buffered response. The moment a layer sits in front of a streaming or long-lived endpoint (SSE, chunked downloads, websockets-adjacent HTTP), it should be pure ASGI, because the bridge's task-and-buffer construction is exactly wrong for a response that never ends. The field report applies this rule to a five-layer stack, converting the layers in front of a streaming endpoint and leaving the buffered-only ones, and shows the leak that results when you don't.
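As an illustration of what such a conversion can look like, here is a header-adding layer written in the pure-ASGI form. AddHeaderMiddleware, the header name, and the test app are hypothetical; the sketch assumes the layer only needs to touch the http.response.start message.

```python
import asyncio


class AddHeaderMiddleware:
    """Pure-ASGI layer that appends one response header; no task, no buffer."""

    def __init__(self, app, name, value):
        self.app = app
        self.header = (name.lower().encode("latin-1"), value.encode("latin-1"))

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        async def send_with_header(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                headers.append(self.header)
                message = {**message, "headers": headers}  # copy, don't mutate
            await send(message)  # every message goes straight through

        await self.app(scope, receive, send_with_header)


async def streaming_app(scope, receive, send):
    """Hypothetical streaming endpoint: a start message and a few chunks."""
    headers = [(b"content-type", b"text/event-stream")]
    await send({"type": "http.response.start", "status": 200, "headers": headers})
    for chunk in (b"data: 1\n\n", b"data: 2\n\n"):
        await send({"type": "http.response.body", "body": chunk, "more_body": True})
    await send({"type": "http.response.body", "body": b"", "more_body": False})


async def main():
    wire = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        wire.append(message)

    app = AddHeaderMiddleware(streaming_app, "X-Layer", "pure-asgi")
    await app({"type": "http"}, receive, send)
    return wire


wire = asyncio.run(main())
print(wire[0]["headers"])
```

The body chunks are forwarded as they are produced, one at a time, which is the property a streaming endpoint needs from every layer in front of it.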
§8 Take with you
- Middleware and responses are both ASGI apps. Same (scope, receive, send) shape, all the way down. A middleware wraps one of the three; a response emits messages.
- Pure-ASGI layers pass through on one task. The disconnect OSError on send propagates straight up to the app.
- BaseHTTPMiddleware trades that for an object API. It runs the app in a child task and bridges responses through a zero-buffer memory stream; a second response re-emits to the real wire.
- That bridge wraps receive in a task group, which (Part 4 §6) forces a suspension on every read, killing the upstream pre-armed peek and structurally blinding is_disconnected().
- Disconnect arrives via send (OSError, ASGI 2.4), not receive. The bridge catches it on its own response, so the app never feels it: the streaming leak in one sentence. Pure ASGI for anything streaming.
Parts 1–5 built the stack from the GIL up to the middleware onion. The field report is what it looks like when the whole stack conspires into one production bug: an SSE task count climbing 500 → 2,200 with no error in the log, traced through the exact receive_or_disconnect and spec_version mechanics this post laid out — and the three fixes, ranked by blast radius.