httpx is the right way to do web requests in Python

# July 23, 2025

As even more webapps become shims on top of API services, and those services have latency measured in seconds or minutes¹, the ergonomics of network requests become pretty important. The gold standard here is probably fetch() in JavaScript: it's simple, powerful, and baked into most browsers.² It's an industry standard for a reason.

Python lacks a stdlib answer with the same kind of adoption. For production applications I think there are three main things you want out of your framework:

  • Async support to avoid blocking the event loop
  • Easily testable so you're not making real requests in your tests
  • OpenAPI client generation support for any internal service communication

Even though aiohttp still wins on raw throughput, my default choice has become httpx. You can nearly always find it in my pyproject.tomls. Here's why.

async clients

I've never actually used the sync clients that httpx also provides. The compelling functionality is really on the async side.

Technically httpx doesn't handle networking itself - it delegates to httpcore, which provides the low-level HTTP transport implementation. When you create an AsyncClient, httpx instantiates an AsyncHTTPTransport that wraps httpcore's AsyncConnectionPool. That's a mouthful.

When you await client.get(), it:

  1. Acquires a connection from the pool via _assign_requests_to_connections()
  2. Delegates to asyncio - each connection wraps an asyncio.StreamReader/StreamWriter pair
  3. Shares connections in a connection pool³

If you're stuck in a synchronous codebase, you can certainly use threading to get concurrency that looks similar to asyncio. But the system limits how many threads you can run in parallel before you hit diminishing returns from CPU context switching. With asyncio you're really just issuing all the network requests down to the kernel / network card and waiting to receive the signal of data coming back over the stream. There you're limited more by file descriptors and DNS resolution.

Connection pooling can give a meaningful performance advantage if you're making a lot of repeated requests to a single domain over TLS (which is the default on most of the web these days). The numbers are illustrative. A fresh HTTPS connection still needs:

  1. DNS lookup (~20-100 ms)
  2. TCP handshake (1 RTT, ~50-200 ms depending on geography)
  3. TLS handshake (1 RTT on TLS 1.3, 2 RTTs on TLS 1.2)
  4. HTTP request/response (1 RTT + server time)

That’s half a second of ceremony per socket if you don’t pool. httpx’s AsyncConnectionPool amortises all but #4 after the first hit.

pytest-httpx intercepts

Most HTTP mocking libraries patch at the unittest.mock.patch() level. This ends up being pretty brittle: a patch has to target the name a module actually looks up at call time, so if the module under test has already imported the object directly, you end up patching a reference that's never hit at runtime. If you (or your mocking library) aren't careful, you can inadvertently make real outbound requests when you're running unit tests.

Because of the abstraction layers in httpcore transports, pytest-httpx can do something more reliable. The AsyncRequestInterface defines a single method that we can override to swap in a fresh implementation:

# From httpcore/_async/interfaces.py
class AsyncRequestInterface:
    async def handle_async_request(self, request: Request) -> Response:
        raise NotImplementedError()

pytest-httpx provides a MockTransport that completely replaces the real transport.

# From pytest-httpx (excerpted)
class MockTransport:

    async def _handle_async_request(
        self,
        real_transport: httpx.AsyncHTTPTransport,
        request: httpx.Request,
    ) -> httpx.Response:
        # Store the content in request for future matching
        await request.aread()
        self._requests.append((real_transport, request))

        callback = self._get_callback(real_transport, request)
        if callback:
            response = callback(request)

            if response:
                if inspect.isawaitable(response):
                    response = await response
                return _unread(response)

        self._request_not_matched(real_transport, request)

This lets pytest-httpx operate at the same level as the real HTTP transport. Your code paths are identical; the only difference is that MockTransport.handle_async_request() returns your pre-configured response instead of making a real network call.

The API stays the same on the httpx side, while the mock transport gets swapped in globally, so it applies anywhere httpx is used.

OpenAPI Support

There's now pretty solid community support for automatically generating a type-hinted httpx client from an OpenAPI spec. This isn't baked into httpx directly, but openapi-python-client's Jinja templates rely on httpx for the main IO layer.

Since most modern libraries either give you OpenAPI support out of the box or bake it into their design assumptions, this creates a pretty seamless workflow from API definition to client code.

1. Define your API spec (user-api.yaml):

Some people write these by hand, but I've always preferred marking up my webapp code with the output spec inline. It lets you change your API in one shot instead of having to go through the OpenAPI spec, regenerate a template, and then fill in the service code.⁴

openapi: 3.0.0
info:
  title: User Management API
  version: 1.0.0
servers:
  - url: https://api.example.com/v1
paths:
  /users:
    get:
      operationId: list_users
      parameters:
        - name: limit
          in: query
          schema:
            type: integer
            default: 10
      responses:
        '200':
          description: List of users
          content:
            application/json:
              schema:
                type: object
                properties:
                  users:
                    type: array
                    items:
                      $ref: '#/components/schemas/User'
components:
  schemas:
    User:
      type: object
      properties:
        id:
          type: string
          format: uuid
        email:
          type: string
          format: email
        name:
          type: string
        created_at:
          type: string
          format: date-time
        is_active:
          type: boolean
      required:
        - id
        - email
        - name
        - created_at
        - is_active

2. Generate the client:

uvx openapi-python-client generate --path user-api.yaml

3. Use the generated client:

And with that one uvx script, you've got yourself a client.

Distinguishing between sync and asyncio - plus the more detailed sync_detailed and asyncio_detailed variants - requires a slightly verbose syntax. But it gives you all the flexibility of generic OpenAPI schemas with structured typehinting. I think it's well worth it.

from user_management_api_client import Client
from user_management_api_client.api.default import list_users, get_user
from user_management_api_client.models import User
from uuid import UUID

async with Client(base_url="https://api.example.com/v1") as client:
    # Same methods, but async
    users_response = await list_users.asyncio(client=client, limit=5)

The generated library gives us:

  • Full type safety: The User model is a proper dataclass with UUID, datetime, and email validation
  • Both sync and async: Every endpoint gets .sync() and .asyncio() variants just in case you're still stuck in sync land
  • Rich response parsing: Access both parsed data and raw HTTP responses
  • Automatic serialization: JSON request/response handling with proper Python types

It makes for a pretty smooth development loop: define your API contract on the server side, generate type-safe clients, and get all the benefits of httpx's async support and connection pooling without writing any networking code yourself.

Other asyncio libraries

I'm not an expert on all of the other asyncio networking libraries, but I have tried a handful of them. Here's how the major alternatives compare against httpx's three key strengths:

| Library | Async | Testing story | OpenAPI tooling | When to pick |
| --- | --- | --- | --- | --- |
| aiohttp | First-class | Manual mocks | None built-in | Max raw performance & you control the stack |
| urllib3 | – (sync only) | Patching | – | Need the absolute lowest-level knobs |
| grequests | Gevent monkey-patch | Requests-style | Sync-only | Quick wins on legacy code |
| uplink | Backend-agnostic | Follows chosen client | Declarative, not auto-gen | Retrofit-style APIs with pluggable transports |

Notable omissions: pycurl (C-speed but ugly API), trio’s clients (different concurrency model).

Just use it™

I'm officially⁵ nominating httpx as the de facto industry standard for web networking in Python in 2025. It feels like the closest alternative to fetch() that we have: decent performance, well designed, and easy to test.

We probably would have said the same thing about requests and its companion responses mocking library, until the async redevelopment of Python turned the package ecosystem upside down. But in the calm after the storm, some libraries are growing back into that same scope of standard. I'm here hoping it's httpx.

And if it is, let's see how long it can keep the title.


  1. LLM agents, I'm looking at you.

  2. It has also recently made its way into server-side runtimes like Node and Bun. The only thing it doesn't really have good support for is progress tracking.

  3. HTTP/2 mitigates the need for this somewhat with its built-in multiplexing support.

  4. I recognize the arguments that forcing this process is a better way to ensure that your APIs are reverse compatible for old clients. But imo, these checks are better left for static analysis not for encouraging a more circuitous process during development. 

  5. Right here, right now. 

  6. You'd think intuitively that websockets would be perfect for these send-receive message interactions. But I'm not aware of any provider that actually implements this for their chat API: they all just have a message endpoint and then a server-streamed response. That's in part because it ends up being a much simpler spec and better supported: websockets are still missing a lot of the HTTP plumbing that we typically take for granted (like headers).

  7. Performance benchmarks from Bright Data's HTTP client comparison, measuring 1000 concurrent requests. 

  8. Performance figures from liblab's HTTP client analysis, measuring synchronous request throughput. 
