Misadventures in Python hot reloading
June 24, 2025
I made this point on X a few months ago: if you've used any major Node web framework recently, you know they all bundle a dev server. And that dev server responds within milliseconds to changes to your code. Make a change to your flexbox and it's ready by the time you switch back to your browser.
Hot reloading can refer to either of two things: injecting new code into the running server process, or updating client state in the browser without a full page refresh. In Node/JavaScript the mechanism behind both is similar.
Python lacks similar support out of the box. Jupyter has %autoreload, which works okay because most things live in the global namespace. But if you've ever imported a third party package, you'll notice there are times when you end up holding a reference to a previous function value. It's not intuitive when that happens and when it doesn't. This is a dive into the internals of each ecosystem.
Hot reloading in Node
Hot reloading in Node isn't a language-level feature, but the spec has been pretty well formalized in a Webpack RFC since 2014.1 Different bundlers implement their own handling, but the general principles still apply.
Node has a bit of an accidental advantage here. Very few people write vanilla JavaScript anymore. To make use of modern language features, you're going to need polyfills; to compile React JSX files you need a compiler; to use TypeScript you need at least a passthrough to tsc.
Since bundlers have established themselves as a near mandatory infrastructure, they can inject all the tricks we need for hot reloading before we actually hit the Node runtime.
AST ownership. They already parse every module to do JSX/TS/SWC transforms, so walking the tree to build a dependency graph is a sunk cost.
When Webpack processes your UserProfile.tsx, it's already transforming JSX into React.createElement calls and TypeScript into vanilla JavaScript. During that same AST walk, it can trivially extract every import statement:
// Your source code
import React from 'react';
import { UserCard } from './UserCard';
import { fetchUser } from '../api/users';
// Webpack already sees this as an AST during transformation:
// ImportDeclaration { source: "react", specifiers: [...] }
// ImportDeclaration { source: "./UserCard", specifiers: [...] }
// ImportDeclaration { source: "../api/users", specifiers: [...] }
Graph ownership. Because they know every "edge", they can do selective invalidation: traverse upward to mark stale parents, downward to bubble updates.
The bundler doesn't just know that UserProfile.tsx imports UserCard.tsx; it knows the complete bidirectional graph. When UserCard.tsx changes, it can instantly compute the minimal set of modules that need to update:
// Dependency graph that Webpack maintains internally
const dependencyGraph = {
  'UserCard.tsx': {
    importedBy: ['UserProfile.tsx', 'AdminPanel.tsx'],
    imports: ['./Button.tsx', './Avatar.tsx']
  },
  'UserProfile.tsx': {
    importedBy: ['App.tsx'],
    imports: ['UserCard.tsx', '../api/users.js']
  }
  // ... rest of your app
}
When UserCard.tsx changes:
- Mark UserCard.tsx as stale
- Walk up: mark UserProfile.tsx and AdminPanel.tsx as stale
- Keep walking: mark App.tsx as stale
- Send update message for just these 4 modules, not the whole app
This surgical precision is what makes Vite updates feel instant even on codebases with thousands of files. Only the affected branch of the dependency tree gets invalidated and reloaded.
Runtime shims. Bundlers also inject a few dozen lines that expose a small HMR API. This converts import syntax from its typical static embedding into a dynamic registry that we can update without having to untangle all the nested spaghetti code.
// __hmr_runtime__.js (simplified)
const records = new Map(); // id → { deps, dispose, accept }
export function register(id, mod) { records.set(id, { mod, deps: [] }); }
export function invalidate(id) { … } // walks records, calls dispose, swaps exports
The bundler wraps every module with this runtime machinery. Your straightforward UserCard.tsx becomes something like:
// Your original code
export const UserCard = ({ user }) => <div>{user.name}</div>;

// Gets wrapped by bundler into:
(() => {
  const module = { exports: {}, hot: null };

  // Your code runs here
  const UserCard = ({ user }) => React.createElement('div', null, user.name);
  module.exports = { UserCard };

  // HMR scaffolding injected by bundler (per-module hot context)
  if (module.hot) {
    // Handle hot updates for this specific module
    module.hot.accept((newModule) => {
      // Swap out the component while preserving React state
    });

    // Access to previous module's data via module.hot.data
    if (module.hot.data) {
      // Handle state from previous hot reload
    }
  }

  __hmr_runtime__.register('UserCard.tsx', module);
})();
The Snake
I've spent a lot of time trying to come up with the best hot reloading experience in Python for web development in Mountaineer. And most of that time was buried deep, deep in the docs and implementation details of importlib. That codebase alone will put some hair on your chest.
In theory you can do everything in Python that you do in Node.2 But it's complicated by the way Python handles module imports and reference caching.
importlib & modules
Python 3 standardized its import behavior more formally. Unlike Node's require.cache, which you can freely manipulate by reference, Python's module system binds imported objects directly into the namespace of every file that references them. Those names aren't resolved dynamically at runtime from some central registry, so correcting the record is a lot more fragmented.
Here's the difference in action:
# math_utils.py
print("Loading math_utils module...")

def add(a, b):
    return a + b

def multiply(a, b):
    return a * b

# main.py
import sys
import math_utils

print("First import:")
result1 = math_utils.add(2, 3)
print(f"sys.modules keys: {list(k for k in sys.modules.keys() if 'math_utils' in k)}")
# Shows: ['math_utils']

print("Second import:")
import math_utils  # No print statement - module not re-executed
result2 = math_utils.add(2, 3)
print(f"Same object? {math_utils is sys.modules['math_utils']}")  # True
The module gets executed once and cached in sys.modules. But even that cache doesn't behave the way you'd expect coming from require.cache: even if you delete the module from sys.modules, references that other code has already bound don't get refreshed on subsequent calls.
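A minimal sketch, reusing the math_utils example above, shows both halves of that: deleting the cache entry forces the module body to run again on the next import, but it does nothing for names that were already bound.

# Sketch: clearing sys.modules forces re-execution, but doesn't fix old bindings.
import sys
import math_utils

old_add = math_utils.add          # capture a reference, as another module might
del sys.modules["math_utils"]     # evict the cached module object

import math_utils                 # re-executes math_utils.py ("Loading..." prints again)

assert math_utils is sys.modules["math_utils"]   # a fresh module object is now cached
assert old_add is not math_utils.add             # but the captured reference is still the old function
assert old_add(2, 3) == 5                        # and it keeps running the old code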
Here's also where things get tricky: Python doesn't just cache the module object, every importer can also capture its own references to the objects inside that module:
# Capture a direct reference to the function
my_add_function = math_utils.add
assert my_add_function(2, 3) == 5
# Now try to "reload" the module by updating the file
# math_utils.py gets changed to:
# def add(a, b):
# return a + b + 100 # Changed implementation
import importlib
importlib.reload(math_utils)
# The module reference is updated
assert math_utils.add(2, 3) == 105
# But our captured reference still points to the old function!
assert my_add_function(2, 3) == 5 # Still returns the old result
Python's approach makes sense for production code - you don't want function references changing out from under you at runtime. But it makes hot reloading much more challenging because you need to track down every single reference to every single object in every single module that might have changed.
The problem compounds when you have deeply nested imports:
# database.py
class Connection:
    def query(self, sql):
        return "old query result"

# models.py
from database import Connection

db = Connection()  # Captured reference

class User:
    def get_all(self):
        return db.query("SELECT * FROM users")  # Uses captured reference

# main.py
from models import User

user_manager = User()  # Another layer of reference capturing
Now if database.py changes, you need to reload:
- database module
- models module (so it gets fresh Connection class)
- main module (so it gets fresh User class)
- Any other modules that imported any of these
Woof.
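A quick sketch, continuing the three files above, shows why the whole chain has to go (assuming database.py and models.py are importable as written):

# Sketch: reloading only the lowest module leaves stale references above it.
import importlib

import database
import models

importlib.reload(database)   # database.Connection is now a brand-new class object

# models.db was instantiated from the OLD Connection class at import time,
# so it isn't an instance of the reloaded class.
assert not isinstance(models.db, database.Connection)

# Reloading models re-runs `from database import Connection; db = Connection()`
# against the fresh database module, so its references line up again.
importlib.reload(models)
assert isinstance(models.db, database.Connection)

# main.py (and anything else that imported User) still needs the same treatment.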
V1: Separate Process
The most obvious way to guarantee everything is refreshed is to load all dependencies from scratch. This is what most Python web frameworks do in development. FastAPI, Flask, and Django all ship dev servers that watch for file changes and restart the entire Python process when something updates.
Here's a simplified version of how those dev servers work under the hood:
import sys
import subprocess
from pathlib import Path
from watchfiles import watch

def run_server():
    """The actual server process"""
    print("Starting Django development server...")
    # This would be your actual Django/Flask/FastAPI app
    from myproject.wsgi import application
    # Start serving...

def run_with_reloader():
    """Main process that watches files and restarts server"""
    watch_paths = ['./myproject', './templates']

    # Start server in subprocess
    server_process = subprocess.Popen([
        sys.executable, '-c',
        'from dev_server import run_server; run_server()'
    ])

    try:
        for changes in watch(*watch_paths):
            print(f"Files changed: {changes}")
            print("Restarting server...")
            server_process.terminate()
            server_process.wait()

            # Restart with fresh process
            server_process = subprocess.Popen([
                sys.executable, '-c',
                'from dev_server import run_server; run_server()'
            ])
    except KeyboardInterrupt:
        server_process.terminate()

if __name__ == '__main__':
    run_with_reloader()
This approach completely sidesteps the reference tracking problem. Every restart gives you a completely fresh Python interpreter with no cached modules, no stale references, no import confusion. It's bulletproof.
The trade-off is startup time. Each restart needs to:
- Initialize the Python interpreter - import built-in modules, set up the runtime
- Load all your dependencies - Django, Flask, SQLAlchemy, requests, etc.
- Import your application code - models, views, utilities
- Establish database connections - reconnect to PostgreSQL, Redis, etc.
- Load configuration - environment variables, settings files
- Start the server - bind to ports, initialize middleware
For a small Flask app, this might take 200-500ms. For a large project with dozens or hundreds of transitive dependencies, it can easily take 5+ seconds. As your project grows bigger, the restart time becomes a significant productivity killer. You make a one line change and wait 3 seconds to see the result.
The benefit is reliability. You never have stale state, never have mysterious bugs caused by incomplete reloads, never need to manually restart when things get confused. The original Mountaineer release just shipped with one of these process-based restarts, but I wanted to see if I could do anything a bit smarter.
V2: Runtime Manipulation
Monkeypatching is always a horrible idea, or at least it is in production. But maybe during development it can still get us where we need to go.
If we can track which modules have changed, we can selectively invalidate their references in other files. Instead of restarting everything, we analyze the dependency graph to figure out exactly which modules need to be reloaded when a file changes.
The key insight is that Python's AST can tell you what each module imports:
import ast
from typing import Set

def analyze_imports(filepath: str) -> Set[str]:
    """Extract all imports from a Python file using AST"""
    with open(filepath, 'r') as f:
        try:
            tree = ast.parse(f.read())
        except SyntaxError:
            return set()

    imports = set()

    class ImportVisitor(ast.NodeVisitor):
        def visit_Import(self, node):
            # Handle: import os, sys, myapp.models
            for alias in node.names:
                imports.add(alias.name)

        def visit_ImportFrom(self, node):
            # Handle: from myapp import models, from .utils import helper
            if node.module:
                imports.add(node.module)

    ImportVisitor().visit(tree)
    return imports

# Example usage
imports = analyze_imports('./myproject/views/user.py')
print(f"user.py imports: {imports}")
# Output: {'myproject.models.user', 'flask', 'typing'}
With this simple visitor, you can analyze every file in your project to build a complete dependency graph. The process would be:
- Scan all Python files in your project directory
- Extract imports from each file using the AST visitor
- Build bidirectional dependency mappings (file_imports and file_importers)
- Filter to project files only - ignore third-party and standard-library imports like flask or os
Once you have this dependency graph, you can trace the impact of any file change by walking up the import chain.
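Here's roughly what that graph construction could look like, building on the analyze_imports helper above. This is a sketch: the module-name handling is simplified and ignores relative imports and __init__.py packages.

# Sketch: build a bidirectional dependency graph for project files only.
from pathlib import Path

def build_dependency_graph(package_dir: str, package_name: str):
    root = Path(package_dir)
    file_imports = {}    # module -> project modules it imports
    file_importers = {}  # module -> project modules that import it

    for path in root.rglob("*.py"):
        # e.g. myproject/views/user.py -> "myproject.views.user"
        module = ".".join((package_name,) + path.relative_to(root).with_suffix("").parts)
        # Keep only edges that point back into our own package; drop flask, os, and friends.
        deps = {imp for imp in analyze_imports(str(path)) if imp.startswith(package_name)}
        file_imports[module] = deps
        for dep in deps:
            file_importers.setdefault(dep, set()).add(module)

    return file_imports, file_importers

# Hypothetical usage:
# file_imports, file_importers = build_dependency_graph("./myproject", "myproject")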
When invalidating, you need to work from the top of the import chain down. When models/user.py changes, you can't just reload that module and expect everything to work.
Consider this import chain:
main.py → imports views.user
views/user.py → imports models.user
models/user.py → (the file that changed)
If you only reload models.user, the problem is that views/user.py still has references to the old models.user classes:
# views/user.py - this ran when the file was first imported
from models.user import User # This captured the OLD User class
current_user = User() # This created an instance of the OLD class
# Even after models/user.py gets reloaded:
print(User) # Still points to the old class definition!
The solution is to invalidate the entire chain from top to bottom:
# When models/user.py changes:
del sys.modules['main'] # Remove main module
del sys.modules['views.user'] # Remove views.user module
del sys.modules['models.user'] # Remove models.user module
# Re-import from the top
import main # This triggers fresh imports all the way down the chain
When main.py gets re-imported, it imports a fresh views.user, which imports a fresh models.user, ensuring that all references point to the new code. Your database connections stay alive, cached data persists, and startup time drops from seconds to milliseconds.
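A sketch of that invalidation step, using the file_importers mapping from the graph-building sketch above (the module names and entry point are hypothetical, and error handling is omitted):

# Sketch: walk up the importer chain, evict stale modules, re-import the entry point.
import importlib
import sys

def stale_modules(changed: str, file_importers: dict) -> set:
    """The changed module plus everything that transitively imports it."""
    stale, stack = set(), [changed]
    while stack:
        module = stack.pop()
        if module in stale:
            continue
        stale.add(module)
        stack.extend(file_importers.get(module, ()))
    return stale

def reload_chain(changed: str, file_importers: dict, entry_point: str):
    for name in stale_modules(changed, file_importers):
        sys.modules.pop(name, None)        # evict stale modules from the cache
    importlib.import_module(entry_point)   # fresh imports cascade down the chain

# e.g. reload_chain("models.user", file_importers, entry_point="main")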
This approach works well when your assumptions hold:
- Usually only one file changes - You're not refactoring across multiple modules simultaneously
- Import relationships are clean - Modules don't hold tight references to each other's internals
- Dependencies flow upward - Lower-level modules (like models) don't import from higher-level ones (like views)
You need careful bookkeeping of not just imports but also attribute access, function calls stored in variables, and class inheritance. Every captured reference is a potential source of stale state that won't get updated during selective reloading.
The more complex your dependency graph becomes, the more edge cases you encounter. Global variables, decorators, metaclasses, and dynamic imports each add another layer of complexity to track correctly. Most attempts at smart Python reloading eventually break down when they encounter these edge cases, leaving you with mysterious half-baked bugs where some places hold new references and others hold outdated ones.
The only real solution is again to cleanly exit the process and restart. Not ideal.
V3: External Libraries
Most third party libraries take the basic V2 approach (selective module invalidation) but add better handling for edge cases like decorators, class methods, and complex inheritance hierarchies.
Jupyter's %autoreload
You've probably used at least one hot reloading implementation in Python before. Jupyter's %autoreload magic command internally wraps a hot reloading pipeline.
# Enable autoreload in Jupyter/IPython
%load_ext autoreload
%autoreload 2
# Now any imported modules will automatically reload when changed
from myproject.models import User
user = User() # This will always use the latest User class definition
autoreload works by hooking into Python's import system and checking file modification timestamps before each code execution. When a module has changed, it uses a different strategy for different types of mounted objects3:
- Function objects: Replaces the __code__ attribute of existing functions with the new code
- Class methods: Updates method definitions on existing class objects
- Properties: Handles @property decorators and descriptors
- Imported references: Fixes from module import function statements by updating the reference
The magic happens in how it handles the captured reference problem. Instead of just reloading modules, autoreload patches existing objects in place:
# Before reload: User class with old method
class User:
    def get_name(self):
        return "Old implementation"

user_instance = User()

# File gets modified with new implementation
# After autoreload triggers:
# - The User class object stays the same (same memory address)
# - But User.get_name.__code__ gets replaced with new bytecode
# - Existing user_instance automatically uses new implementation
print(user_instance.get_name())  # "New implementation"
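If you want to see the core trick outside of IPython, here's a toy sketch of the __code__ swap. This is not autoreload's actual code, just the same mechanism in isolation:

# Toy sketch: patching a function's __code__ updates every existing reference to it.
class User:
    def get_name(self):
        return "Old implementation"

def _new_get_name(self):
    return "New implementation"

user_instance = User()
assert user_instance.get_name() == "Old implementation"

# Swap the bytecode on the *existing* function object. The class, the method,
# and the already-created instance all keep their references; only the code
# behind those references changes.
User.get_name.__code__ = _new_get_name.__code__

assert user_instance.get_name() == "New implementation"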
This approach is why autoreload works so well in Jupyter notebooks. You can load data, create objects, modify your code, and keep working without losing your runtime state.
The limitations are the same ones that plague all Python hot reloading: changing class structure (adding/removing methods), decorator modifications, and C extensions don't reload cleanly.
Jurigged
jurigged is a bit more surgical than Jupyter's autoreload but works using the same bytecode principles. Change the function while it's running and save the file. jurigged detects the change and:
- Parses the new AST
- Compares against the old AST to find what changed
- Compiles the new function
- Uses gc.get_referrers() to find ALL functions using the old code
- Replaces their __code__ pointers with the new bytecode
They seem unique in using Python's garbage collector as a code reference tracker. When you change a function, jurigged doesn't need to maintain dependency graphs or track imports - it can just consult the GC for what objects are currently pointing to this old code object. Then it can replace them all simultaneously:
import gc

def find_all_functions_with_code(target_code):
    """Find every function object that uses specific bytecode"""
    functions = []

    # Ask GC for everything that references this code object
    referrers = gc.get_referrers(target_code)

    for obj in referrers:
        if hasattr(obj, '__code__') and obj.__code__ is target_code:
            functions.append(obj)

    # Also check for closures, bound methods, etc.
    return functions

# This is essentially what jurigged does internally
old_code = fibonacci.__code__
all_fibonacci_functions = find_all_functions_with_code(old_code)

# Replace __code__ on all of them with new implementation
for func in all_fibonacci_functions:
    func.__code__ = new_fibonacci_code
Very similar in spirit to Jupyter. Just a bit more nuanced in how they calculate the code to update, which tends to be the area that will trip you up at runtime.
V4: 🔥 firehot
What if we tried to take the best of both worlds? Fast reloading like V2/V3's selective invalidation but guaranteed to be as reliable as V1's process restarting? Sounds too good to be true.
That's where Firehot comes in. I wrote up more of a deep dive about the library in the release post. But at a high level:
- Create a separate Python process/interpreter with just third-party dependencies imported by importlib
- Every time our webapp files change, we fork that base process and then only load the new client module
- The client module will load all its code from scratch (since it's never been imported before in this process context) but skip over the third party dependencies that we've already imported.
The approach works well because it leverages Python's import caching at the right granularity. Third-party modules get the benefit of persistent caching (they never reload), while project modules get the benefit of clean invalidation (they always reload completely).
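As a rough illustration of the mechanics (not Firehot's actual code, and Unix-only since it relies on os.fork), the skeleton looks something like the sketch below; the dependency list, myproject.app, and its run() entry point are all hypothetical stand-ins:

# Toy sketch of the fork-then-import loop behind the approach described above.
import importlib
import os
import signal

HEAVY_DEPENDENCIES = ["sqlalchemy", "pydantic"]  # stand-ins for slow imports
PROJECT_ENTRYPOINT = "myproject.app"             # hypothetical project module

def warm_up():
    # Pay the expensive third-party import cost once, in the long-lived parent.
    for name in HEAVY_DEPENDENCIES:
        importlib.import_module(name)

def run_child():
    # The fork copied sys.modules, so third-party packages are already imported.
    # Project code has never been loaded in this child, so it imports from scratch.
    app = importlib.import_module(PROJECT_ENTRYPOINT)
    app.run()  # hypothetical blocking entry point (e.g. starts the web server)

def serve_forever(file_change_events):
    warm_up()
    child_pid = None
    for _ in file_change_events:                 # e.g. a watchfiles.watch() iterator
        if child_pid is not None:
            os.kill(child_pid, signal.SIGTERM)   # tear down the previous app process
            os.waitpid(child_pid, 0)
        child_pid = os.fork()
        if child_pid == 0:                       # child: warm deps + fresh project code
            run_child()
            os._exit(0)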
There are still edge cases where this breaks down:
Shared State: If your project requires some global mutable state that you expect to be available for the whole process lifecycle, we will throw away that state every time the application reloads since we are actually tearing down the process and recreating it with the full code from scratch.
C-Level Threading: Unix (macOS and Linux) operating systems can only safely fork code when there aren't any background threads operating, because they risk deadlocks when child processes inherit locks from threads that no longer exist after the fork.
But in a typical web development workflow, you're iterating on views, models, and business logic while keeping the same database connections and framework components. Here firehot provides speed close to selective reloading with much better reliability than AST-based dependency tracking.
Not all modules are created equal. Your project code changes constantly and benefits from aggressive reloading. Third-party dependencies change rarely and benefit from aggressive caching. By treating them differently, you can optimize the reload strategy for each.
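One way to draw that line in practice is by where a module's file lives. A rough sketch, with the project root path as a stand-in:

# Sketch: classify loaded modules as "project" (reload) vs "third-party" (keep cached).
import sys
import sysconfig
from pathlib import Path

SITE_PACKAGES = Path(sysconfig.get_paths()["purelib"]).resolve()
PROJECT_ROOT = Path("./myproject").resolve()   # hypothetical project root

def is_project_module(name: str) -> bool:
    module = sys.modules.get(name)
    module_file = getattr(module, "__file__", None)
    if not module_file:
        return False                     # built-in and frozen modules have no file
    path = Path(module_file).resolve()
    return PROJECT_ROOT in path.parents and SITE_PACKAGES not in path.parents

# Hypothetical usage: only evict project modules, leave everything else cached.
# stale = [name for name in list(sys.modules) if is_project_module(name)]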
Conclusion
JavaScript and Python are my two longest-running programming language loves. For all their differences they have a lot of similarities (interpreted languages, thriving package ecosystems, type systems added after the fact).
It's an accidental twist of the ecosystem that one ended up with hot-reloading as a basically solved problem and the other still struggles with it. Node's necessity for bundlers created the perfect infrastructure for hot reloading, while Python's more straightforward module system made it harder to retrofit. But the gap is closing. And with enough creativity and brute force, we can bring that same instant feedback loop to Python development. The web development experience shouldn't depend on which language you choose.
1. Credit here to Tobias Koppers in Hot Module Replacement. It formalized the accept/dispose-hook graph traversal used by all modern bundlers.
2. I mean, they're both Turing-complete languages after all.
3. There certainly feel like way more primitives in Python than in Javascript. So much of the language constructs that I use daily in Typescript are just compiled down to functions when they hit a build pipeline.
4. Ours at MonkeySee is now 500+ files with a whole bunch of interconnected dependencies. Process restarting was taking 4-5 seconds on every change, which made rapid iteration pretty painful.
5. CommonJS was a community effort - and had to compete with a ton of other module projects to win the crown. Before Node adopted it as a native syntax, you had to import additional polyfills to make it work.
6. Only now in 2025 does it finally feel like these are largely settled questions.