Go ahead, self-host Postgres
# July 2, 2025
Self-hosting a database sounds terrifying. That narrative has certainly been pushed over the last 10 years by the big cloud providers:
- Hosting your own database is dangerous
- How are you going to get all the 9s of reliability doing it yourself?
- You'll have access to dedicated database engineers that you couldn't (or wouldn't want to) hire yourself
The rumors obscure the truth.
- Most cloud hosts are just running a slightly modified version of the open source Postgres server anyway1
- Database engineering is not a silver bullet if your queries are sub-optimal. Abstracting the engine too far away from your code keeps you from benchmarking what's actually going on and working within the otherwise reasonable constraints of how the engine executes your queries.
I've had data corruption when using a 3rd party vendor just the same as I've had when self-hosting. And with a serious markup, what's the point?
I've been running my own self-hosted Postgres for the better part of two years now, serving thousands of users and tens of millions of queries daily2. I expected it would give me much more trouble than it has. It's caused me exactly 30 minutes of stress during a manual migration, and that's all. Aside from that it's been fast, stable, and much cheaper.
I sleep just fine at night thank you.
Blue skies and the white cloud
Let me rewind for a second. The "database as a service" narrative wasn't always the dominant one. From the 80s to the early 2000s, everyone ran their own databases because there wasn't really an alternative. You had your application server and your database server, often on the same physical machine. It was pretty dang fast3 because it was communicating over localhost before forwarding the final payload over the network.
Amazon launched RDS in 2009. The pitch was compelling: we'll handle backups, patching, high availability, and monitoring. You just connect and go. The early pricing was reasonable too - a small RDS instance cost about the same as a dedicated server, but with less operational overhead. If you had to scale your database specs independent from your web service, it made some sense.
The real shift happened around 2015 when cloud adoption accelerated. Companies started to view any infrastructure management as "undifferentiated heavy lifting"4. Running your own database became associated with legacy thinking. The new orthodoxy emerged: focus on your application logic, let AWS handle the infrastructure.
Fast forward to 2025 and I'm hopeful the pendulum is swinging back. RDS pricing has grown considerably more aggressive. A db.r6g.xlarge instance (4 vCPUs, 32GB RAM) now costs $328/month before you add storage, backups, or multi-AZ deployment. For that price, you could rent a dedicated server with 32 cores and 256GB of RAM.
Unpacking the cloud
For the most part, managed database services aren't running some magical proprietary technology. They're just running the same open-source Postgres you can download, with some operational tooling wrapped around it.
Take AWS RDS. Under the hood, it's:
- Standard Postgres compiled with some AWS-specific monitoring hooks
- A custom backup system using EBS snapshots
- Automated configuration management via Chef/Puppet/Ansible
- Load balancers and connection pooling (PgBouncer)
- Monitoring integration with CloudWatch
- Automated failover scripting
None of this is technically complex. The value proposition is operational: they handle the monitoring, alerting, backup verification, and incident response. It's also a production-ready configuration at minute zero of your first deployment. But the actual database engine? It's the same Postgres running the same SQL queries with the same performance characteristics.
I proved this to myself when I migrated off RDS. I took a pg_dump of my RDS instance, restored it to a self-hosted server with identical specs, and ran my application's test suite. Performance was identical. In some cases, it was actually better because I could tune parameters that RDS locks down.
My self-hosting journey
I spent a weekend migrating to a dedicated server from DigitalOcean5: 16 vCPU / 32GB Memory / 400GB disk. The migration took about 4 hours of actual work:
- Hour 1: Provision server, install Postgres, configure basic settings (kamal makes this pretty easy)
- Hour 2: Set up monitoring (Prometheus + Grafana), configure backups
- Hour 3: Migrate data using pg_dump and pg_restore (roughly the commands sketched below)
- Hour 4: Update application connection strings, verify everything works
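The dump-and-restore in hour 3 is only a couple of commands. A minimal sketch - the hostnames, database name, and user are placeholders, and the parallel job count should match your cores:
# Dump the old database in custom format
pg_dump -Fc -h old-db.example.com -U app_user -d myapp -f myapp.dump
# Restore onto the new self-hosted box with 4 parallel jobs
pg_restore -h localhost -U app_user -d myapp -j 4 --no-owner myapp.dump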
The performance improvement was immediate. Query latency dropped by about 20% across the board. Why? Because I could tune the configuration for my specific workload instead of using RDS's conservative defaults.
The real operational complexity
For the companies I run with high-availability requirements, this stack takes me about half an hour per month. For the stacks that get less traffic I'm fully hands-off - set it and forget it. This is roughly the cadence for my primary deploys:
Weekly tasks (10 minutes):
- Check backup verification (automated, just reviewing alerts; sketched below)
- Review slow query logs
- Check disk space trends
Monthly tasks (30 minutes):
- Apply Postgres security updates
- Review and rotate backup retention
- Capacity planning based on growth trends
Quarterly tasks (optional) (2 hours):
- Update monitoring dashboards
- Review and optimize configuration parameters
- Test disaster recovery procedures
As far as I'm concerned this is roughly comparable to the time you spend debugging RDS connection limits, working around parameter groups you can't modify, or dealing with surprise maintenance windows.
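As for that weekly backup verification, it doesn't have to be fancy. A nightly job along these lines covers it - a sketch, with the database name, backup path, and alerting left as placeholders:
#!/usr/bin/env bash
# Nightly backup plus a cheap sanity check (placeholders: myapp, /backups)
set -euo pipefail
backup=/backups/myapp-$(date +%F).dump
pg_dump -Fc -d myapp -f "$backup"
# Verification: confirm the archive's table of contents is readable
pg_restore --list "$backup" > /dev/null
# Hook your alerting of choice to a non-zero exit from this script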
The main operational difference is that you're responsible for incident response. If your database goes down at 3 AM, you need to fix it. But here's the thing: RDS goes down too6. And when it does, you're still the one getting paged, you just have fewer tools to fix the problem.
For the most part I've found that unless I'm actively messing with the database, it's really stable. After all, you're just renting a remote machine in a data center somewhere. All updates are on your schedule - so you have a good idea of when the riskiest windows will fall.
When self-hosting makes sense
I'd argue self-hosting is the right choice for basically everyone, with a few exceptions at either extreme:
- If you're just starting out in software & want to get something working quickly with vibe coding, it's easier to treat Postgres as just another remote API that you can call from your single deployed app
- If you're a really big company reaching the scale where you need trained database engineers working on your stack full-time, you might get economies of scale by outsourcing that work to a cloud company with guaranteed talent in that area. The second full-freight salaries come into play, outsourcing starts to look a bit cheaper.
- Regulated workloads (PCI-DSS, FedRAMP, HIPAA, etc.) sometimes require a managed platform with signed BAAs or explicit compliance attestations.
The configuration deep dive
The things you really have to bear in mind when self-hosting:
Memory Configuration: This is where most people mess up. Pulling the standard postgres Docker image won't cut it. You have to configure memory bounds with static limits that correspond to your hardware. I've automated some of these configurations, but whether you do it manually or use some auto-config, tweaking these params is a must.
The key parameters (a worked example follows this list):
- shared_buffers: Start around 25% of RAM; modern PG happily uses tens of GB.
- effective_cache_size: Set to 75% of system RAM (this tells Postgres how much memory the OS will use for caching)
- work_mem: Be conservative here. Set it to total RAM / max_connections / 2, or use a fixed value like 32MB
- maintenance_work_mem: Can be generous (1-2GB); only used during VACUUM and index operations
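For a concrete example, a 32GB box capped at 200 connections might use something like the block below. Treat the numbers as a starting point to benchmark against, not a prescription:
# Memory settings (example: 32GB RAM, ~200 connections)
shared_buffers = 8GB # ~25% of RAM
effective_cache_size = 24GB # ~75% of RAM
work_mem = 32MB # per sort/hash, per connection - keep it conservative
maintenance_work_mem = 1GB # VACUUM, CREATE INDEX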
Connection Management: RDS enforces its own max connections, but when self-hosting you get to choose your own:
# Connection settings
max_connections = 200
shared_preload_libraries = 'pg_stat_statements'
log_connections = on
log_disconnections = on
Wahoo! More connections = more parallelism, right?
No such free lunch, I'm afraid. Opening fresh connections in Postgres carries pretty expensive overhead, so you almost always want to put a connection pooler in front of it. I'm using pgbouncer on all my projects by default - even when load might not call for it. Python asyncio applications just work better with a centralized connection pooler.
And yes, I've automated some of the config there too.
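If you haven't run pgbouncer before, the config is refreshingly short. A minimal transaction-pooling setup looks roughly like this - the database name, auth file path, and pool sizes are placeholders to tune:
; pgbouncer.ini - minimal transaction-pooling sketch
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
default_pool_size = 20
max_client_conn = 1000
Transaction pooling is what keeps asyncio apps happy, though session-level features like prepared statements and advisory locks need a bit more care in that mode.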
Storage Tuning: NVMe SSDs change the performance equation completely:
# Storage optimization for NVMe
random_page_cost = 1.1 # Down from default 4.0
seq_page_cost = 1.0 # Keep at default
effective_io_concurrency = 200 # Up from default 1
These settings tell Postgres that random reads are almost as fast as sequential reads on NVMe drives, which dramatically improves query planning.
WAL Configuration: Write-Ahead Logging is critical for durability and performance:
# WAL settings
wal_level = replica # Enable streaming replication
max_wal_size = 2GB # Allow larger checkpoints
min_wal_size = 1GB # Prevent excessive recycling
checkpoint_completion_target = 0.9 # Spread checkpoint I/O over 90% of interval
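Since wal_level = replica is what enables streaming replication, it's worth noting how little extra work a hot standby is once that's set. A rough sketch - the role, password, host, and data directory are placeholders, and the primary's pg_hba.conf needs a replication entry for the role:
# On the primary: create a role allowed to stream WAL
psql -c "CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'change-me';"
# On the standby: clone the primary and write standby.signal plus connection settings
pg_basebackup -h primary-db.internal -U replicator -D /var/lib/postgresql/data -R -P
# Start Postgres on the standby and it begins streaming from the primary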
Conclusion
I'm not advocating that everyone should self-host everything. But the pendulum has swung too far toward managed services. There's a large sweet spot where self-hosting makes perfect sense, and more teams should seriously consider it.
Start small. If you're paying more than $200/month for RDS, spin up a test server and migrate a non-critical database. You might be surprised by how straightforward it is.
The future of infrastructure is almost certainly more hybrid than it's been recently: managed services where they add genuine value, self-hosted where they're just expensive abstractions. Postgres often falls into the latter category.
1. They're either just hosting a vanilla Postgres instance that's tied to the deployed hardware config, or doing something opaque with edge deploys and sharding. In the latter case they all but guarantee your DB will stay highly available, but costs can quickly spiral out of control. ↩
2. Maybe up to billions at this point. ↩
3. Even on otherwise absolutely snail-speed hardware. ↩
4. This was Jeff Bezos's favorite phrase during the early AWS days, and it stuck. ↩
5. Similar options include OVH, Hetzner dedicated instances, or even bare metal from providers like Equinix. ↩
6. AWS RDS and S3 have had several major outages over the years. The most memorable was the 2017 US-East-1 outage that took down half the internet. ↩