Infrastructure

Self-hosting Matrix in 2026: what changed, what's easier, what's still hard


We've been deploying Matrix homeservers for clients since 2024. The ecosystem has changed meaningfully in the past 18 months — some things are dramatically easier, some persistent pain points remain. Here's an honest assessment.

What got better

Dendrite is now viable for production. For homeservers under ~500 users, Dendrite is our default recommendation. It's dramatically lighter than Synapse, federates correctly, and the performance characteristics are predictable. We've run it in production for 8+ months without issues.

Element X is genuinely good. The rewrite of Element on the Matrix Rust SDK is a substantial improvement. Message threading works, the notification UX is significantly better, and it no longer feels like a prototype. We now recommend it for all new deployments.

Mautrix bridges are more stable. The WhatsApp and Telegram bridges in particular have improved significantly. We used to budget for 30% bridge outage time; in recent deployments we're seeing 98%+ uptime.

What's still hard

Federation is still operationally complex. Getting federation right requires careful DNS configuration, valid TLS on the federation port, and understanding the federation state resolution algorithm. We still see clients who've been "running Matrix for years" with broken federation they didn't know about.
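A quick federation sanity check looks like this — assuming a server name of example.com delegating to a homeserver host at matrix.example.com (both placeholders for your own domains):

```shell
# Check well-known delegation (tells other servers where to send federation traffic)
curl -s https://example.com/.well-known/matrix/server
# A healthy response looks like: {"m.server": "matrix.example.com:443"}

# Check that the federation port presents a valid TLS certificate
openssl s_client -connect matrix.example.com:8448 \
  -servername matrix.example.com </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -dates
```

If the well-known response is missing and you're not serving federation on port 8448 directly, other servers simply won't reach you — which is exactly the silent breakage described above.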

Media storage is a long-term operational burden. Matrix's approach to media — where your server permanently stores any media from federated rooms — creates storage growth that's hard to bound. Plan your S3 lifecycle policies carefully from day one.
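Synapse itself can bound remote media growth via its media_retention setting; a sketch for homeserver.yaml (the lifetimes below are illustrative, not recommendations):

```yaml
# homeserver.yaml — purge media that hasn't been accessed within the lifetime
media_retention:
  local_media_lifetime: 90d    # media uploaded by your own users
  remote_media_lifetime: 14d   # media cached from federated servers
```

Remote media can be re-fetched from the origin server if a user opens it again, so a short remote lifetime is usually the safer of the two knobs.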

Database maintenance needs attention. Synapse in particular accumulates state that needs periodic pruning. We run automated state purging jobs and media cleanup as part of all Studio deployments.
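The pruning itself goes through the Synapse admin API; a sketch of the two calls we script (the access token, room ID, and timestamps are placeholders):

```shell
# Purge historical events in a room up to a timestamp (milliseconds since epoch).
# ACCESS_TOKEN must belong to a server admin; the room ID is URL-encoded.
curl -X POST "https://matrix.example.com/_synapse/admin/v1/purge_history/%21roomid%3Aexample.com" \
  -H "Authorization: Bearer ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"delete_local_events": false, "purge_up_to_ts": 1700000000000}'

# Purge cached remote media older than a timestamp
curl -X POST "https://matrix.example.com/_synapse/admin/v1/purge_media_cache?before_ts=1700000000000" \
  -H "Authorization: Bearer ACCESS_TOKEN"
```

Note that delete_local_events: false keeps events your own users sent; flip it only if you genuinely want local history gone.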

Our current stack

Homeserver      Dendrite (<500 users) / Synapse (larger)
Client          Element X + Element Web
Bridges         Mautrix (WhatsApp, Telegram, Signal, Slack)
Media storage   MinIO (S3-compatible, self-hosted)
Auth            Keycloak via OIDC / LDAP bridge
Deployment      Kubernetes + Argo CD

Self-hosting Matrix is absolutely worth it for organizations that need communications sovereignty. The operational burden has decreased significantly. With the right setup and automated maintenance, expect 2–4 hours/month of operational attention after the initial deployment settles.

Performance sizing: what hardware do you actually need?

The most common question we get from people considering self-hosting Matrix is: "what server do I need?" The honest answer is: much less than you think, but the wrong choice in one dimension will cause you pain disproportionate to its cost.

For 50 users with moderate traffic (meetings, team channels, some file sharing), a VPS with 4 vCPUs and 8GB RAM is comfortable. The bottleneck is almost never CPU — it's disk I/O and, for federated servers, the state resolution algorithm. Synapse's state resolution can become genuinely expensive when you join large federated rooms (Matrix.org's main rooms have millions of room state events). We've seen 2-core VPSes spend 80% of CPU time just doing state resolution for a single large public room join.

Our recommendation for production: dedicated storage (avoid NVMe-less VMs), PostgreSQL on the same machine or a dedicated DB node, and strict room member limits on public-facing servers. For purely private internal deployment with no federation to the public Matrix network, a Raspberry Pi 5 handles 20–30 concurrent users without complaint.

SIZING REFERENCE — SYNAPSE

USERS      CPU        RAM       STORAGE
1–20       1–2 vCPU   2–4 GB    20 GB SSD
20–100     2–4 vCPU   4–8 GB    100 GB SSD
100–500    4–8 vCPU   8–16 GB   500 GB NVMe
500+       Consider worker mode + a dedicated DB node

Federation hardening: the whitelist approach

The default Synapse configuration allows federation with any Matrix server on the public internet. This is intentional — Matrix is designed as an open federated network, analogous to email. But for enterprise or sensitive deployments, open federation is a risk surface: you're accepting room events and state from servers you don't control, operated by people you don't know.

The two sensible options are:

  • Federation whitelist: explicitly list the servers you'll federate with in federation_domain_whitelist. Good for organisations where all users are on your server and a small set of known partner servers.
  • Federation disabled: set federation_domain_whitelist: [] (empty list). Your server operates as a purely private instance — no external Matrix users can interact with it at all.
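In homeserver.yaml, the whitelist variant looks like this (server names are examples):

```yaml
# homeserver.yaml — only federate with explicitly listed servers
federation_domain_whitelist:
  - example.com
  - partner-org.example
```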

We recommend the whitelist approach for most enterprise deployments. Completely disabling federation eliminates the ability to communicate with external Matrix users, which becomes inconvenient as Matrix adoption grows. A carefully maintained whitelist gives you control without total isolation.

The backup strategy you actually need

The most common failure mode we see with self-hosted Matrix servers isn't hardware failure — it's disk-full events followed by database corruption. Synapse is not forgiving when PostgreSQL runs out of disk mid-write. We have recovered enough corrupted Matrix databases to have strong opinions about backups.

The baseline: automated PostgreSQL dumps every 4 hours to a separate volume (not the same physical disk), with nightly replication to off-site storage (S3 or equivalent). Test your restoration procedure before you need it — we recommend scheduling a monthly restore drill. The 47Network Studio deployment includes ZFS snapshots of the entire data volume as an additional recovery layer, which allows point-in-time restoration even for mid-write corruption scenarios.
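A minimal crontab sketch of that baseline — the paths, database name, and bucket are all placeholders for your own environment:

```shell
# crontab fragment (all paths and names illustrative)
# Dump Synapse's PostgreSQL DB every 4 hours, custom format, to a separate volume
0 */4 * * * pg_dump -Fc -U synapse synapse > /mnt/backup/synapse-$(date +\%F-\%H).dump
# Nightly off-site replication of the backup volume
15 3 * * * aws s3 sync /mnt/backup s3://example-matrix-backups/db/
```

The custom format (-Fc) dumps restore with pg_restore and support parallel restoration, which matters once the database is tens of gigabytes.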

One thing frequently overlooked: the Synapse media store. Uploaded files, avatars, and voice messages are stored outside the database in a flat file hierarchy. This directory must be backed up separately and in coordination with the database — a database backup without the corresponding media store backup will result in broken media references on restore.
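One way to keep the two in lockstep is to capture them in the same job; a sketch, with the paths below as assumptions about a typical install:

```shell
#!/bin/sh
# Back up the database and media store together so media references in the
# DB dump match the files captured on disk (paths are examples).
set -eu
STAMP=$(date +%F-%H%M)
pg_dump -Fc -U synapse synapse > "/mnt/backup/synapse-$STAMP.dump"
rsync -a /var/lib/synapse/media_store/ "/mnt/backup/media_store-$STAMP/"
```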

If you're running the mautrix bridge stack, each bridge has its own SQLite or PostgreSQL database that also needs backing up. We've seen bridge databases grow to 20GB+ for active WhatsApp bridges with long message history. Include them in your backup scope from day one.

