A two-node Proxmox cluster, VMs, Docker services, and a mesh of devices connected through Twingate.
This is a two-node Proxmox VE cluster running on commodity mini-PCs. book5 is the primary hypervisor: it hosts the Twingate connector, a Pi-hole LXC, and a transcript-processing LXC, and it acts as the SSH jump host for every roaming device. tower is the heavier node, with an Nvidia M4000 passed through to a media VM and its kernel pinned to the known-good 6.17.4-2-pve after a regression in 6.17.13 caused silent hangs every 1–3 days.
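A pin can quietly stop applying, for example after a package upgrade, so it can help to compare the running kernel release against the pinned one. The sketch below is a minimal check, assuming it runs on tower and that the running release string has the same form as the version quoted above. The function name kernel_is_pinned is hypothetical and not part of the lab's tooling.

```python
import platform
from typing import Optional

# Pinned kernel release quoted in the text above.
PINNED_KERNEL = "6.17.4-2-pve"


def kernel_is_pinned(release: Optional[str] = None) -> bool:
    """Return True if the given (or currently running) kernel release matches the pin."""
    current = release if release is not None else platform.release()
    return current == PINNED_KERNEL


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    status = "OK" if kernel_is_pinned() else "WARNING: not on the pinned kernel"
    print(f"{platform.release()}: {status}")
```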
VM 100 (Omarchy) is an Arch desktop used for headless GPU experiments. VM 101 (Ubuntu) is the media + AI workhorse — it runs Plex (32400), Jellyfin (8096), Ollama (11434) for local LLM inference, Frigate (5000) for two Tapo cameras on the 192.168.68.0/22 subnet, qBittorrent (8080), and a local ALPR service on 8088. A separate Pi-hole LXC (192.168.68.248) handles DNS for the lab, overriding the Twingate resolver because the Twingate DNS started failing while the client still reported online.
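The port list above lends itself to a quick reachability check. The following is a rough sketch, not the lab's actual monitoring: it only tests whether each TCP port accepts a connection, and the address of VM 101 is left as a parameter because it is not given here. The names SERVICES, port_open, and check_services are illustrative.

```python
import socket

# Service ports as listed above for VM 101.
SERVICES = {
    "Plex": 32400,
    "Jellyfin": 8096,
    "Ollama": 11434,
    "Frigate": 5000,
    "qBittorrent": 8080,
    "ALPR": 8088,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str, timeout: float = 1.0) -> dict[str, bool]:
    """Map each service name to whether its port accepted a connection."""
    return {name: port_open(host, port, timeout) for name, port in SERVICES.items()}
```

Calling check_services with the VM's address returns a dictionary such as {"Plex": True, ...}. An open port only shows that something is listening, not that the service behind it is healthy.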
Everything sits on a single 192.168.68.0/22 subnet after a router migration in early 2026 collapsed the old dual-subnet setup. Both Proxmox nodes run 2.5 GbE primary interfaces on vmbr1. Tailscale provides MagicDNS across the whole tailnet — laptops, phones, the Pi1 mirror — and Twingate handles outside-network access into the lab. The two overlay networks intentionally overlap; when one breaks, the other usually still works.
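For a sense of what a /22 covers, the short example below uses Python's standard ipaddress module to work out the range of 192.168.68.0/22 and to confirm that the Pi-hole address mentioned earlier falls inside it. The helper name in_lab is made up for illustration.

```python
import ipaddress

# The single lab subnet and the Pi-hole address given in the text.
LAB_NET = ipaddress.ip_network("192.168.68.0/22")
PIHOLE = "192.168.68.248"


def in_lab(addr: str) -> bool:
    """Return True if addr belongs to the lab subnet."""
    return ipaddress.ip_address(addr) in LAB_NET


if __name__ == "__main__":
    print(LAB_NET.num_addresses)                           # 1024
    print(LAB_NET.network_address, LAB_NET.broadcast_address)  # 192.168.68.0 192.168.71.255
    print(in_lab(PIHOLE))                                  # True
```

A /22 spans four consecutive /24 blocks, here 192.168.68.x through 192.168.71.x.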
A Raspberry Pi (Pi1) sits at a separate physical site running DietPi-Bookworm. It mirrors 15 git repos weekly from GitHub and Gitea, driven by a push-all skill from the workstation. The Pi has no Wi-Fi; it gets internet through Internet Connection Sharing (ICS) from a host PC, but Tailscale reachability is host-independent, so it stays online on the tailnet even when the host is off. A small 3.5" SPI TFT renders a live status dashboard (pi1-hub, Python + rich).
tower forwards its full syslog to book5 over UDP/514 via rsyslog. After the kernel regression caused silent freezes, the box was instrumented with softlockup_panic=1, hardlockup_panic=1, and a 30-second auto-reboot on panic, so a hang now becomes a loud panic followed by automatic recovery instead of an outage that needs a remote reboot. pstore captures panic dumps across the reboot. book5 also runs a network-health watchdog that probes internet, peers, and DNS separately and only escalates to a NetworkManager restart on true isolation.
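The watchdog's escalation rule can be sketched as a small decision function. This is an illustration under one stated assumption, namely that "true isolation" means all three probes failed at the same time. The real watchdog may use different criteria, and every name below is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    OK = "ok"
    LOG_ONLY = "log_only"
    RESTART_NETWORK = "restart_network"


@dataclass(frozen=True)
class ProbeResult:
    """Outcome of the three independent probes (True means the probe succeeded)."""
    internet: bool
    peers: bool
    dns: bool


def decide(result: ProbeResult) -> Action:
    """Escalate only when every probe fails; partial failures are just logged."""
    if result.internet and result.peers and result.dns:
        return Action.OK
    # Assumption: isolation = internet, peers, and DNS all unreachable at once.
    if not (result.internet or result.peers or result.dns):
        return Action.RESTART_NETWORK
    return Action.LOG_ONLY
```

Keeping the probes separate means a DNS-only failure (with internet and peers still reachable) yields LOG_ONLY rather than a restart, which avoids bouncing the network for a problem a restart would not fix.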