Proxmox Storage: Live Snapshot
Snapshot captured: 2026-06-24 (read-only SSH audit of prox / 192.168.6.71)
Host: prox — ASUS PN64, 1 TB NVMe
Context: Post–Wave A/B decommission; metrimon vzdump failed twice on local
Point-in-time snapshot
This document is a frozen audit of storage at capture time. It is not regenerated from inventory. After intentional changes (ISO prune, CIFS fix, guest resize, Synology cleanup), capture a new snapshot or update the compute live index with a dated successor file.
Collection method: ssh infra-services-cursor → patch-controller key →
root@192.168.6.71 — pvesm status, df, vgs/lvs, qm list, pct list,
pct config, /var/lib/vz sizing, mount write tests.
Related: Synology capacity ntfy · Compute disposition review · Prox storage remediation proposal · prox-2026-06-23.json
Post-snapshot update (2026-06-24)
Backup target replaced. Owner created NFS share infra-backups on Whrrr
volume6; agent configured Proxmox and validated end-to-end.
| Item | Value |
|---|---|
| Storage ID | infra-backups |
| Export | 192.168.6.215:/volume6/infra-backups |
| Mount | /mnt/pve/infra-backups |
| NFS rule | 192.168.6.71 Read/Write, squash Map root to admin |
| Quota (DSM) | 2 TB visible to PVE |
| Retention | prune-backups keep-last=3 |
| Retired | vm-backups CIFS on Prawns (pvesm remove) |
Validation: vzdump 120 --storage infra-backups succeeded (octoprint LXC,
479 MB archive on NFS).
Relocated (2026-06-24): ~2.7 GB legacy Wave A vzdump (103 unmanic, 117 caddy,
122 netboot) plus metrimon failure log and 2024 saltierpoop log moved from prox
local to infra-backups/dump/ (~3.1 GB total on NFS). ISO prune unchanged.
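The Retention row above sets keep-last=3. As a rough illustration of what that means for a single guest, the sketch below lists the archives that would fall outside the newest three. It is not how PVE implements pruning (PVE applies retention itself, per guest); the function name `prune_candidates` and the sample file names are hypothetical.

```shell
# List the archives a keep-last policy would drop for ONE guest.
# Reads archive names on stdin; vzdump names embed a sortable timestamp,
# so a plain sort orders them oldest to newest.
prune_candidates() {
  keep="$1"
  sort | awk -v keep="$keep" '
    { name[NR] = $0 }
    END { for (i = 1; i <= NR - keep; i++) print name[i] }'
}

# Illustrative names only (hypothetical dates for guest 120).
prune_candidates 3 <<'EOF'
vzdump-lxc-120-2026_06_24-10_00_00.tar.zst
vzdump-lxc-120-2026_06_27-10_00_00.tar.zst
vzdump-lxc-120-2026_06_25-10_00_00.tar.zst
vzdump-lxc-120-2026_06_26-10_00_00.tar.zst
EOF
```

With four archives and keep-last=3, only the oldest name (the 2026_06_24 one) is printed as a prune candidate.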
Executive summary
| Pressure | Severity | Headline |
|---|---|---|
| pve-root / vzdump scratch | Critical | 24 GB free on 96 GB root; large vzdump to local needs more scratch than available |
| vm-backups CIFS | Critical | Mounted but not writable; PVE marks storage inactive |
| Synology Prawns ~99% | High | ~130 GB free on 30 TB; caps NFS + backup offload |
| local-lvm thin pool | Moderate | 67.5% used (~562 GB / 794 GB); room after decom waves |
| LXC 114 nfs-monitoring | High (guest) | 91% full inside (96 GB / 111 GB) |
Two separate pools matter: local (pve-root) holds ISOs and vzdump files;
local-lvm holds guest disks. Freeing thin-pool space does not fix root
pressure during backup.
Physical and logical layout
flowchart TB
subgraph nvme["NVMe ~930 GB (VG pve)"]
root["pve-root 96 GB<br/>local — ISO, vzdump, vztmpl<br/>66 GB used · 24 GB free"]
swap["swap 8 GB"]
thin["local-lvm thin pool 794 GB<br/>67.5% data used<br/>~562 GB used · ~271 GB free"]
vgfree["VG unallocated ~8 GB"]
end
subgraph nas["Synology Whrrr 192.168.6.215 — Prawns ~30 TB"]
prawns["99.6% full · ~130 GB free"]
cifs["CIFS //Prawns/backups/proxbox<br/>vm-backups — WRITE FAIL"]
nfs["NFS /volume9/Prawns<br/>synorpn — LXC bind mounts"]
end
root -->|"vzdump default when CIFS broken"| dump["/var/lib/vz/dump 2.7 GB"]
root --> iso["/var/lib/vz/template/iso 12 GB"]
root --> cache["/var/lib/vz/template/cache 2.3 GB"]
thin --> guests["12 guests — see §4"]
nfs --> lxc114["LXC 114 mp0/mp1"]
cifs -.->|"Permission denied"| root
pve-root breakdown
| Mount / path | Size | Role |
|---|---|---|
| / (pve-root) | 96 GB total | Proxmox OS + local storage |
| /var/lib/vz/template/iso | 12 GB | VM install ISOs |
| /var/lib/vz/template/cache | 2.3 GB | LXC templates |
| /var/lib/vz/dump | 2.7 GB | vzdump archives (local backup target) |
| Other /var, /usr, … | ~49 GB | packages, logs, PVE state |
Proxmox storage targets
| Storage | Type | PVE status | Total | Used | Avail | Content | Notes |
|---|---|---|---|---|---|---|---|
| local | dir | active | 94 GB | 65 GB | 23 GB | iso, backup, vztmpl | Backs onto pve-root |
| local-lvm | lvmthin | active | 795 GB | 537 GB | 264 GB | images, rootdir | data thin pool |
| synorpn | nfs | active | 30 TB | 29.9 TB | ~130 GB | rootdir | 192.168.6.215:/volume9/Prawns |
| vm-backups | cifs | inactive | 30 TB | 29.9 TB | ~130 GB | backup | touch → Permission denied |
storage.cfg excerpt (paths only):
- local: /var/lib/vz
- local-lvm: thin pool data, VG pve
- synorpn: export /volume9/Prawns → /mnt/pve/synorpn
- vm-backups: //192.168.6.215/Prawns, subdir /backups/proxbox, user proxbox
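The vm-backups row above was marked inactive after a touch on the mount returned Permission denied. A minimal sketch of that kind of write probe follows; the function name `probe_writable` and the scratch file name are hypothetical, and the example runs against a temporary directory rather than a real mount.

```shell
# Probe whether a directory accepts writes by creating and removing a scratch file.
# Prints "writable" or "read-only-or-denied"; returns 0 only when the write succeeds.
probe_writable() {
  dir="$1"
  probe="$dir/.pve-write-probe.$$"   # hypothetical scratch file name
  # The subshell keeps a failed redirection from aborting a strict POSIX shell.
  if ( : > "$probe" ) 2>/dev/null; then
    rm -f "$probe"
    echo "writable"
    return 0
  fi
  echo "read-only-or-denied"
  return 1
}

# Example: probe a throwaway directory. On prox the argument would be a
# mount point such as /mnt/pve/infra-backups.
scratch="$(mktemp -d)"
probe_writable "$scratch"
rm -rf "$scratch"
```

A mount can show up in df and still fail this probe, which matches the "mounted but not writable" state recorded for the CIFS target.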
Why vzdump to local failed (metrimon VM 106)
flowchart TD
A["vzdump requested"] --> B{"--storage?"}
B -->|"vm-backups"| C["CIFS: Permission denied"]
B -->|"local (default/fallback)"| D["Stream to /var/lib/vz/dump on pve-root"]
C --> D
D --> E{"Free space on 96 GB root?"}
E -->|"Large VM e.g. 96 GB disk"| F["Fails ~77% — broken pipe / disk full"]
E -->|"Small LXC"| G["Succeeds — e.g. 240 MB–1.5 GB archives"]
Metrimon (96 GB provisioned) failed twice at ~77% of the read, with ~24 GB free on root. The compressed archive would have been smaller than the provisioned disk, but the writer still needs substantial temporary headroom on the target while the job runs.
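The failure mode above suggests a pre-flight check before dumping to local. The sketch below uses a deliberately conservative rule, which is an assumption and not a PVE requirement: skip the dump when free space on the target is below the guest's provisioned disk size. The function names `has_headroom` and `avail_gb_for` are hypothetical.

```shell
# Conservative pre-flight: refuse a dump when the target has less free space
# than the guest's provisioned disk. This threshold is an assumption, not a PVE rule.
has_headroom() {
  avail_gb="$1"
  provisioned_gb="$2"
  [ "$avail_gb" -ge "$provisioned_gb" ]
}

# Free space in whole GB on the filesystem holding a path (POSIX df, 1K blocks).
avail_gb_for() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Figures from this snapshot: ~24 GB free on root, metrimon provisioned at 96 GB.
if has_headroom 24 96; then
  echo "ok to dump to local"
else
  echo "skip: not enough headroom on local"
fi
```

On the host, the first argument could come from `avail_gb_for /var/lib/vz/dump`. The rule is stricter than necessary for well-compressing guests, but it would have rejected both metrimon attempts up front while still allowing the small LXC dumps.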
ISO and template inventory (local)
ISOs: 12 GB total
| File | Size | Candidate to remove? |
|---|---|---|
| ubuntu-22.04.2-desktop-amd64.iso | 4.6 GB | Yes (desktop; saltierpoop uses server ISO) |
| ubuntu-24.04.1-live-server-amd64.iso | 2.6 GB | Keep one server ISO |
| ubuntu-22.04.4-live-server-amd64.iso | 2.0 GB | Dedupe vs 24.04 |
| ubuntu-22.04.2-live-server-amd64.iso | 1.9 GB | Dedupe vs 24.04 |
| alpine-standard-3.21.0-x86_64.iso | 241 MB | Keep if needed for tiny installs |
Quick win: remove redundant Ubuntu ISOs → ~10 GB back on root.
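To sanity-check the quick-win figure, the sketch below sums the table sizes of the three ISOs flagged for removal or dedupe. It assumes the 24.04.1 server ISO and the Alpine ISO are the ones kept, which is the owner's call; the function name `reclaimable_gb` is hypothetical.

```shell
# Sum the sizes (GB) of ISOs flagged as removal candidates in the table above.
# Input: one "name size_gb" pair per line on stdin.
reclaimable_gb() {
  awk 'BEGIN { total = 0 } { total += $2 } END { printf "%.1f\n", total }'
}

reclaimable_gb <<'EOF'
ubuntu-22.04.2-desktop-amd64.iso 4.6
ubuntu-22.04.4-live-server-amd64.iso 2.0
ubuntu-22.04.2-live-server-amd64.iso 1.9
EOF
```

By the listed sizes this prints 8.5, somewhat under the ~10 GB headline, so the headline is best read as an upper estimate unless more ISOs are removed.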
CT templates: 2.3 GB
Largest: noble-server-cloudimg-amd64.img (601 MB), TurnKey/Debian/Ubuntu tarballs.
Prune unused templates after confirming no planned LXC creates.
Surviving vzdump on local: 2.7 GB
| Guest (retired) | Archive | Size |
|---|---|---|
| caddy (117) | vzdump-lxc-117-2026_06_23-18_47_55.tar.zst | 1.5 GB |
| unmanic (103) | vzdump-lxc-103-2026_06_23-18_37_39.tar.zst | 976 MB |
| netboot.xyz (122) | vzdump-lxc-122-2026_06_23-18_32_33.tar.zst | 240 MB |
Note (2026-06-24): Three other Wave A dumps (k6, aiproject, dnsproject) were removed from prox during a failed metrimon backup retry — see journal incident. Paths in inventory mark those artifacts as deleted.
Thin pool (local-lvm): provisioned vs actual
xychart-beta
title "Top guests by ~actual thin usage (GB, estimated)"
x-axis ["100 saltierpoop", "114 nfs-mon", "119 harbor", "110 sqlserver", "115 ollama", "200 haos", "123 infra", "109 graylog", "111 influx", "113 mysql", "116 pulse"]
y-axis "GB (approx)" 0 --> 220
bar [207, 110, 70, 40, 29, 30, 19, 14, 7, 2, 4]
Estimates: provisioned_GB × (LVM data_percent / 100), using lvs output at capture time.
| LV | Guest | Prov. | Data % | ~Actual |
|---|---|---|---|---|
| vm-100-disk-0 | saltierpoop | 260 GB | 79.7% | ~207 GB |
| vm-114-disk-0 | nfs-monitoring | 112 GB | 98.7% | ~110 GB |
| vm-119-disk-0 | harbor-registry | 80 GB | 87.8% | ~70 GB |
| vm-110-disk-0 | sqlserver2022 | 60 GB | 66.7% | ~40 GB |
| vm-115-disk-0 | ollama | 35 GB | 82.7% | ~29 GB |
| vm-200-disk-1 | haos | 32 GB | 95.3% | ~30 GB |
| vm-123-disk-1 | infra-services | 30 GB | 65.0% | ~19 GB |
| vm-109-disk-0 | graylog | 30 GB | 48.2% | ~14 GB |
| vm-111-disk-0 | influxdb | 8 GB | 86.7% | ~7 GB |
| vm-113-disk-0 | mysql | 8 GB | 25.2% | ~2 GB |
| vm-116-disk-0 | pulse | 4 GB | 99.5%* | ~4 GB |
* Pulse's thin Data % reads high, but in-guest df showed 59% used (2.2 GB / 3.9 GB), which suggests blocks freed inside the guest have not been discarded back to the pool.
Pool totals: 794 GB thin, 67.5% data used, 2.22% metadata — ~271 GB thin free for growth.
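The ~Actual column above follows the stated estimate: provisioned size times the lvs Data % divided by 100. A minimal sketch of that arithmetic, using three rows from the table (the function name `thin_actual_gb` is hypothetical):

```shell
# Estimate actual thin usage in GB: provisioned size times lvs Data% / 100.
thin_actual_gb() {
  awk -v prov="$1" -v pct="$2" 'BEGIN { printf "%.1f\n", prov * pct / 100 }'
}

thin_actual_gb 260 79.7   # vm-100-disk-0, saltierpoop    -> 207.2
thin_actual_gb 112 98.7   # vm-114-disk-0, nfs-monitoring -> 110.5
thin_actual_gb 4 99.5     # vm-116-disk-0, pulse          -> 4.0
```

On a live host the inputs could be taken from something like `lvs --noheadings --units g --nosuffix -o lv_name,lv_size,data_percent pve`. The result measures blocks allocated in the pool, not in-guest usage, so it can overstate what the guest's df reports.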
Guests at snapshot (12 on prox)
Post–Wave A/B decommission. Destroyed since prior baseline: 102, 103, 105, 106, 107, 117, 118, 121, 122.
VMs
| VMID | Name | State | RAM | Disk prov. | Storage | Notes |
|---|---|---|---|---|---|---|
| 100 | saltierpoop | running | 30 GB | 260 GB | local-lvm | Largest consumer; virtio-scsi, discard=on |
| 123 | infra-services | running | 8 GB | 30 GB | local-lvm | Homelab control plane |
| 200 | haos | running | 6 GB | 32 GB | local-lvm | Thin 95%; review HA retention |
LXCs
| VMID | Name | State | RAM | Disk prov. | In-guest / | Mounts | Notes |
|---|---|---|---|---|---|---|---|
| 109 | graylog | stopped | 8 GB | 30 GB | — | — | Pattern E revive candidate |
| 110 | sqlserver2022 | stopped | 23 GB | 60 GB | — | — | Owner keep |
| 111 | influxdb | running | 2 GB | 8 GB | — | — | Retire after Influx cutover |
| 113 | mysql | stopped | 1 GB | 8 GB | — | — | Owner keep |
| 114 | nfs-monitoring | running | 8 GB | 112 GB | 96G/111G (91%) | synorpn, prawns NFS | ES + Docker; hottest guest |
| 115 | ollama | stopped | 10 GB | 35 GB | — | — | Phase 9 keep |
| 116 | pulse | running | 1 GB | 4 GB | 2.2G/3.9G | — | Pulse Monitoring Server |
| 119 | harbor-registry | running | 4 GB | 80 GB | 45G/79G (61%) | — | Harbor v2.14 stack |
| 120 | octoprint | stopped | 1 GB | 4 GB | — | — | Owner keep |
LXC 114 mount detail (capture)
rootfs: local-lvm:vm-114-disk-0,size=112G
mp0: /mnt/pve/synorpn,mp=/mnt/synorpn
mp1: /mnt/prawns,mp=/mnt/nfs-prawns
nameserver: 192.168.6.17 (fixed 2026-06-24; was Tailscale DNS)
Inside 114 at capture: /mnt/synorpn on Whrrr NFS 100% (30T volume); rootfs
91% full (Elasticsearch + Docker stack).
Relief options (prioritized)
flowchart LR
subgraph immediate["Immediate — root / backups"]
A1["Fix vm-backups CIFS ACLs"]
A2["Prune ISOs ~10 GB"]
A3["Relocate vzdump to infra-backups ✓"]
end
subgraph upstream["Upstream — NAS"]
B1["Synology Prawns cleanup"]
B2["Expand or tier cold data"]
end
subgraph guests["Guests — thin pool"]
C1["114 ES/Docker cleanup"]
C2["100 saltierpoop retention"]
C3["200 HAOS history"]
end
A1 --> B1
A3 --> B1
B1 --> C1
| Priority | Action | Impact | Owner gate |
|---|---|---|---|
| 1 | Fix vm-backups write (Synology creds / ACL / PVE storage test) | Unblocks all future vzdump off root | CIFS password / DSM share ACL |
| 2 | Prune redundant ISOs on prox | ~10 GB on pve-root | Confirm which ISOs to keep |
| 3 | Synology Prawns capacity | NFS + backup headroom | capacity runbook |
| 4 | Move surviving vzdump to NAS (after #1) | ~2.7 GB on root | Approve destination path |
| 5 | 114 in-guest cleanup (ES indices, Docker) | Guest + thin pressure | Ops — no destroy |
| 6 | Set vm-backups as default backup storage in PVE | Prevents repeat | After #1 verified |
Regenerate
No automated script yet. To refresh this snapshot:
# From operator workstation (Cursor: infra-services-cursor → prox)
ssh infra-services-cursor "sudo ssh -i /etc/homelab/patch-controller/id_ed25519 root@192.168.6.71 \
'pvesm status; df -hT; vgs; lvs -o+data_percent; du -sh /var/lib/vz/*'"
Copy the output into a new dated file, prox-storage-YYYY-MM-DD.md, and link it from the
compute live index.
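The manual steps above could be wrapped in a small helper. This is a sketch only: the function names `snapshot_filename` and `capture_snapshot` are hypothetical, the wrapper saves raw command output rather than a finished Markdown page, and it is defined but not invoked here because it needs the SSH path described in this section.

```shell
# Build the dated snapshot filename used by this series (prox-storage-YYYY-MM-DD.md).
# Uses today's date when no argument is given.
snapshot_filename() {
  day="${1:-$(date +%Y-%m-%d)}"
  echo "prox-storage-${day}.md"
}

# Hypothetical wrapper: run the audit command above and save raw output
# to the dated file. Reports the path only when the SSH command succeeds.
capture_snapshot() {
  out="$(snapshot_filename "$1")"
  ssh infra-services-cursor "sudo ssh -i /etc/homelab/patch-controller/id_ed25519 root@192.168.6.71 \
    'pvesm status; df -hT; vgs; lvs -o+data_percent; du -sh /var/lib/vz/*'" > "$out" \
    && echo "wrote $out"
}

snapshot_filename 2026-06-24
```

The last line prints prox-storage-2026-06-24.md. Calling `capture_snapshot` with no argument would write to a file named for the current date.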
Changelog
| Date | Change |
|---|---|
| 2026-06-24 | Initial snapshot after consolidation Wave A/B and metrimon vzdump failure |
| 2026-06-24 | infra-backups NFS on volume6 live; vm-backups CIFS removed; test vzdump 120 OK |
| 2026-06-24 | Legacy ~2.7 GB vzdump relocated from prox local to infra-backups |