Phase 7 Owner Actions: Network, ACLs, and Security Tightening

Everything you need to do for Phase 7. Tasks are ordered by dependency — do them top-to-bottom. Each section tells you where to go, what to click, and how to verify.

Progress

#	Task	SEC	Status
1	Zone-Based Firewall policies	SEC-002	Done (2026-05-14)
2	Delete DSM port forwards	SEC-003	Done (2026-05-15)
3	WiFi SSID-to-VLAN mapping	SEC-005	Done (2026-05-15)
4	Confirm DSM via Tailscale	SEC-003	Done (2026-06-19 — off-LAN phone test)
5-pre	Generate + encrypt three Tailscale keys (Ansible + `secrets/tailscale/*`)	SEC-007	Done
5a-c	Deploy Tailscale (Ansible-managed hosts)	SEC-007	Done (2026-06-19 — infra-services, prox, saltierpoop)
5d-g	Deploy Tailscale (manual hosts)	SEC-007	Partial — whrrr done; ubuncap/recordurbate Ansible `tag:server`; haos optional
6	Create Tailscale API key + GitHub secrets	SEC-007	Done (2026-06-19 — run 27811413868)
7	Deploy AdGuard + Unbound	—	Done (2026-06-18)
8	Cut over DNS to AdGuard	—	Done (2026-06-18 — UDM WAN + DHCP → `192.168.6.17`)
9	Decommission PiHole (LXC 104)	—	Done (2026-06-17 — stopped, verified, `pct destroy 104 --purge`)

4. Confirm DSM Accessible via Tailscale (SEC-003)

Why: DSM port forwards are now deleted. The only way to reach DSM remotely is via Tailscale. Verify this works before moving on.

Steps

Connect to Tailscale on your phone or laptop (off home WiFi — cellular or remote network)
Open https://100.71.93.130:5001 in a browser
DSM login should appear

If it doesn't work

Check that the whrrr node is online: run tailscale status on any connected device and look for whrrr
If whrrr is offline, SSH to the Synology and run sudo tailscale up
If the Tailscale package isn't installed on DSM, see Section 5 below
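The whrrr check above can be scripted. The following is a minimal sketch, not part of the repo: `peer_state` is a hypothetical helper, and it assumes the plain `tailscale status` layout in which the second column is the hostname and an unreachable peer carries the word "offline" on its line.

```shell
# peer_state NAME: read plain `tailscale status` output on stdin and print
# "online", "offline", or "missing" for the peer whose hostname is NAME.
# Assumption: column 2 is the hostname; offline peers contain "offline".
peer_state() {
  awk -v name="$1" '
    $2 == name { found = 1; if ($0 ~ /offline/) off = 1 }
    END {
      if (!found) print "missing"
      else if (off) print "offline"
      else print "online"
    }'
}
```

Usage from any connected device: tailscale status | peer_state whrrr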

5. Deploy Tailscale to All Hosts

Read this scope carefully — easy to over-read. Phase 7 does not mean “every endpoint, URL, or service in the lab magically runs through Tailscale.” It means: (1) the seven machines listed below each get a Tailscale client (or you deliberately skip one), and (2) optional LAN reachability into 192.168.6.0/24 only via subnet routing from infra-services, which is extra operator steps (approve routes, ACLs, --accept-routes on clients — see below). Other VLANs, WAN-only names (Cloudflare), and random LXCs are out of scope unless you add routes or install Tailscale there too.

Tailscale deployment is split into three tracks:

Ansible-managed hosts (infra-services, prox, saltierpoop): Fully automated via the tailscale Ansible role. Tags, routes, IP forwarding, and UFW rules are all driven from inventory YAML.
Non-Ansible hosts (whrrr DSM, haos): Manual install. Synology uses a DSM package; HAOS uses a first-party add-on.
Whrrr VMM guests (ubuncap, recordurbate): Ansible-managed like other Linux hosts — tag:server via the tailscale role (not tag:customer-app).

Deployment checklist

#	Host	Type	Managed by	Tag	Status
5a	infra-services	VM (Ubuntu)	Ansible	`tag:server`	Done — subnet router `192.168.6.0/24`
5b	prox	Bare metal (Debian/PVE)	Ansible	`tag:server`	Done — `proxbox-cube` (`100.97.134.65`); root-only bootstrap
5c	saltierpoop	VM (Ubuntu)	Ansible	`tag:server`	Done
5d	whrrr	Synology RS2421+	Manual	`tag:nas`	Done — DSM package
5e	haos	VM (HAOS)	Manual	`tag:server`	Optional — not deployed
5f	recordurbate	VM (Linux)	Ansible	`tag:server`	Done — Whrrr VMM guest
5g	ubuncap	VM (Linux)	Ansible	`tag:server`	Done — Whrrr VMM guest

Subnet routing (optional path to some LAN targets, not “all endpoints”): When infra-services is up as a subnet router, it can advertise 192.168.6.0/24 so tailnet clients may reach IP addresses on that VLAN (LXCs, bare services) without installing Tailscale on each box. That path only works after you approve the route in the admin console, ensure ACLs allow the traffic, and turn on subnet route acceptance on every client device you use (see “Post-deployment: accept routes on your devices” later). It does not give you Tailscale HTTPS names for every internal service, does not cover other subnets/VLANs, and does not replace the manual Tailscale installs on the two non-Ansible hosts above (whrrr and haos).

5-pre. Tailscale auth keys (three-key split)

Use separate reusable keys so the automation key never carries tag:nas or tag:customer-app. Full operator checklist (when to create each key, encrypt, commit, decrypt one-liners): secrets/tailscale/README.md.

5-pre-a. Ansible key (`tag:server` only)

The tailscale Ansible role reads infra/ansible/inventory/group_vars/tailscale/tailscale.sops.yaml. The key must be pre-authorized for tag:server only. It covers infra-services, prox, and saltierpoop, plus the Whrrr VMM guests recordurbate and ubuncap (see §5f–5g).

Go to https://login.tailscale.com/admin/settings/keys
Click Generate auth key
Set: Reusable = yes, Pre-authorized = yes
Under Tags, add only tag:server
Copy the key (starts with tskey-auth-...)

On infra-services, encrypt it into the Ansible group vars:

ssh someone@192.168.6.17
cd /opt/homelab

# Write plaintext at the target path (SOPS matches creation rules by path)
cat > infra/ansible/inventory/group_vars/tailscale/tailscale.sops.yaml <<'EOF'
---
tailscale_auth_key: "tskey-auth-PASTE_YOUR_KEY_HERE"
EOF

# Encrypt in-place
SOPS_AGE_KEY_FILE=/etc/homelab/age-key.txt \
  sops -e -i infra/ansible/inventory/group_vars/tailscale/tailscale.sops.yaml

# Commit and push
git add infra/ansible/inventory/group_vars/tailscale/tailscale.sops.yaml
git commit -m "chore: add SOPS-encrypted Tailscale auth key"
git push
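Before the commit step above, it can help to confirm that the file on disk really is ciphertext. A minimal sketch, assuming the usual SOPS YAML shape (values rewritten as `ENC[...]` and a top-level `sops:` metadata block); `looks_encrypted` is a hypothetical helper name, not something the repo provides.

```shell
# looks_encrypted FILE: succeed only if FILE looks like SOPS ciphertext.
looks_encrypted() {
  # A plaintext Tailscale auth key must never be present.
  if grep -q 'tskey-auth-' "$1"; then
    return 1
  fi
  # SOPS rewrites values as ENC[...] and appends a top-level "sops:" block.
  grep -q 'ENC\[' "$1" && grep -q '^sops:' "$1"
}
```

Usage: looks_encrypted infra/ansible/inventory/group_vars/tailscale/tailscale.sops.yaml && git add infra/ansible/inventory/group_vars/tailscale/tailscale.sops.yaml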

5-pre-b. NAS and customer-app keys (`tag:nas` / `tag:customer-app`)

Manual installs use secrets/tailscale/nas.sops.yaml and secrets/tailscale/customer-app.sops.yaml (see examples in that directory). Create one Tailscale key per file (each key pre-authorized for one tag family only), encrypt with SOPS, commit ciphertext, then use §5d / §5f–5g.

Do not store those keys in tailscale.sops.yaml and do not reuse the Ansible key on Synology or customer-app VMs.

5a–5c. Ansible-managed hosts (infra-services, prox, saltierpoop)

These three hosts are in the tailscale Ansible group. The role handles:

Installing Tailscale via official apt repo
Enabling IP forwarding on subnet routers (infra-services)
Running tailscale up with the correct tags and routes
Opening UFW port 41641/udp and allowing the tailscale0 interface
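For orientation, the list above corresponds roughly to the following manual commands. This is a dry-run sketch that only prints them; it is not the role's actual task list. `print_manual_steps` is a hypothetical helper, the auth key is deliberately left out, and on prox the UFW lines would not apply because that host uses pve-firewall.

```shell
# print_manual_steps TAGS [ROUTES]: print (do not run) the commands a manual
# setup would roughly need. ROUTES is optional; only subnet routers set it.
print_manual_steps() {
  tags="$1"
  routes="${2:-}"
  if [ -n "$routes" ]; then
    # Subnet routers must forward IPv4 packets between interfaces.
    echo "sysctl -w net.ipv4.ip_forward=1"
    echo "tailscale up --advertise-tags=$tags --advertise-routes=$routes"
  else
    echo "tailscale up --advertise-tags=$tags"
  fi
  echo "ufw allow 41641/udp"
  echo "ufw allow in on tailscale0"
}
```

Usage: print_manual_steps tag:server 192.168.6.0/24 shows the infra-services variant; print_manual_steps tag:server shows the saltierpoop variant.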

If ansible-pull dies on Authenticate Tailscale with auth key (Ansible hides output with no_log): on the host run sudo journalctl -u tailscaled -n 40 --no-pager. A common message is requested tags [tag:…] are invalid or not permitted — the key used on that host was not pre-authorized for the tag passed to tailscale up.

Ansible hosts: the key in tailscale.sops.yaml must allow only tag:server (see §5-pre-a).
whrrr: use the nas key from secrets/tailscale/nas.sops.yaml (§5-pre-b).
recordurbate / ubuncap: Ansible-managed with tag:server, so they use the same key in tailscale.sops.yaml (see §5f–5g), not secrets/tailscale/customer-app.sops.yaml.

Fix keys in the Tailscale admin keys UI, update the right SOPS file, commit, push, and retry.
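To see quickly which tag the key was rejected for, the journal message quoted above can be parsed. A minimal sketch; `rejected_tags` is a hypothetical helper and it assumes the message wording matches the one quoted in this section.

```shell
# rejected_tags: read journal lines on stdin and print the tag list from the
# first "requested tags [...] are invalid or not permitted" message, if any.
rejected_tags() {
  sed -n 's/.*requested tags \[\([^]]*\)\] are invalid or not permitted.*/\1/p' | head -n 1
}
```

Usage on the affected host: sudo journalctl -u tailscaled -n 40 --no-pager | rejected_tags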

Option A — Wait for ansible-pull (automatic, 30-minute cycle):

The auth key commit above will be picked up on the next ansible-pull cycle. Check status after ~30 minutes:

ssh someone@192.168.6.17
tailscale status

Option B — Run manually now (recommended for first deployment):

ansible-pull keeps its own clone under /var/lib/ansible-pull/homelab (common_pull_workdir in Ansible). That directory is only a Git repo after ansible-pull-apply has run successfully at least once. If cd /var/lib/ansible-pull/homelab && git status says not a git repository, either the timer has never completed a pull, or the workdir was never populated — check:

sudo systemctl status ansible-pull-apply.service
sudo journalctl -u ansible-pull-apply.service -n 50 --no-pager
ls -la /var/lib/ansible-pull/homelab

You can kick a run once: sudo systemctl start ansible-pull-apply.service then re-check for .git.
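The "is the workdir a clone yet" check can be expressed as a small function. A minimal sketch; `workdir_state` is a hypothetical helper that only looks for a `.git` directory and does not validate the repository any further.

```shell
# workdir_state DIR: print "missing" if DIR does not exist, "clone" if it
# contains a .git directory, and "not-a-clone" otherwise.
workdir_state() {
  if [ ! -d "$1" ]; then
    echo "missing"
  elif [ -d "$1/.git" ]; then
    echo "clone"
  else
    echo "not-a-clone"
  fi
}
```

Usage: workdir_state /var/lib/ansible-pull/homelab (run with sudo if your user cannot read the directory yet).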

Two different Git trees: ansible-pull always runs from /var/lib/ansible-pull/homelab. A git pull in /opt/homelab only updates your operator clone. If main on GitHub already contains a fix but ansible-pull logs still show an old failure, the pull workdir is probably behind — reconcile it explicitly (as root, same SSH key rules as the unit), then re-enable the timer if you stopped it:

sudo git -C /var/lib/ansible-pull/homelab fetch origin
sudo git -C /var/lib/ansible-pull/homelab status
sudo git -C /var/lib/ansible-pull/homelab merge --ff-only origin/main
sudo systemctl start ansible-pull-apply.timer
sudo systemctl start ansible-pull-apply.service
sudo journalctl -u ansible-pull-apply.service -n 80 --no-pager

Use sudo journalctl … for unit output (otherwise you may only see your own user messages).

If you already maintain /opt/homelab as a normal clone (manual git pull / pushes), you can run the playbook from there instead — paths are the same relative to repo root:

ssh someone@192.168.6.17
cd /opt/homelab/infra/ansible   # ansible.cfg + roles_path live here
git pull   # your usual clone; deploy key for fetch

# When you run ON infra-services, use local connection (no SSH loopback to .17).
SOPS_AGE_KEY_FILE=/etc/homelab/age-key.txt \
ansible-playbook \
  -i inventory/generated.yml \
  playbooks/site.yml \
  --tags tailscale \
  --limit infra-services \
  -e ansible_connection=local

Same paths from repo root if you prefer to stay in /opt/homelab (add -e ansible_connection=local when running on infra-services itself):

export ANSIBLE_CONFIG=/opt/homelab/infra/ansible/ansible.cfg
SOPS_AGE_KEY_FILE=/etc/homelab/age-key.txt \
ansible-playbook \
  -i infra/ansible/inventory/generated.yml \
  infra/ansible/playbooks/site.yml \
  --tags tailscale \
  --limit infra-services \
  -e ansible_connection=local

If you use the ansible-pull workdir (and it is a valid clone):

ssh someone@192.168.6.17
cd /var/lib/ansible-pull/homelab/infra/ansible
git pull   # uses deploy key; must fast-forward / rebase per that clone’s config

SOPS_AGE_KEY_FILE=/etc/homelab/age-key.txt \
ansible-playbook \
  -i inventory/generated.yml \
  playbooks/site.yml \
  --tags tailscale \
  --limit infra-services \
  -e ansible_connection=local

Shared checkout (operators + root): The common role creates UNIX group homelab-pull, adds common_github_ssh_users to it, sets the workdir to root:homelab-pull mode 2770 (setgid), normalizes the clone once, sets git config core.sharedRepository group, and adds UMask=0002 to the ansible-pull systemd units. After one successful apply, log out and back in (or newgrp homelab-pull) so your session has the group; then cd /var/lib/ansible-pull/homelab && git pull as someone needs no sudo and no safe.directory workaround.
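To confirm that your current session has picked up the homelab-pull group, compare it against the output of `id -nG`. A minimal sketch; `has_group` is a hypothetical helper.

```shell
# has_group GROUP "LIST": succeed if GROUP is one of the space-separated
# group names in LIST (for example the output of `id -nG`).
has_group() {
  for g in $2; do
    [ "$g" = "$1" ] && return 0
  done
  return 1
}
```

Usage: has_group homelab-pull "$(id -nG)" || echo "log out and back in, or run newgrp homelab-pull"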

Repeat with --limit prox and --limit saltierpoop (from a control node that can SSH to those hosts, or run on each host from its own clone if you set one up there).

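Connection note: --limit infra-services uses ansible_host from inventory (e.g. 192.168.6.17). With -e ansible_connection=local, as in the commands above, the playbook runs directly on infra-services without SSH. If you omit that override, Ansible connects over SSH to that address even when run on infra-services itself, so login as someone to the VM’s own IP must work (the loopback path is fine).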

If ansible-pull-apply fails with fatal: could not read Username for 'https://github.com': No such device or address:

The systemd unit is cloning over HTTPS. Unattended pulls must use the SSH URL (git@github.com:spadoople/homelab.git) so ~/.ssh/config can use the read-only deploy key (no interactive username).

Fix both units, reload, retry:

sudo sed -i 's#https://github.com/spadoople/homelab.git#git@github.com:spadoople/homelab.git#g' \
  /etc/systemd/system/ansible-pull-apply.service \
  /etc/systemd/system/ansible-pull-check.service
sudo systemctl daemon-reload
sudo systemctl start ansible-pull-apply.service
sudo journalctl -u ansible-pull-apply.service -n 30 --no-pager

If /var/lib/ansible-pull/homelab/.git already exists, point origin at SSH as well:

cd /var/lib/ansible-pull/homelab
git remote set-url origin git@github.com:spadoople/homelab.git
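The rewrite applied by the sed command above is a plain prefix substitution. A minimal sketch of the same mapping as a function, which can be handy for checking what a given remote should become; `to_ssh_url` is a hypothetical helper and only handles github.com HTTPS URLs.

```shell
# to_ssh_url URL: convert an https://github.com/... remote to its
# git@github.com:... form; any other input is printed unchanged.
to_ssh_url() {
  printf '%s\n' "$1" | sed 's#^https://github.com/#git@github.com:#'
}
```

Usage: to_ssh_url "$(git -C /var/lib/ansible-pull/homelab remote get-url origin)"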

Prefer re-applying the common role from /opt/homelab so units match repo templates: ansible-playbook … --tags ansible-pull (after units use SSH, timers stay correct).

If sudo systemctl start ansible-pull-apply “hangs” (no prompt for many minutes):

ansible-pull-apply is Type=oneshot: systemctl start blocks until the entire playbook finishes. A full site.yml run can easily take 10–25+ minutes (apt, downloads, multiple roles). That is normal, not a frozen shell.

In another SSH session, follow logs live:

sudo journalctl -fu ansible-pull-apply.service

When it completes, systemctl start returns and systemctl status shows inactive (dead) with a result.
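To tell "still running" apart from "finished" without watching the journal, the unit's `ActiveState` and `Result` properties can be read. A minimal sketch; `unit_phase` is a hypothetical helper, and it assumes the KEY=VALUE output of `systemctl show`. A oneshot unit that has never been started also reports inactive, so "finished" here means finished or never started.

```shell
# unit_phase: read `systemctl show -p ActiveState -p Result UNIT` output on
# stdin and print "running", "failed", "finished", or "unknown".
unit_phase() {
  awk -F= '
    $1 == "ActiveState" { state = $2 }
    $1 == "Result"      { result = $2 }
    END {
      if (state == "activating" || state == "active") print "running"
      else if (state == "failed" || (result != "" && result != "success")) print "failed"
      else if (state == "inactive") print "finished"
      else print "unknown"
    }'
}
```

Usage: systemctl show -p ActiveState -p Result ansible-pull-apply.service | unit_phase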

If the journal stops on tailscale up for a long time, the Tailscale role now wraps that in a timeout so a bad/missing auth key cannot block forever (pull latest repo before relying on that).

After each host registers:

Go to https://login.tailscale.com/admin/machines
For infra-services: Click three-dot menu > Edit route settings > Approve the 192.168.6.0/24 subnet route
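Verify: tailscale status on each host should list the host and its peers with their 100.x addresses (a stopped or logged-out node reports that state instead of a peer list)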

What the role configures per host:

Host	Tags	Routes	IP forwarding	UFW
infra-services	`tag:server`	`192.168.6.0/24`	Yes (sysctl)	41641/udp + tailscale0
prox	`tag:server`	—	No	Skipped (pve-firewall)
saltierpoop	`tag:server`	—	No	41641/udp + tailscale0

5d. whrrr (Synology DSM): Manual

Prerequisite: secrets/tailscale/nas.sops.yaml exists in git (encrypted) per secrets/tailscale/README.md and §5-pre-b.

Tailscale on Synology runs as a DSM package. Upstream documents Synology limitations (no tailscale up --accept-routes, no exit-node client behavior on the NAS itself): tailscale/tailscale#1995. whrrr can still join the tailnet with tag:nas and you can reach DSM at the Synology Tailscale IP; you cannot expect whrrr to follow subnet routes advertised by other nodes.

Open DSM > Package Center > verify Tailscale is installed and running
From a host that has this repo and sops (usually your laptop), run the NAS one-liner in secrets/tailscale/README.md (it SSHs to Synology and passes --authkey=…). If you cannot run sops from that path, decrypt locally, then paste the key into sudo tailscale up --authkey=… in an SSH session to the NAS (avoid leaving the key in shell history where possible).

If tailscale up prints “Some peers are advertising routes but --accept-routes is false”, that is expected on Synology: DSM cannot enable --accept-routes (Tailscale #1995). It is a notice, not a failed login — your NAS is still on the tailnet. Ignore it unless you need whrrr itself to reach LAN-only targets via another node’s advertised subnets (unsupported on DSM).

If the CLI isn't accessible, apply the tag:nas tag from the Tailscale admin console:

Go to https://login.tailscale.com/admin/machines
Find whrrr
Click three-dot menu > Edit ACL tags > add tag:nas

5e. haos (Home Assistant OS): Manual (HA add-on)

HAOS has a first-party Tailscale add-on. It does not read tailscale.sops.yaml or secrets/tailscale/*.yaml; authenticate with the URL from the add-on log unless you configure a key there separately. If you ever need a reusable key for HA, create another tag:server-only key and store it in 1Password — do not reuse the Ansible automation key outside ansible-pull.

Open Home Assistant > Settings > Add-ons > Add-on Store
Search for Tailscale and install it
Go to the add-on Configuration tab and set:

tags:
  - tag:server
accept_routes: true

Then start the add-on. Check the add-on Log tab for the auth URL — open it to authenticate.

After install: Home Assistant at http://<tailscale-ip>:8123 works from anywhere on the tailnet.

5f–5g. recordurbate and ubuncap: Ansible-managed (Whrrr VMM)

These VMs are in the Ansible tailscale group with tag:server (same key as infra-services, prox, saltierpoop). Converge via ansible-pull on each VM after inventory changes; one-time tag migration uses tailscale_force_reauth in group_vars/whrrr_vmm_guests.yml.

Customer-app containers (recordurbate-tiktok) on ubuncap are not Tailscale nodes — only the VM OS is managed here.

See Whrrr VMM inbound SSH for Cursor/agent access.

Post-deployment: accept routes on your devices

After infra-services is advertising 192.168.6.0/24, enable route acceptance on each client device you use:

macOS/Linux: tailscale up --accept-routes or toggle in the Tailscale menu bar app
iOS/Android: Tailscale app > Settings > "Use Tailscale subnets"
Windows: Tailscale system tray > "Use subnet routes"
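On a Linux client you can check whether traffic to the Servers VLAN actually leaves through Tailscale after enabling route acceptance. A minimal sketch; `via_tailscale` is a hypothetical helper, it assumes the interface is named tailscale0 (the name used in the UFW rules earlier) and the usual `ip route get` output that contains `dev <interface>`. It does not apply to macOS, iOS, Android, or Windows.

```shell
# via_tailscale: read `ip route get ADDR` output on stdin and succeed if the
# selected outgoing interface is tailscale0.
via_tailscale() {
  grep -Eq ' dev tailscale0( |$)'
}
```

Usage, from a client that is off the home LAN: ip route get 192.168.6.17 | via_tailscale && echo "routed over Tailscale"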

6. Create Tailscale API Key + GitHub Secrets (SEC-007)

Where: Tailscale Admin Console + GitHub repo settings

Steps

Go to https://login.tailscale.com/admin/settings/keys
Create a new API key
Note your tailnet name (visible at the top of the admin console)
Go to your GitHub repo > Settings > Secrets and variables > Actions
Add two secrets:
TS_API_KEY — paste the API key
TS_TAILNET — paste the tailnet name (e.g., yourtailnet.ts.net)

Verify

Push any change to infra/tailscale/ on main, or run manually: Actions → Tailscale ACL Sync → Run workflow (workflow_dispatch)
The workflow should succeed (requires action: apply in the workflow file)
Check Tailscale admin > Access Controls — it should match acl.json
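Instead of eyeballing the admin console, the live policy can also be fetched from the Tailscale API and compared with acl.json by hand. A minimal sketch that only builds the endpoint URL; `acl_endpoint` is a hypothetical helper, and the comparison itself is left to you because the returned policy formatting may differ from the file in the repo.

```shell
# acl_endpoint TAILNET: print the Tailscale API v2 URL for that tailnet's
# policy file (ACL).
acl_endpoint() {
  printf 'https://api.tailscale.com/api/v2/tailnet/%s/acl\n' "$1"
}
```

Usage, with the same values stored as GitHub secrets exported in your shell: curl -s -u "$TS_API_KEY:" "$(acl_endpoint "$TS_TAILNET")"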

Reference: infra/tailscale/README.md

7. Deploy AdGuard + Unbound on infra-services

Where: SSH to infra-services

Steps

Pull the latest code:

cd /opt/homelab
git pull

Add a DNS record for adguard.infra.realemail.app on the UDM:
Settings > Policy Table > add adguard.infra.realemail.app → 192.168.6.17
Start the stack:

cd /opt/homelab/services/adguard
docker compose up -d

Open https://adguard.infra.realemail.app in your browser for initial setup (not :3000 — homepage uses that port on the host):
Set admin username and password
Set the listen interface to 0.0.0.0:53 for DNS
Set upstream DNS to udp://unbound with bootstrap 127.0.0.11 (see AdGuard upstream (repo README))
Import DNS rewrites from inventory (optional but recommended):

cd /opt/homelab
# The file is already generated at services/adguard/dns-rewrites.yaml
# Add them in the AdGuard UI: Filters > DNS rewrites
# Or script it with curl (see services/adguard/README.md)

Verify

dig @192.168.6.17 google.com should return an A record
dig @192.168.6.17 infra-services.lab.local should return 192.168.6.17 (if you imported the rewrites)
https://adguard.infra.realemail.app should load the AdGuard dashboard
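The two dig checks above can be turned into pass/fail tests by filtering `dig +short` output. A minimal sketch; `has_a_record` and `resolves_to` are hypothetical helpers, and the IPv4 pattern is deliberately loose (it does not range-check octets).

```shell
# has_a_record: read `dig +short` output on stdin and succeed if at least
# one line looks like an IPv4 address.
has_a_record() {
  grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# resolves_to IP: read `dig +short` output on stdin and succeed if one line
# is exactly IP.
resolves_to() {
  grep -Fxq "$1"
}
```

Usage: dig +short @192.168.6.17 google.com A | has_a_record, and dig +short @192.168.6.17 infra-services.lab.local A | resolves_to 192.168.6.17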

Reference: services/adguard/README.md

8. Cut Over DNS from PiHole to AdGuard

Where: UDM SE > Settings > Networks + Settings > Internet

Prerequisite: AdGuard running and verified (step 7). infra-services must have Tailscale prefer-main routing for 192.168.6.0/24 (Ansible tailscale role — see AdGuard — Servers VLAN caveat).

Steps

Internet → WAN1 → DNS (IPv4 only for now):
Manual DNS 192.168.6.17
Clear any legacy PiHole IPv4 (192.168.6.80) and IPv6 entries
IPv6 DNS: leave blank until AdGuard publishes v6 (optional later)
For each VLAN network on the UDM:
Settings > Networks > click the network
Under DHCP > DNS Server → 192.168.6.17
Save

Servers VLAN (4): can use 192.168.6.17 directly now that the Tailscale routing fix is in place. Gateway-as-DNS (.1) also works if WAN upstream points at AdGuard.

Renew DHCP on clients (or reboot). On Ubuntu hosts, avoid one-off resolvectl dns changes, which do not persist; set netplan dhcp4-overrides: use-dns: false only if you need static overrides.

Verify

dig @192.168.6.17 google.com from a Servers VLAN host (e.g. saltierpoop)
dig google.com on that host (systemd-resolved / DHCP DNS)
Browse the web; check AdGuard dashboard for query activity
dig @192.168.6.17 infra-services.lab.local → 192.168.6.17
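During the cutover it is also worth checking that a host no longer lists the old PiHole address as a resolver. A minimal sketch; `stale_dns` is a hypothetical helper that scans whitespace-separated fields, so it works on `resolvectl status` output or on /etc/resolv.conf text alike.

```shell
# stale_dns OLD_IP: read resolver configuration text on stdin and succeed if
# OLD_IP still appears as a whole whitespace-separated field.
stale_dns() {
  awk -v ip="$1" '
    { for (i = 1; i <= NF; i++) if ($i == ip) found = 1 }
    END { exit !found }'
}
```

Usage on a Servers VLAN host: resolvectl status | stale_dns 192.168.6.80 && echo "still pointing at PiHole; renew DHCP"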

Parallel run

Keep PiHole (LXC 104) running for 48 hours after cutover. If anything breaks, revert DHCP DNS to 192.168.6.80 and WAN DNS to PiHole on the UDM.

9. Decommission PiHole LXC 104 (`blocktopus`)

Where: Proxmox UI or CLI

Prerequisite: 48-hour parallel run with AdGuard (step 8) verified.

Steps

Verify no clients are still using PiHole:
SSH to PiHole (192.168.6.80) and check query logs
If queries are still coming in, something still points at the old DNS
Stop the LXC:

ssh root@192.168.6.71  # Proxmox
pct stop 104

Wait 24-48 hours. If nothing breaks, destroy it:

pct destroy 104 --purge
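As a guard before the destroy command, you can confirm the container is really stopped. A minimal sketch; `lxc_is_stopped` is a hypothetical helper and it assumes `pct status` prints a single line of the form "status: stopped" or "status: running".

```shell
# lxc_is_stopped: read `pct status VMID` output on stdin and succeed only if
# it reports the container as stopped.
lxc_is_stopped() {
  grep -q '^status: stopped$'
}
```

Usage on the Proxmox host: pct status 104 | lxc_is_stopped && pct destroy 104 --purge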

Let me know when done so I can update inventory and docs.
