Eh, I will have to find my notes on the issue with the pihole. I can see if I can dig them out this weekend and send them to you (I wonder if you can send PMs on Lemmy ^^).
To stay on the point of this discussion: just, and I am not joking, this afternoon I got hit by this: https://longhorn.io/kb/troubleshooting-volume-with-multipath/
The pod (in this case wireguard) was crashing because it could not mount the drive, and the error was something like "already mounted or mount point busy". I had to dig and dig, but I found out the problem was the one above and I fixed it. I will now add that setting to my ansible and configure all three Pis.
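For reference, a rough sketch of what "adding that setting" amounts to. As far as I read the KB article, the fix is a `blacklist` section in /etc/multipath.conf so multipathd stops grabbing the `sd*` devices longhorn creates. The helper name `ensure_blacklist` is just mine for illustration, and it deliberately leaves the file alone if a `blacklist` section already exists (in that case merge by hand):

```python
import re

# Stanza as I understand it from the longhorn KB article linked above.
BLACKLIST_STANZA = 'blacklist {\n    devnode "^sd[a-z0-9]+"\n}\n'


def ensure_blacklist(conf_text: str) -> str:
    """Return multipath.conf text with the blacklist stanza appended.

    Idempotent: if any `blacklist {` section is already present the text
    is returned unchanged (hypothetical helper, not part of longhorn).
    """
    # `blacklist_exceptions {` does not match, because `_` is not whitespace.
    if re.search(r"^\s*blacklist\s*\{", conf_text, flags=re.MULTILINE):
        return conf_text
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + BLACKLIST_STANZA
```

You would feed it the current contents of /etc/multipath.conf (or an empty string if the file does not exist), write the result back, and then restart the daemon with `systemctl restart multipathd.service`. In ansible the same thing is a `blockinfile` task plus a handler.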
However, this should not happen with a mature-ish system like longhorn, which may cater to a userbase that does not know enough to dig into /dev.
I think there should be a better way to alert users to such an issue. Just to be clear, the longhorn UI and logs were nice and dandy, all quiet on the western front, but everything was broken.
The longhorn reconciler could have a check for this: if something should be mounted and is not, and the error says "already mounted" while nothing is actually mounted, check for known bugs.
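To make the idea concrete, here is a minimal sketch of such a check. This is not how longhorn works internally; the function names and the returned labels are made up by me. It only compares the error text against /proc/mounts-style content, and it ignores details like the `\040` escaping of spaces in mount paths:

```python
def is_in_mount_table(device: str, mount_point: str, mounts_text: str) -> bool:
    """True if the device or the mount point appears in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 0 is the source device, field 1 is the mount target.
        if len(fields) >= 2 and (fields[0] == device or fields[1] == mount_point):
            return True
    return False


def diagnose_mount_failure(error: str, device: str, mount_point: str,
                           mounts_text: str) -> str:
    """Classify a failed mount (hypothetical labels, for illustration only)."""
    if "already mounted" not in error and "busy" not in error:
        return "unrelated-error"
    if is_in_mount_table(device, mount_point, mounts_text):
        return "genuinely-mounted"
    # The error claims a mount that does not exist: something else holds
    # the device (multipathd, in my case), so point at the known issues.
    return "suspect-known-issue"
```

On a real node you would pass the contents of /proc/mounts as `mounts_text`. The third outcome is the one where the UI could show a hint instead of staying green.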
However, I think the issue is what I said above. It is too fragmented and works with a myriad of other microservices, so longhorn is like "I gave the order, now whatever".
I will share what is in my longhorn-system ns, there is no secret in here but I want to give an idea (ps: I do nothing fancy with longhorn at home - obvs some are ds so you see 3 pods because I have 3 nodes):
k get pods -n longhorn-system | cut -d' ' -f1
NAME
engine-image-ei-f9e7c473-5pdjx
engine-image-ei-f9e7c473-xq4hn
instance-manager-e-fa08a5ebf4663f1e9fb894f865362d65
engine-image-ei-f9e7c473-gdp6n
instance-manager-e-567b6ba176274fe20a001eec63ce3564
instance-manager-r-567b6ba176274fe20a001eec63ce3564
instance-manager-r-b1d285dd9205d1ba992836073c48db8a
instance-manager-e-b1d285dd9205d1ba992836073c48db8a
daily-keep-for-a-week-28144800-pppw8
longhorn-manager-xqwld
longhorn-ui-f574474c8-n847h
longhorn-manager-cgqvm
longhorn-driver-deployer-6c7bd5bd9b-8skh4
longhorn-manager-tjzvz
instance-manager-d3c9343a8637e4ef197ad6da68b3ed2d
instance-manager-cf746b18d51f6426b74d6c6652f01afc
engine-image-ei-d911131c-wwfwz
engine-image-ei-d911131c-qcn26
instance-manager-e7d92f3ca0455cde2158bebdbb33ea16
engine-image-ei-d911131c-mgb2k
csi-attacher-785fd6545b-bn9lp
csi-attacher-785fd6545b-4nfxz
csi-provisioner-8658f9bd9c-2bq7v
csi-provisioner-8658f9bd9c-q6ctq
csi-attacher-785fd6545b-rx7r9
csi-resizer-68c4c75bf5-tmw2f
csi-resizer-68c4c75bf5-n9dxm
csi-snapshotter-7c466dd68f-7r2x6
csi-snapshotter-7c466dd68f-cd8pm
longhorn-csi-plugin-vgqh5
longhorn-csi-plugin-mnskk
csi-provisioner-8658f9bd9c-kcb8f
csi-resizer-68c4c75bf5-gccfg
csi-snapshotter-7c466dd68f-wsltq
longhorn-csi-plugin-9q9kj
Dependency on the csi-* ecosystem sort of allows the errors to get lost in translation.
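If you want a quick breakdown of a listing like the one above, a small sketch along these lines counts pods per component. The prefix list is simply what I see in my own namespace, not an official list of longhorn components, and anything it does not recognise (like my recurring backup job) lands in "other":

```python
from collections import Counter

# Prefixes taken from the listing above; an assumption, not an official list.
COMPONENTS = [
    "engine-image", "instance-manager", "longhorn-manager", "longhorn-ui",
    "longhorn-driver-deployer", "longhorn-csi-plugin",
    "csi-attacher", "csi-provisioner", "csi-resizer", "csi-snapshotter",
]


def component_of(pod_name: str) -> str:
    """Map a pod name to its component by prefix, or "other" if unknown."""
    for prefix in COMPONENTS:
        if pod_name.startswith(prefix + "-"):
            return prefix
    return "other"


def count_components(listing: str) -> Counter:
    """Count pods per component, skipping blank lines and the NAME header."""
    names = [line.strip() for line in listing.splitlines()]
    return Counter(component_of(n) for n in names if n and n != "NAME")
```

Paste the output of the `k get pods` command into `count_components` and sum the `csi-*` entries to see how much of the namespace is the generic CSI sidecars rather than longhorn itself.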