I'm trying to figure out how to keep a "broken" NFS mount point from causing
the entire HCI cluster to crash.
HCI is working beautifully.
Last night, I finished adding some NFS storage to the cluster. This is storage that I
don't necessarily need to be HA, and I was hoping to store some backups and
less-important VMs on it, since my available Gluster (SSD) storage is pretty limited.
But as a test, after I got everything set up, I stopped the nfs-server process.
This caused the entire cluster to go down, and several VMs - that are not stored on the
NFS storage - went belly up.
Once I started the NFS server process again, HCI did what it was supposed to do, and was
able to automatically recover.
My concern is that NFS is a single point of failure, and if VMs that don't even rely
on that storage are affected when the NFS storage goes away, then I don't want anything
to do with it.
On the other hand, I'm still struggling to come up with a good way to run on-site
backups and snapshots without using up more Gluster space on my (more expensive) SSD
storage.
Is there any way to set up NFS storage for a Backup Domain - as well as a Data Domain (for
less important VMs) - such that, if the NFS server crashed, all of my non-NFS stuff
would be unaffected?