On Tue, Jul 24, 2018 at 2:03 PM Matthew B <matthew.has.questions@gmail.com> wrote:
Hello,

I am trying to understand how I can prevent a VM from being paused when one of its disks is unavailable due to a problem with the storage domain.

The scenario:

A VM with 3 disks: the OS disk on a highly available domain, and two large disks, each on a separate domain (so a total of 3 domains).

This makes the vm depend on 3 storage domains, which is a very fragile configuration.

The two large disks are mirrored using ZFS, but when one of the storage domains goes down the VM pauses. Is it possible to configure the VM to not pause when certain storage domains are unavailable? So instead of getting Paused due to IO error, the disk would just be missing until that domain was brought back online?

I don't think we support this concept. When you start the vm, the disk must exist.

If the disks are not needed when the vm starts, maybe your application could
plug them into the vm later, only if the disks are available?

Even if you solve plugging the disks only when the storage is available, you still
need to handle errors while a disk is plugged into the vm.

If qemu fails to read from or write to the disk, it must choose one of:
- pause the vm, so the I/O can be retried later when the vm is resumed
- stop the vm, maybe the operation will succeed on another host
- propagate the error to the guest. This will likely make the file system
  in the guest read-only, which usually requires a restart of the vm to recover.

The last option may work if the guest is using multipath, and multipath is configured
to queue I/O forever on failures.
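
For reference, with multipath-tools the "queue forever" behavior is normally
controlled by the no_path_retry setting in /etc/multipath.conf inside the guest.
A minimal fragment, assuming the defaults section applies to your devices and
that no devices or multipaths section overrides it:

```
defaults {
    # queue I/O instead of failing it when all paths are down
    no_path_retry queue
}
```

Note that with this setting, applications doing I/O to the device will block
until a path comes back.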

Vdsm supports a different error policy per drive, but I don't see any UI for setting this.

You can see the error_policy in the vm xml, which is available in the vdsm log
when starting a vm, or by using virsh:

    virsh -r list

find your vm id or name, and then:

    virsh -r dumpxml my-vm-name-or-id
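
The xml is long, so it may help to filter it. In libvirt domain xml the policy
is the error_policy attribute of each disk's <driver> element. The sketch below
prints one line per disk; list_error_policies is a hypothetical helper name, and
it assumes the layout virsh normally prints (one element per line, single-quoted
attributes, <driver> before <target>):

```shell
# Hypothetical helper: read libvirt domain xml on stdin and print
# "<target dev>: <error_policy>" for every disk.
list_error_policies() {
    awk '
        # entering a disk element: assume no policy until we see one
        /<disk /   { in_disk = 1; policy = "(not set)" }
        /<\/disk>/ { in_disk = 0 }
        # error_policy= is 13 characters, plus the opening quote
        in_disk && /<driver / {
            if (match($0, /error_policy=.[a-z]+/))
                policy = substr($0, RSTART + 14, RLENGTH - 14)
        }
        # dev= is 4 characters, plus the opening quote
        in_disk && /<target / {
            if (match($0, /dev=.[a-z0-9]+/))
                print substr($0, RSTART + 5, RLENGTH - 5) ": " policy
        }
    '
}

# Usage:
#   virsh -r dumpxml my-vm-name-or-id | list_error_policies
```

The values libvirt accepts for error_policy are stop, report, ignore and
enospace; stop corresponds to pausing the vm, and report to propagating the
error to the guest.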

Arik, do we have a way to control error policy per disk?

Another option is to mount the "optional" disks from within the vm,
so your application inside the vm has full control when there is trouble
mounting or accessing the optional disks.
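
A rough sketch of what that could look like inside the guest, assuming the
optional storage shows up as a block device there (for example after the guest
itself logs in to an iSCSI target). The function name, device path and mount
point are placeholders, not anything oVirt provides:

```shell
# Hypothetical helper: try to mount an optional disk, but never fail the
# caller, so the rest of the boot or the application can continue without it.
mount_optional() {
    dev="$1"
    mnt="$2"
    # storage not reachable: the block device is simply missing
    if [ ! -b "$dev" ]; then
        echo "skip: $dev is not available"
        return 0
    fi
    mkdir -p "$mnt"
    if mount "$dev" "$mnt"; then
        echo "mounted: $dev on $mnt"
    else
        echo "failed: could not mount $dev"
    fi
    return 0
}

# Usage (placeholder names):
#   mount_optional /dev/disk/by-label/optional1 /mnt/optional1
```

This only covers the mount step; errors that happen later, while the disk is
in use, still have to be handled by the application or the file system.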

Nir