Hello. The issue is both difficult and easy to reproduce.
Easy because I can reproduce it about once in every 3-4 runs.
Difficult because I cannot catch the moment when the issue starts and stops.
Answers to your questions:
1. In general, the disks stay responsive during the fstrim.
2. I run the trim every week, but on a different day of the week for each storage.
3. I use NVMe disks.
I tried to reproduce the issue manually and found the following:
1. Only one host of the storage cluster is affected (I think because it is the main mount point).
2. In the oVirt events I found:
"Host ovirt-host-03 cannot access the Storage Domain(s) glusterfs-data attached to
the Data Center Computing. Setting Host state to Non-Operational."
After that event, oVirt started migration from host-03.
I think the issue is in VDSM.
In addition, on host ovirt-host-03 I found a few errors in vdsm.log at the time of the fstrim,
such as:
"ERROR (monitor/2ac7658) [storage.Monitor] Error checking domain
2ac76580-2182-470d-b886-d3d2e28d05b3 (monitor:453)"
2ac... is the UUID of the gluster domain.
I have 6 hosts in the cluster with the same installation, and only one is affected.
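
To narrow down when the issue starts and stops, the monitor errors could be pulled out of vdsm.log and compared with the time the fstrim ran. Below is a minimal sketch of that idea, not a tested tool. It assumes the errors look like the line quoted above and sit on a single line in the real log; the function names and the log path in the comment are my own assumptions.

```python
import re

# Assumption: monitor errors match the line quoted in this post. Any
# timestamp prefix on the line is kept as-is and not parsed.
MONITOR_ERROR = re.compile(
    r"ERROR \(monitor/[0-9a-f]+\) \[storage\.Monitor\] "
    r"Error checking domain (?P<uuid>[0-9a-f-]{36})"
)


def find_monitor_errors(lines, domain_uuid=None):
    """Return (line_number, line) pairs for storage.Monitor errors.

    If domain_uuid is given, only errors for that domain are returned.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        match = MONITOR_ERROR.search(line)
        if match is None:
            continue
        if domain_uuid is not None and match.group("uuid") != domain_uuid:
            continue
        hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path, domain_uuid=None):
    """Scan a log file, e.g. /var/log/vdsm/vdsm.log (path is an assumption)."""
    with open(path, errors="replace") as log:
        return find_monitor_errors(log, domain_uuid)
```

Running scan_file on the affected host and on one healthy host for the same fstrim window should show whether the errors really appear only on ovirt-host-03.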