
On Sat, Mar 28, 2020 at 8:26 PM Nir Soffer <nsoffer@redhat.com> wrote: [snip]
You are right ... This is just a theory based on my knowledge and it might not be valid. We need the libvirt logs to confirm or reject the theory, but I'm convinced that is the reason.
Yet, it's quite possible. Qemu tries to write to the qcow disk on gluster. Gluster is creating shards based on the offset, as it was not done initially (preallocated disks take the full size on gluster and all shards are created immediately). This takes time and has to be done on all bricks. As the shard size is too small (default 64MB), gluster has to create the next shard almost immediately, but if it can't do it as fast as qemu is filling its qcow2 disk - qemu will get an I/O error and we know what happens there. Later gluster manages to create the shard(s), and the VM is unpaused.
Gluster can block the I/O until it can write the data to a new shard. There is no reason to return an error unless a real error happened.
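For reference, the shard size actually configured on the volume can be checked with something like this (assuming the volume is the vmstore one used by this storage domain):
gluster volume get vmstore features.shard-block-size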
Also the VMs mentioned here are using raw disks, not qcow2:
[snip]
<target bus="scsi" dev="sda"/>
<source file="/rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/0a91c346-23a5-4432-8af7-ae0a28f9c208/2741af0b-27fe-4f7b-a8bc-8b34b9e31cb6">
  <seclabel model="dac" relabel="no" type="none"/>
</source>
<driver cache="none" error_policy="stop" io="threads" name="qemu" type="raw"/>
[snip]
Note type="raw"
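If there is any doubt, the format actually used on storage can be checked directly against that path, for example:
qemu-img info /rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/0a91c346-23a5-4432-8af7-ae0a28f9c208/2741af0b-27fe-4f7b-a8bc-8b34b9e31cb6
which should report "file format: raw" for this disk.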
That's why the oVirt team made all gluster-based disks to be fully preallocated.
Gluster disks are thin (raw-sparse) by default just like any other file based storage.
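A raw-sparse disk is just a sparse file, so until data is written its allocated size stays well below the apparent size; something like this on the image directory above would show it (the first column of ls -s is the allocated space):
ls -lhs /rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/0a91c346-23a5-4432-8af7-ae0a28f9c208/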
If this theory was correct, this would fail consistently on gluster:
1. create raw sparse image
truncate -s 100g /rhev/data-center/mnt/glusterSD/server:_path/test
2. Fill image quickly with data
dd if=/dev/zero bs=1M | tr "\0" "U" | dd of=/rhev/data-center/mnt/glusterSD/server:_path/test bs=1M count=12800 iflag=fullblock oflag=direct conv=notrunc
According to your theory gluster will fail to allocate shards fast enough and fail the I/O.
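While the dd above runs, the shards being created can also be watched on one of the bricks (the brick path here is only an example, it depends on the setup):
watch -n1 'ls /gluster_bricks/vmstore/vmstore/.shard | wc -l'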
Nir
Hey Nir,
yes, in my disk definition I used the defaults proposed. Possibly I only chose virtio-scsi (see the sda device name): I don't remember whether, in 4.3.9 with Red Hat CoreOS as the OS type, virtio would have been the default one or not...
I can also try the commands above, just to see the behavior, and report here as soon as I can connect to the system.
Gianluca