[ovirt-users] Re: owner of vm paused/unpaused operation

10 Oct 2019

      On Thu, Oct 10, 2019 at 1:10 PM Francesco Romani <fromani@redhat.com> wrote:
...
On 10/10/19 10:44 AM, Gianluca Cecchi wrote:
On Thu, Oct 10, 2019 at 9:56 AM Francesco Romani <fromani@redhat.com>
wrote:
...
The only way Vdsm will not pause the VM is if libvirt+qemu never reports
any ioerror, which is something I'm not sure is possible and that I'd never
recommend anyway.
Vdsm always tries hard to be super-careful with respect to possible data
corruption.
OK.
When storage is inaccessible for a few seconds, it is more a matter of
blocked I/O than of data corruption.
True, but we can know only ex post that the storage was just temporarily
unavailable, can't we?
Yes, but I would like to have an option to say: don't do anything for X
seconds, both at host level and guest level.
X could be 5, 10 or 20 seconds, depending on the need.
...
If no other host powers on the VM I think there is no risk of data
corruption itself, or at least no more than when you have a physical server
and for some reason the I/O operations to its physical disks (local or on a
SAN) are blocked for some tens of seconds.
IMO, storage that is unresponsive for tens of seconds is something which
should be uncommon and very alarming in all circumstances, especially for
physical servers.
What I'm trying to say is that yes, there probably are ways to sidestep
this behaviour, but I think this is the wrong direction and adds fragility
rather than convenience to the system.
In general I agree with you on this
...
So I think that if I want in any way to modify behavior I have to change
the options so that I keep "report" for both write and read errors on
virtual disks.
Yep. I don't remember what Engine allows. Worst case you can use a hook,
but once again this is making things a bit more fragile.
I'm only experimenting to see possible different options to manage
"temporary" problems at storage level, that often resolve without manual
actions in tens of seconds, sometimes due to incorrect operations at levels
managed by other teams (network, storage, etc.).
I think the best option is to improve the current behaviour: learn why Vdsm
fails to unpause the VM and improve there.
yes, I'm just experimenting on possible options and their pros & cons

I see that in my 4.3.6 environment with plain CentOS 7.7 hosts the qemu-kvm
process is spawned with "werror=stop,rerror=stop" for all virtual disks.
I didn't find any related option in the VM edit page.

On my Fedora 30 system, when I start a VM (with virt-manager or "virsh
start"), the options are not present on the command line, and according to
the qemu-kvm manual page:
"
The default setting is werror=enospc and rerror=report
"
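
To check which policy a running qemu-kvm process was actually started with,
the arguments can be inspected for werror= and rerror= settings. The
following is only a minimal sketch: the function names are made up for
illustration, and it assumes the policies appear as plain "werror=..." and
"rerror=..." key/value pairs inside the process arguments, as in the
"werror=stop,rerror=stop" case quoted above.

```python
import re

# Matches "werror=<policy>" or "rerror=<policy>" inside a single argument.
_POLICY_RE = re.compile(r"\b(werror|rerror)=([a-z]+)")


def error_policies(cmdline_args):
    """Return one dict per argument that sets werror and/or rerror."""
    found = []
    for arg in cmdline_args:
        policies = dict(_POLICY_RE.findall(arg))
        if policies:
            found.append(policies)
    return found


def read_proc_cmdline(pid):
    """Read /proc/<pid>/cmdline (NUL-separated) and return the argument list."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        return [a.decode(errors="replace") for a in f.read().split(b"\0") if a]
```

On a host, error_policies(read_proc_cmdline(pid)) with the PID of the
qemu-kvm process would list the policies per disk; an empty list means the
options are not set and the defaults quoted above apply.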

In the meantime I created a wrapper script for qemu-kvm that changes the
command line:

1)
from werror=stop to werror=report
and
from rerror=stop to rerror=report

This seems worse: as expected, the VM is not paused at all, but it behaves
strangely inside.
From the host's point of view:
[root@ov300 ~]# virsh -r list
 Id    Name                           State
----------------------------------------------------
 7     mydbsrv                        running

I suddenly get something like this in the VM's /var/log/messages:

Oct 10 12:42:55 mydbsrv kernel: sd 2:0:0:1: [sdc] FAILED Result:
hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 10 12:42:55 mydbsrv kernel: sd 2:0:0:1: [sdc] FAILED Result:
hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 10 12:42:55 mydbsrv kernel: sd 2:0:0:1: [sdc] Sense Key : Aborted
Command [current]
Oct 10 12:42:55 mydbsrv kernel: sd 2:0:0:1: [sdc] Add. Sense: I/O process
terminated
Oct 10 12:42:55 mydbsrv kernel: sd 2:0:0:1: [sdc] CDB: Write(10) 2a 00 03
07 a8 78 00 00 08 00
Oct 10 12:42:55 mydbsrv kernel: blk_update_request: I/O error, dev sdc,
sector 50833528
Oct 10 12:42:55 mydbsrv kernel: EXT4-fs warning (device dm-3):
ext4_end_bio:322: I/O error -5 writing to inode 1573304 (offset 0 size 0
starting block 6353935)
Oct 10 12:42:55 mydbsrv kernel: Buffer I/O error on device dm-3, logical
block 6353935
Oct 10 12:42:55 mydbsrv kernel: sd 2:0:0:1: [sdc] FAILED Result:
hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 10 12:42:55 mydbsrv kernel: sd 2:0:0:1: [sdc] Sense Key : Aborted
Command [current]
Oct 10 12:42:55 mydbsrv kernel: sd 2:0:0:1: [sdc] Add. Sense: I/O process
terminated
Oct 10 12:42:55 mydbsrv kernel: sd 2:0:0:1: [sdc] CDB: Write(10) 2a 00 03
07 a8 98 00 00 08 00
Oct 10 12:42:55 mydbsrv kernel: blk_update_request: I/O error, dev sdc,
sector 50833560
Oct 10 12:42:55 mydbsrv kernel: EXT4-fs warning (device dm-3):
ext4_end_bio:322: I/O error -5 writing to inode 1573308 (offset 0 size 0
starting block 6353939)
...

and apparently only shell builtin commands work inside the VM, so a power
off (from Engine) and power on is necessary anyway:

[root@mydbsrv ~]# uptime
-bash: uptime: command not found
[root@mydbsrv ~]# df -h
-bash: df: command not found
[root@mydbsrv ~]# id
-bash: id: command not found
[root@mydbsrv ~]# ll
-bash: ls: command not found
[root@mydbsrv ~]#
[root@mydbsrv ~]# jobs
[root@mydbsrv ~]# ps
-bash: ps: command not found
[root@mydbsrv ~]# sync
-bash: sync: command not found
[root@mydbsrv ~]# pwd
/root
[root@mydbsrv ~]# ls
-bash: ls: command not found
[root@mydbsrv ~]# /bin/ls
-bash: /bin/ls: Input/output error
[root@mydbsrv ~]# type mount
-bash: type: mount: not found
[root@mydbsrv ~]# /bin/mount -o remount,rw /myfs
-bash: /bin/mount: Input/output error
[root@mydbsrv ~]# tail /var/log/messages
-bash: tail: command not found
[root@mydbsrv ~]# cat /var/log/messages
-bash: cat: command not found
[root@mydbsrv ~]# echo some_word
some_word
[root@mydbsrv ~]#

Even after the storage is accessible again, the wrong behavior continues.
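
The wrapper script itself is not shown in this thread. A minimal sketch of
the idea, covering both the "report" variant of option 1 and the "ignore"
variant of option 2, could look like the following. The path of the renamed
real binary and the QEMU_ERROR_POLICY environment variable are hypothetical
names chosen for illustration, not something taken from the original setup.

```python
#!/usr/bin/env python3
import os
import re
import sys

# ASSUMPTION: the original qemu-kvm binary was renamed to this hypothetical path.
REAL_QEMU = "/usr/libexec/qemu-kvm.real"

# Only the "stop" policy is rewritten; other values are left untouched.
_STOP_RE = re.compile(r"\b(werror|rerror)=stop\b")


def rewrite_error_policy(args, policy):
    """Replace werror=stop / rerror=stop with the given policy in every argument."""
    return [_STOP_RE.sub(lambda m: f"{m.group(1)}={policy}", a) for a in args]


def main(argv):
    # Hypothetical knob: "report" (option 1) or "ignore" (option 2).
    policy = os.environ.get("QEMU_ERROR_POLICY", "report")
    new_args = rewrite_error_policy(argv[1:], policy)
    # Replace this process with the real emulator, keeping the same PID.
    os.execv(REAL_QEMU, [REAL_QEMU] + new_args)
```

A real wrapper would end with a call to main(sys.argv); it is left out here
so that the rewriting function can be tried on its own without starting an
emulator.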

2)
from werror=stop to werror=ignore
and
from rerror=stop to rerror=ignore

This is the least intrusive approach with respect to the VM, in my opinion,
letting it discover by itself that there is a problem.
In this case I get the following in the VM's /var/log/messages:

Oct 10 12:54:00 mydbsrv chronyd[4133]: Selected source 131.175.12.3
Oct 10 12:54:00 mydbsrv chronyd[4133]: System clock wrong by 1.065994
seconds, adjustment started
Oct 10 12:54:00 mydbsrv systemd: Time has been changed
Oct 10 12:54:00 mydbsrv chronyd[4133]: System clock was stepped by 1.065994
seconds
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-3):
ext4_mb_generate_buddy:757: group 289, block bitmap and bg descriptor
inconsistent: 30748 vs 28302 free clusters
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-3):
ext4_mb_generate_buddy:757: group 290, block bitmap and bg descriptor
inconsistent: 31999 vs 32768 free clusters
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-3):
ext4_mb_generate_buddy:757: group 291, block bitmap and bg descriptor
inconsistent: 28880 vs 32768 free clusters
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-3):
ext4_mb_generate_buddy:757: group 292, block bitmap and bg descriptor
inconsistent: 32478 vs 32768 free clusters
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-3):
ext4_mb_generate_buddy:757: group 293, block bitmap and bg descriptor
inconsistent: 32698 vs 32768 free clusters
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-3):
ext4_mb_generate_buddy:757: group 294, block bitmap and bg descriptor
inconsistent: 32151 vs 32768 free clusters
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-3):
ext4_mb_generate_buddy:757: group 295, block bitmap and bg descriptor
inconsistent: 31925 vs 32768 free clusters
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-3):
ext4_mb_generate_buddy:757: group 296, block bitmap and bg descriptor
inconsistent: 32468 vs 32768 free clusters
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-3):
ext4_mb_generate_buddy:757: group 0, block bitmap and bg descriptor
inconsistent: 32768 vs 1496 free clusters
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-3):
ext4_mb_generate_buddy:757: group 1, block bitmap and bg descriptor
inconsistent: 32768 vs 279 free clusters
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_mb_generate_buddy:757: group 112, block bitmap and bg descriptor
inconsistent: 32767 vs 23733 free clusters
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_m000_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_m002_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:54:23 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_m005_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:54:27 mydbsrv kernel: JBD2: Spotted dirty metadata buffer (dev =
dm-3, blocknr = 0). There's a risk of filesystem corruption in case of
system crash.
Oct 10 12:54:28 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_cjq0_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:54:31 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_pmon_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:54:49 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_reco_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:54:49 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_tmon_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:54:56 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_dia0_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:54:58 mydbsrv kernel: EXT4-fs error: 49 callbacks suppressed
Oct 10 12:54:58 mydbsrv kernel: EXT4-fs error (device dm-3):
ext4_mb_generate_buddy:757: group 192, block bitmap and bg descriptor
inconsistent: 32379 vs 24527 free clusters
Oct 10 12:54:58 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_tt00_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:55:07 mydbsrv chronyd[4133]: Selected source 131.175.12.6
Oct 10 12:55:21 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_gen1_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:55:22 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_mman_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:55:22 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_psp0_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:55:22 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_dbrm_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:55:22 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_pxmn_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:55:22 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_smco_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:55:22 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_mmnl_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:55:26 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_lgwr_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:55:26 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_gen0_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0
Oct 10 12:55:26 mydbsrv kernel: EXT4-fs error (device dm-2):
ext4_find_dest_de:1829: inode #928680: block 3679007: comm ora_mmon_test:
bad entry in directory: rec_len is smaller than minimal - offset=0(0),
inode=0, rec_len=0, name_len=0

and then, when the 100 seconds of artificially blocked storage access end,
the VM is accessible again.

[root@mydbsrv ~]# id
uid=0(root) gid=0(root) groups=0(root)
[root@mydbsrv ~]# uptime
 12:56:36 up 3 min,  1 user,  load average: 0.28, 0.38, 0.17
[root@mydbsrv ~]#

[root@mydbsrv log]# time dd if=/dev/zero bs=1024k count=10240 of=/myfs/testfile
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 60.785 s, 177 MB/s

real 1m1.771s
user 0m0.016s
sys 0m10.459s
[root@mydbsrv log]#

Obviously, as mentioned in the messages, this behavior could potentially
lead to fs/journal corruption...
Gianluca
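
To compare experiments like the two above, it can help to summarize how many
ext4 errors and warnings the guest logged per device. The following is a
rough sketch only: the function name is made up, and it assumes log lines in
the "EXT4-fs error (device dm-3): ..." form shown in the excerpts above,
each on a single line as in the real /var/log/messages.

```python
import re
from collections import Counter

# Captures the severity ("error" or "warning") and the device name.
_EXT4_RE = re.compile(r"EXT4-fs (error|warning) \(device ([\w-]+)\)")


def ext4_events(lines):
    """Count EXT4-fs errors and warnings, keyed by (device, severity)."""
    counts = Counter()
    for line in lines:
        m = _EXT4_RE.search(line)
        if m:
            counts[(m.group(2), m.group(1))] += 1
    return counts
```

For example, ext4_events(open("/var/log/messages")) run inside the guest
would give per-device totals. A clean count does not prove the filesystem is
consistent; only an fsck can tell.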