VM has been paused due to storage I/O problem

Hello,
my test environment is composed of 2 old HP BL685c G1 blades (ovmsrv05 and ovmsrv06), connected over a SAN with FC switches to an old IBM DS4700 storage array. Apart from being old, they all seem fine from a hardware point of view.
I have configured oVirt 4.0.6 and an FCP storage domain. The hosts are plain CentOS 7.3 servers, fully updated. It is not a hosted-engine environment: the manager is a VM outside the cluster. I have configured power management on both hosts and it works well.

I have only one VM for testing at the moment, and it is doing almost nothing.

Starting point: ovmsrv05 has been in maintenance for about 2 days and the VM is running on ovmsrv06. I update the qemu-kvm package on ovmsrv05 and then restart it from the web admin GUI:
Power Mgmt --> Restart

Sequence of events in the pane, including the problem in the subject:

Jan 31, 2017 10:29:43 AM  Host ovmsrv05 power management was verified successfully.
Jan 31, 2017 10:29:43 AM  Status of host ovmsrv05 was set to Up.
Jan 31, 2017 10:29:38 AM  Executing power management status on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Jan 31, 2017 10:29:29 AM  Activation of host ovmsrv05 initiated by admin@internal-authz.
Jan 31, 2017 10:28:05 AM  VM ol65 has recovered from paused back to up.
Jan 31, 2017 10:27:55 AM  VM ol65 has been paused due to storage I/O problem.
Jan 31, 2017 10:27:55 AM  VM ol65 has been paused.
Jan 31, 2017 10:25:52 AM  Host ovmsrv05 was restarted by admin@internal-authz.
Jan 31, 2017 10:25:52 AM  Host ovmsrv05 was started by admin@internal-authz.
Jan 31, 2017 10:25:52 AM  Power management start of Host ovmsrv05 succeeded.
Jan 31, 2017 10:25:50 AM  Executing power management status on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Jan 31, 2017 10:25:37 AM  Executing power management start on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Jan 31, 2017 10:25:37 AM  Power management start of Host ovmsrv05 initiated.
Jan 31, 2017 10:25:37 AM  Auto fence for host ovmsrv05 was started.
Jan 31, 2017 10:25:37 AM  All VMs' status on Non Responsive Host ovmsrv05 were changed to 'Down' by admin@internal-authz
Jan 31, 2017 10:25:36 AM  Host ovmsrv05 was stopped by admin@internal-authz.
Jan 31, 2017 10:25:36 AM  Power management stop of Host ovmsrv05 succeeded.
Jan 31, 2017 10:25:34 AM  Executing power management status on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Jan 31, 2017 10:25:15 AM  Executing power management stop on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Jan 31, 2017 10:25:15 AM  Power management stop of Host ovmsrv05 initiated.
Jan 31, 2017 10:25:12 AM  Executing power management status on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.

Watching the timestamps, the culprit seems to be the reboot of ovmsrv05, which detects some LUNs in "owned" state and others in "unowned".

Full messages of both hosts here:
https://drive.google.com/file/d/0BwoPbcrMv8mvekZQT1pjc0NMRlU/view?usp=sharing
and
https://drive.google.com/file/d/0BwoPbcrMv8mvcjBCYVdFZWdXTms/view?usp=sharing

At this time there are 4 LUNs seen globally by the two hosts, but only 1 of them is currently configured as the only storage domain in the oVirt cluster.
[root@ovmsrv05 ~]# multipath -l | grep ^36
3600a0b8000299aa80000d08b55014119 dm-5 IBM     ,1814      FAStT
3600a0b80002999020000cd3c5501458f dm-3 IBM     ,1814      FAStT
3600a0b80002999020000ccf855011198 dm-2 IBM     ,1814      FAStT
3600a0b8000299aa80000d08955014098 dm-4 IBM     ,1814      FAStT

The configured one:

[root@ovmsrv05 ~]# multipath -l 3600a0b8000299aa80000d08b55014119
3600a0b8000299aa80000d08b55014119 dm-5 IBM     ,1814      FAStT
size=4.0T features='0' hwhandler='1 rdac' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| |- 0:0:1:3 sdl 8:176 active undef running
| `- 2:0:1:3 sdp 8:240 active undef running
`-+- policy='service-time 0' prio=0 status=enabled
  |- 0:0:0:3 sdd 8:48  active undef running
  `- 2:0:0:3 sdi 8:128 active undef running

In the messages of the booting node, around the problem registered by the storage:

[root@ovmsrv05 ~]# grep owned /var/log/messages
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:0:1: rdac: LUN 1 (RDAC) (owned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:0:2: rdac: LUN 2 (RDAC) (owned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:0:3: rdac: LUN 3 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:0:1: rdac: LUN 1 (RDAC) (owned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:0:4: rdac: LUN 4 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:0:2: rdac: LUN 2 (RDAC) (owned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:1:1: rdac: LUN 1 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:0:3: rdac: LUN 3 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:0:4: rdac: LUN 4 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:1:2: rdac: LUN 2 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:1:1: rdac: LUN 1 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:1:3: rdac: LUN 3 (RDAC) (owned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:1:2: rdac: LUN 2 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:1:4: rdac: LUN 4 (RDAC) (owned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:1:3: rdac: LUN 3 (RDAC) (owned)
Jan 31 10:27:39 ovmsrv05 kernel: scsi 2:0:1:4: rdac: LUN 4 (RDAC) (owned)

I don't know exactly the meaning of owned/unowned in the output above. Possibly it detects the 0:0:1:3 and 2:0:1:3 paths (those of the active group) as "owned", and this could have created problems with the active node?

On the active node, strangely, I don't lose all the paths, but the VM has been paused anyway:

[root@ovmsrv06 log]# grep "remaining active path" /var/log/messages
Jan 31 10:27:48 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 3
Jan 31 10:27:49 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 2
Jan 31 10:27:56 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 3
Jan 31 10:27:56 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 2
Jan 31 10:27:56 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 1
Jan 31 10:27:57 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 2
Jan 31 10:28:01 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 3
Jan 31 10:28:01 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 4

I'm not an expert on this storage array in particular, or on the rdac hardware handler in general.

What I see is that multipath.conf on both nodes is:

# VDSM REVISION 1.3

defaults {
    polling_interval            5
    no_path_retry               fail
    user_friendly_names         no
    flush_on_last_del           yes
    fast_io_fail_tmo            5
    dev_loss_tmo                30
    max_fds                     4096
}

devices {
    device {
        # These settings overrides built-in devices settings. It does not apply
        # to devices without built-in settings (these use the settings in the
        # "defaults" section), or to devices defined in the "devices" section.
        # Note: This is not available yet on Fedora 21. For more info see
        # https://bugzilla.redhat.com/1253799
        all_devs                yes
        no_path_retry           fail
    }
}

Beginning of /proc/scsi/scsi:

[root@ovmsrv06 ~]# cat /proc/scsi/scsi
Attached devices:
Host: scsi1 Channel: 01 Id: 00 Lun: 00
  Vendor: HP       Model: LOGICAL VOLUME   Rev: 1.86
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi0 Channel: 00 Id: 00 Lun: 01
  Vendor: IBM      Model: 1814      FAStT  Rev: 0916
  Type:   Direct-Access                    ANSI  SCSI revision: 05
...

To get the default acquired config for this storage:

multipathd -k
> show config

I can see:

device {
    vendor "IBM"
    product "^1814"
    product_blacklist "Universal Xport"
    path_grouping_policy "group_by_prio"
    path_checker "rdac"
    features "0"
    hardware_handler "1 rdac"
    prio "rdac"
    failback immediate
    rr_weight "uniform"
    no_path_retry "fail"
}

and

defaults {
    verbosity 2
    polling_interval 5
    max_polling_interval 20
    reassign_maps "yes"
    multipath_dir "/lib64/multipath"
    path_selector "service-time 0"
    path_grouping_policy "failover"
    uid_attribute "ID_SERIAL"
    prio "const"
    prio_args ""
    features "0"
    path_checker "directio"
    alias_prefix "mpath"
    failback "manual"
    rr_min_io 1000
    rr_min_io_rq 1
    max_fds 4096
    rr_weight "uniform"
    no_path_retry "fail"
    queue_without_daemon "no"
    flush_on_last_del "yes"
    user_friendly_names "no"
    fast_io_fail_tmo 5
    dev_loss_tmo 30
    bindings_file "/etc/multipath/bindings"
    wwids_file /etc/multipath/wwids
    log_checker_err always
    find_multipaths no
    retain_attached_hw_handler no
    detect_prio no
    hw_str_match no
    force_sync no
    deferred_remove no
    ignore_new_boot_devs no
    skip_kpartx no
    config_dir "/etc/multipath/conf.d"
    delay_watch_checks no
    delay_wait_checks no
    retrigger_tries 3
    retrigger_delay 10
    missing_uev_wait_timeout 30
    new_bindings_in_boot no
}

Any hint on how to tune multipath.conf so that a powering-on server doesn't create problems for running VMs?

Thanks in advance,
Gianluca
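As a side note (my own reading of the logs, not something confirmed in the thread): with `no_path_retry fail`, I/O fails as soon as the map momentarily has zero usable paths, so even a brief dip below the 1-path minimum seen above would be enough for qemu to pause the guest on the resulting I/O error. A small sketch that extracts the lowest concurrent path count from the multipathd lines quoted above:

```shell
# Compute the minimum "remaining active paths" value seen by multipathd
# on ovmsrv06 (sample lines copied from /var/log/messages above).
min=$(awk '/remaining active paths/ {
    n = $NF + 0
    if (min == "" || n < min) min = n
} END { print min }' <<'EOF'
Jan 31 10:27:48 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 3
Jan 31 10:27:49 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 2
Jan 31 10:27:56 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 3
Jan 31 10:27:56 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 2
Jan 31 10:27:56 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 1
Jan 31 10:27:57 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 2
Jan 31 10:28:01 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 3
Jan 31 10:28:01 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 4
EOF
)
echo "minimum concurrent active paths logged: $min"
```

The logged minimum is 1, but multipathd only logs at checker granularity (polling_interval 5), so a transient window with 0 paths between polls would not appear here.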

Exactly the same issue here with an FC EMC storage domain...

--
Nathanaël Blanchet
Supervision réseau
Pôle Infrastructures Informatiques
227 avenue Professeur-Jean-Louis-Viala
34193 MONTPELLIER CEDEX 5
Tél. 33 (0)4 67 54 84 55
Fax 33 (0)4 67 54 84 14
blanchet@abes.fr


On Tue, Jan 31, 2017 at 6:09 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Tue, Jan 31, 2017 at 3:23 PM, Nathanaël Blanchet <blanchet@abes.fr> wrote:
exactly the same issue over here with an FC EMC storage domain...
I'm trying to mitigate it by inserting a timeout for my SAN devices, but I'm not sure of its effectiveness, as the CentOS 7 behavior of "multipathd -k" followed by "show config" seems different from CentOS 6.x. In fact, my attempt at multipath.conf is this:
# VDSM REVISION 1.3
# VDSM PRIVATE

defaults {
    polling_interval 5
    no_path_retry fail
    user_friendly_names no
    flush_on_last_del yes
    fast_io_fail_tmo 5
    dev_loss_tmo 30
    max_fds 4096
}

# Remove devices entries when overrides section is available.
devices {
    device {
        # These settings override built-in device settings. They do not apply
        # to devices without built-in settings (these use the settings in the
        # "defaults" section), or to devices defined in the "devices" section.
        # Note: This is not available yet on Fedora 21. For more info see
        # https://bugzilla.redhat.com/1253799
        all_devs yes
        no_path_retry fail
    }
    device {
        vendor "IBM"
        product "^1814"
        product_blacklist "Universal Xport"
        path_grouping_policy "group_by_prio"
        path_checker "rdac"
        features "0"
        hardware_handler "1 rdac"
        prio "rdac"
        failback immediate
        rr_weight "uniform"
        no_path_retry "12"
Hi Gianluca, This should be a number, not a string, maybe multipath is having trouble parsing this and it ignores your value?
    }
}
So I put exactly the default device config for my IBM/1814 device but no_path_retry set to 12.
Why 12? This will do 12 retries, 5 seconds each, when no path is available. This will block lvm commands for 60 seconds when no path is available, blocking other stuff in vdsm. Vdsm is not designed to handle this. I recommend a value of 4. But note that this is not related to the fact that your devices are not initialized properly after boot.
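Nir's arithmetic can be written out explicitly (an illustrative Python sketch, not vdsm code; polling_interval 5 is taken from the defaults section quoted earlier in the thread):

```python
# With all paths down, multipath retries the path checkers no_path_retry
# times, once per polling_interval, before failing queued I/O. Anything
# touching the device meanwhile (e.g. lvm commands run by vdsm) blocks
# for roughly this long.
POLLING_INTERVAL = 5  # seconds, from the defaults section in this thread

def blocking_seconds(no_path_retry: int) -> int:
    """Worst-case time I/O stays queued when no path is available."""
    return no_path_retry * POLLING_INTERVAL

print(blocking_seconds(12))  # 60 -> too long, vdsm is not designed for this
print(blocking_seconds(4))   # 20 -> the recommended grace time
```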
In CentOS 6.x, when you do something like this, "show config" gives you the modified entry only for your device section. Instead, in CentOS 7.3 it seems I get the default one for IBM/1814 anyway, and also the customized one at the end of the output....
Maybe your device configuration does not match exactly the builtin config.
Two facts:

- Before, I could reproduce the problem if I selected:
  Maintenance
  Power Mgmt --> Restart
  (tried 3 times with the same behavior)

- Instead, if I executed it in separate steps:
  Maintenance
  Power Mgmt --> Stop
  wait a moment
  Power Mgmt --> Start

  I didn't get problems (tried only one time...)
Maybe waiting a moment helps the storage/switches to clean up properly after a server is shut down? Does your power management trigger a proper shutdown? I would avoid using it for normal shutdown.
With this "new" multipath config (to be confirmed if in effect, how?) I don't get the VM paused problem even with Restart option of Power Mgmt In active host messages I see these ones when the other reboots:
Jan 31 16:50:01 ovmsrv06 systemd: Started Session 705 of user root.
Jan 31 16:50:01 ovmsrv06 systemd: Starting Session 705 of user root.
Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sde - rdac checker reports path is up
Jan 31 16:53:47 ovmsrv06 multipathd: 8:64: reinstated
Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load table [0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]
Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdo - rdac checker reports path is ghost
Jan 31 16:53:47 ovmsrv06 multipathd: 8:224: reinstated
Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdk - rdac checker reports path is up
Jan 31 16:53:47 ovmsrv06 multipathd: 8:160: reinstated
Jan 31 16:53:47 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, queueing MODE_SELECT command
Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdq - rdac checker reports path is ghost
Jan 31 16:53:47 ovmsrv06 multipathd: 65:0: reinstated
Jan 31 16:53:48 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, MODE_SELECT returned with sense 05/91/36
Jan 31 16:53:48 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, queueing MODE_SELECT command
Jan 31 16:53:49 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, MODE_SELECT returned with sense 05/91/36
Jan 31 16:53:49 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, queueing MODE_SELECT command
Jan 31 16:53:49 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, MODE_SELECT completed
Jan 31 16:53:49 ovmsrv06 kernel: sd 2:0:1:4: rdac: array Z1_DS4700, ctlr 1, queueing MODE_SELECT command
Jan 31 16:53:49 ovmsrv06 kernel: sd 2:0:1:4: rdac: array Z1_DS4700, ctlr 1, MODE_SELECT completed
Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sde - rdac checker reports path is ghost
Jan 31 16:53:52 ovmsrv06 multipathd: 8:64: reinstated
Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load table [0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]
Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdo - rdac checker reports path is up
Jan 31 16:53:52 ovmsrv06 multipathd: 8:224: reinstated
Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdk - rdac checker reports path is ghost
Jan 31 16:53:52 ovmsrv06 multipathd: 8:160: reinstated
Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdq - rdac checker reports path is up
Jan 31 16:53:52 ovmsrv06 multipathd: 65:0: reinstated
But they are not related to the multipath device dedicated to the oVirt storage domain in this case.... What makes me optimistic is the difference in these lines:
Before I got:

Jan 31 10:27:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load table [0 41943040 multipath 0 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]

Now I get:

Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load table [0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]

That is:

multipath 0 1 rdac
vs
multipath 1 queue_if_no_path 1 rdac
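The difference can also be checked mechanically: in a device-mapper multipath table line, the token right after the word "multipath" is the number of feature arguments that follow. A small helper of my own (not part of multipath-tools) to pull out the features list:

```python
def multipath_features(table_line: str) -> list[str]:
    """Extract the features list from a multipath 'load table' line.

    Layout: <start> <length> multipath <n_features> <feature...> <hwhandler...>
    """
    tokens = table_line.split()
    i = tokens.index("multipath")
    n = int(tokens[i + 1])           # count of feature tokens that follow
    return tokens[i + 2 : i + 2 + n]

# Before: no features, so I/O fails as soon as all paths are down
print(multipath_features("0 41943040 multipath 0 1 rdac 2 1 service-time 0"))  # []
# Now: queue_if_no_path means unlimited queueing
print(multipath_features("0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0"))  # ['queue_if_no_path']
```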
This is not expected, multipath is using unlimited queueing, which is the worst setup for ovirt. Maybe this is the result of using "12" instead of 12?

Anyway, looking in the multipath source, this is the default configuration for your device:

	/* DS3950 / DS4200 / DS4700 / DS5020 */
	.vendor        = "IBM",
	.product       = "^1814",
	.bl_product    = "Universal Xport",
	.pgpolicy      = GROUP_BY_PRIO,
	.checker_name  = RDAC,
	.features      = "2 pg_init_retries 50",
	.hwhandler     = "1 rdac",
	.prio_name     = PRIO_RDAC,
	.pgfailback    = -FAILBACK_IMMEDIATE,
	.no_path_retry = 30,
	},

and this is the commit that updated this (and other rdac devices):
http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commit;h=c1ed393b...

So I would try this configuration:

device {
    vendor "IBM"
    product "^1814"

    # defaults from multipathd show config
    product_blacklist "Universal Xport"
    path_grouping_policy "group_by_prio"
    path_checker "rdac"
    hardware_handler "1 rdac"
    prio "rdac"
    failback immediate
    rr_weight "uniform"

    # Based on multipath commit c1ed393b91acace284901f16954ba5c1c0d943c9
    features "2 pg_init_retries 50"

    # Default is 30 seconds, ovirt recommended value is 4 to avoid
    # blocking in vdsm. This gives 20 seconds (4 * polling_interval)
    # grace time when no path is available.
    no_path_retry 4
}

Ben, do you have any other ideas on debugging this issue and improving multipath configuration?

Nir

On Wed, Feb 1, 2017 at 8:39 AM, Nir Soffer <nsoffer@redhat.com> wrote:
Hi Gianluca,
This should be a number, not a string, maybe multipath is having trouble parsing this and it ignores your value?
I don't think so. Also because, reading the DM Multipath guide at
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/DM_Multipath/multipath_config_confirm.html
it seems that in RHEL 7.3 the "show config" command has this behaviour:

"For example, the following command sequence displays the multipath configuration, including the defaults, before exiting the console.

# multipathd -k
show config
CTRL-D
"
So the output has to include the defaults too. Anyway I changed it, see below.

<<<<< BEGIN OF PARENTHESIS
In theory it should be the same on RHEL 6.8 (see
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/DM_Multipath/multipath_config_confirm.html)
but it is not so for me on a system that is on 6.5, with device-mapper-multipath-0.4.9-93.el6.x86_64 and connected to a Netapp array.

In /usr/share/doc/device-mapper-multipath-0.4.9/multipath.conf.defaults:

#	vendor "NETAPP"
#	product "LUN.*"
#	path_grouping_policy group_by_prio
#	getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
#	path_selector "round-robin 0"
#	path_checker tur
#	features "3 queue_if_no_path pg_init_retries 50"
#	hardware_handler "0"
#	prio ontap
#	failback immediate
#	rr_weight uniform
#	rr_min_io 128
#	rr_min_io_rq 1
#	flush_on_last_del yes
#	fast_io_fail_tmo 5
#	dev_loss_tmo infinity
#	retain_attached_hw_handler yes
#	detect_prio yes
#	reload_readwrite yes
# }

My customization in multipath.conf, based on Netapp guidelines and my Netapp storage array setup:

devices {
    device {
        vendor "NETAPP"
        product "LUN.*"
        getuid_callout "/lib/udev/scsi_id -g -u -d /dev/%n"
        hardware_handler "1 alua"
        prio alua
    }
}

If I run "multipathd show config" I see only one entry for the NETAPP/LUN vendor/product, and it is a merge of the default and my custom settings:

device {
    vendor "NETAPP"
    product "LUN.*"
    path_grouping_policy group_by_prio
    getuid_callout "/lib/udev/scsi_id -g -u -d /dev/%n"
    path_selector "round-robin 0"
    path_checker tur
    features "3 queue_if_no_path pg_init_retries 50"
    hardware_handler "1 alua"
    prio alua
    failback immediate
    rr_weight uniform
    rr_min_io 128
    rr_min_io_rq 1
    flush_on_last_del yes
    fast_io_fail_tmo 5
    dev_loss_tmo infinity
    retain_attached_hw_handler yes
    detect_prio yes
    reload_readwrite yes
}

So this difference confused me when configuring multipath in CentOS 7.3. I have to see, when I update from 6.5 to 6.8, whether this changes.
<<<<< END OF PARENTHESIS
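The merge behaviour described here can be pictured as a plain dictionary update: builtin device settings survive unless the user's device section overrides the same key. A toy Python model (my own illustration of the observed 6.x behaviour, not multipathd's actual parser):

```python
# Builtin NETAPP/LUN entry (abridged from multipath.conf.defaults above)
builtin = {
    "vendor": "NETAPP",
    "product": "LUN.*",
    "getuid_callout": "/lib/udev/scsi_id --whitelisted --device=/dev/%n",
    "hardware_handler": "0",
    "prio": "ontap",
}
# User's device section from multipath.conf
custom = {
    "vendor": "NETAPP",
    "product": "LUN.*",
    "getuid_callout": "/lib/udev/scsi_id -g -u -d /dev/%n",
    "hardware_handler": "1 alua",
    "prio": "alua",
}
# One merged entry, as "show config" prints on the CentOS 6.x box:
# custom keys win, everything else is inherited from the builtin.
merged = {**builtin, **custom}
print(merged["hardware_handler"])  # 1 alua
```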
    }
}
So I put exactly the default device config for my IBM/1814 device but no_path_retry set to 12.
Why 12?
This will do 12 retries, 5 seconds each when no path is available. This will block lvm commands for 60 seconds when no path is available, blocking other stuff in vdsm. Vdsm is not designed to handle this.
I recommend value of 4.
OK.
But note that this is not related to the fact that your devices are not initialized properly after boot.
In fact it could also be an overall DS4700 problem.... The LUNs are configured with the LNX CLUSTER host type, which should be OK in theory, even if this kind of storage was never well supported with Linux. Initially one had to use proprietary IBM kernel modules/drivers. I will assess consistency and robustness through testing. I have to do a POC, and this is the hw I have, so I should at least try to get a working solution on it.
In CentOS 6.x, when you do something like this, "show config" gives you the modified entry only for your device section. Instead, in CentOS 7.3 it seems I get the default one for IBM/1814 anyway, and also the customized one at the end of the output....
Maybe your device configuration does not match exactly the builtin config.
I think it is the different behaviour as outlined above. I think you can confirm in another system where some customization has been done too...
Maybe waiting a moment helps the storage/switches to clean up properly after a server is shut down?
I think so too. When possible, if the errors repeat with the new config, I'll arrange to do stop/start instead of restart.
Does your power management trigger a proper shutdown? I would avoid using it for normal shutdown.
I have not understood what you mean exactly here... Can you elaborate?

Suppose I have to power off one hypervisor (yum update, patching, fw update or planned server room maintenance, ...); my workflow is this one, all from inside the web admin GUI:

Move running VMs in charge of the host (or delegate that to the following step)
Put host into Maintenance
Power Mgmt --> Stop

When the planned maintenance has finished:

Power Mgmt --> Start
I should see the host in maintenance
Activate

Or do you mean I should do everything from the host itself and not the GUI?
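For the record, the same workflow can be driven without the GUI; a hypothetical dry-run sketch follows, where the endpoint paths reflect my understanding of the oVirt 4.x REST API and ENGINE / HOST_ID are placeholder values, not data from this thread:

```python
# Dry-run sketch of the maintenance workflow as oVirt REST API actions.
# Each step is a POST of an <action> XML body to a host sub-resource;
# here we only build the requests instead of sending them.
ENGINE = "https://engine.example.com/ovirt-engine/api"  # placeholder
HOST_ID = "123"                                         # placeholder

def action(verb: str, body: str = "<action/>") -> tuple[str, str, str]:
    """Return (method, url, xml_body) for one workflow step."""
    return ("POST", f"{ENGINE}/hosts/{HOST_ID}/{verb}", body)

workflow = [
    action("deactivate"),                                               # Maintenance
    action("fence", "<action><fence_type>stop</fence_type></action>"),  # Power Mgmt --> Stop
    # ... planned maintenance happens here ...
    action("fence", "<action><fence_type>start</fence_type></action>"), # Power Mgmt --> Start
    action("activate"),                                                 # Activate
]
for method, url, body in workflow:
    print(method, url, body)
```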
multipath 0 1 rdac vs multipath 1 queue_if_no_path 1 rdac
This is not expected, multipath is using unlimited queueing, which is the worst setup for ovirt.
Maybe this is the result of using "12" instead of 12?
Anyway, looking in multipath source, this is the default configuration for your device:
	/* DS3950 / DS4200 / DS4700 / DS5020 */
	.vendor        = "IBM",
	.product       = "^1814",
	.bl_product    = "Universal Xport",
	.pgpolicy      = GROUP_BY_PRIO,
	.checker_name  = RDAC,
	.features      = "2 pg_init_retries 50",
	.hwhandler     = "1 rdac",
	.prio_name     = PRIO_RDAC,
	.pgfailback    = -FAILBACK_IMMEDIATE,
	.no_path_retry = 30,
	},
and this is the commit that updated this (and other rdac devices):
http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commit;h=c1ed393b91acace284901f16954ba5c1c0d943c9
So I would try this configuration:
device {
    vendor "IBM"
    product "^1814"

    # defaults from multipathd show config
    product_blacklist "Universal Xport"
    path_grouping_policy "group_by_prio"
    path_checker "rdac"
    hardware_handler "1 rdac"
    prio "rdac"
    failback immediate
    rr_weight "uniform"

    # Based on multipath commit c1ed393b91acace284901f16954ba5c1c0d943c9
    features "2 pg_init_retries 50"

    # Default is 30 seconds, ovirt recommended value is 4 to avoid
    # blocking in vdsm. This gives 20 seconds (4 * polling_interval)
    # grace time when no path is available.
    no_path_retry 4
}
Ben, do you have any other ideas on debugging this issue and improving multipath configuration?
Nir
OK. In the meantime I have applied your suggested config and restarted the 2 nodes. Let me test and see if I find any problems, running also some I/O tests. Thanks in the meantime, Gianluca

On Wed, Feb 1, 2017 at 8:22 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
OK. In the meantime I have applied your suggested config and restarted the 2 nodes. Let me test and see if I find any problems, running also some I/O tests. Thanks in the meantime, Gianluca
Quick test, without much success.

Inside the guest I run this loop:

while true
do
  time dd if=/dev/zero bs=1024k count=1024 of=/home/g.cecchi/testfile
  sleep 5
done

BTW: my home is inside the / filesystem on the guest, which has space to accommodate the 1 GB written by the dd command:

[g.cecchi@ol65 ~]$ df -h /home/g.cecchi/
Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/vg_ol65-lv_root   20G  4.9G   14G  27% /
[g.cecchi@ol65 ~]$

After about 7 rounds I get this in messages of the host where the VM is running:

Feb  1 23:31:39 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb  1 23:31:39 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb  1 23:31:44 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb  1 23:31:45 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb  1 23:31:45 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb  1 23:31:47 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb  1 23:31:50 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb  1 23:31:50 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb  1 23:31:56 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb  1 23:31:57 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!

Nothing on the other host.

In the web admin events pane:

Feb 1, 2017 11:31:44 PM VM ol65 has been paused due to no Storage space error.
Feb 1, 2017 11:31:44 PM VM ol65 has been paused.
I stop the dd loop and after some seconds:

Feb 1, 2017 11:32:32 PM VM ol65 has recovered from paused back to up

Multipath status for my device:

3600a0b8000299aa80000d08b55014119 dm-2 IBM     ,1814      FAStT
size=4.0T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| |- 0:0:1:3 sdj 8:144 active undef running
| `- 2:0:1:3 sdp 8:240 active undef running
`-+- policy='service-time 0' prio=0 status=enabled
  |- 0:0:0:3 sdd 8:48  active undef running
  `- 2:0:0:3 sdk 8:160 active undef running

In engine.log:

2017-02-01 23:22:01,449 INFO  [org.ovirt.engine.core.bll.aaa.CreateUserSessionCommand] (default task-15) [530aee87] Running command: CreateUserSessionCommand internal: false.
2017-02-01 23:22:04,011 INFO  [org.ovirt.engine.docs.utils.servlet.ContextSensitiveHelpMappingServlet] (default task-12) [] Context-sensitive help is not installed. Manual directory doesn't exist: /usr/share/ovirt-engine/manual
2017-02-01 23:31:43,936 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-10) [10c15e39] VM '932db7c7-4121-4cbe-ad8d-09e4e99b3cdd'(ol65) moved from 'Up' --> 'Paused'
2017-02-01 23:31:44,087 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-10) [10c15e39] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM ol65 has been paused.
2017-02-01 23:31:44,227 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-10) [10c15e39] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM ol65 has been paused due to no Storage space error.
2017-02-01 23:32:32,100 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler5) [1c02a371] VM '932db7c7-4121-4cbe-ad8d-09e4e99b3cdd'(ol65) moved from 'Paused' --> 'Up'
2017-02-01 23:32:32,259 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler5) [1c02a371] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM ol65 has recovered from paused back to up.

In vdsm.log of the host where the VM is running I see this:

mailbox.SPMMonitor::DEBUG::2017-02-01 23:31:37,729::storage_mailbox::733::Storage.Misc.excCmd::(_checkForMail) /usr/bin/taskset --cpu-list 0-7 dd if=/rhev/data-center/588237b8-0031-02f6-035d-000000000136/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=1024000 (cwd None)
mailbox.SPMMonitor::DEBUG::2017-02-01 23:31:37,764::storage_mailbox::733::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB) copied, 0.0061866 s, 166 MB/s\n'; <rc> = 0
periodic/8::INFO::2017-02-01 23:31:39,699::vm::1028::virt.vm::(extendDrivesIfNeeded) vmId=`932db7c7-4121-4cbe-ad8d-09e4e99b3cdd`::Requesting extension for volume 66615920-619f-4f52-ad5a-3f062c054094 on domain 922b5269-ab56-4c4d-838f-49d33427e2ab (apparent: 12884901888, capacity: 96636764160, allocated: 12355698688, physical: 12884901888)
periodic/8::DEBUG::2017-02-01 23:31:39,699::vm::1093::virt.vm::(__extendDriveVolume) vmId=`932db7c7-4121-4cbe-ad8d-09e4e99b3cdd`::Requesting an extension for the volume: {'newSize': 13958643712, 'domainID': u'922b5269-ab56-4c4d-838f-49d33427e2ab', 'name': u'vda', 'volumeID': u'66615920-619f-4f52-ad5a-3f062c054094', 'imageID': u'32b0e94d-0c8e-4d3d-a4eb-448885297e93', 'internal': False, 'poolID': u'588237b8-0031-02f6-035d-000000000136'}
periodic/8::DEBUG::2017-02-01 23:31:39,700::task::597::Storage.TaskManager.Task::(_updateState) Task=`d5169f55-8fa6-494e-bf3c-b077b922e06c`::moving from state init -> state preparing
periodic/8::INFO::2017-02-01 23:31:39,700::logUtils::49::dispatcher::(wrapper) Run and protect: sendExtendMsg(spUUID=u'588237b8-0031-02f6-035d-000000000136', volDict={'newSize': 13958643712, 'domainID': u'922b5269-ab56-4c4d-838f-49d33427e2ab', 'name': u'vda', 'volumeID': u'66615920-619f-4f52-ad5a-3f062c054094', 'imageID': u'32b0e94d-0c8e-4d3d-a4eb-448885297e93', 'internal': False, 'poolID': u'588237b8-0031-02f6-035d-000000000136'}, newSize=13958643712, callbackFunc=<bound method Vm.__afterVolumeExtension of <virt.vm.Vm object at 0x2feb5d0>>)
periodic/8::DEBUG::2017-02-01 23:31:39,701::storage_mailbox::125::Storage.SPM.Messages.Extend::(__init__) new extend msg created: domain: 922b5269-ab56-4c4d-838f-49d33427e2ab, volume: 66615920-619f-4f52-ad5a-3f062c054094
periodic/8::INFO::2017-02-01 23:31:39,701::logUtils::52::dispatcher::(wrapper) Run and protect: sendExtendMsg, Return response: None
periodic/8::DEBUG::2017-02-01 23:31:39,701::task::1193::Storage.TaskManager.Task::(prepare) Task=`d5169f55-8fa6-494e-bf3c-b077b922e06c`::finished: None
periodic/8::DEBUG::2017-02-01 23:31:39,701::task::597::Storage.TaskManager.Task::(_updateState) Task=`d5169f55-8fa6-494e-bf3c-b077b922e06c`::moving from state preparing -> state finished
periodic/8::DEBUG::2017-02-01 23:31:39,701::resourceManager::952::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
periodic/8::DEBUG::2017-02-01 23:31:39,701::resourceManager::989::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
periodic/8::DEBUG::2017-02-01 23:31:39,701::task::995::Storage.TaskManager.Task::(_decref) Task=`d5169f55-8fa6-494e-bf3c-b077b922e06c`::ref 0 aborting False
mailbox.HSMMonitor::DEBUG::2017-02-01 23:31:39,702::storage_mailbox::428::Storage.MailBox.HsmMailMonitor::(_handleMessage) HSM_MailMonitor - start: 64, end: 128, len: 4096, message(1/63): "1xtnd\xab\xe2'4\xd3I\x8f\x83MLV\xabiR+\x92\x94@\x05,\x06?Z\xadRO\x9fa Yaf000000000000340000000000000"
mailbox.HSMMonitor::DEBUG::2017-02-01 23:31:39,702::storage_mailbox::375::Storage.Misc.excCmd::(_checkForMail) /usr/bin/taskset --cpu-list 0-7 /usr/bin/dd if=/rhev/data-center/588237b8-0031-02f6-035d-000000000136/mastersd/dom_md/outbox iflag=direct,fullblock bs=512 count=8 skip=16 (cwd None)
mailbox.SPMMonitor::DEBUG::2017-02-01 23:31:39,790::storage_mailbox::733::Storage.Misc.excCmd::(_checkForMail) /usr/bin/taskset --cpu-list 0-7 dd if=/rhev/data-center/588237b8-0031-02f6-035d-000000000136/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=1024000 (cwd None)
mailbox.HSMMonitor::DEBUG::2017-02-01 23:31:39,815::storage_mailbox::375::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '8+0 records in\n8+0 records out\n4096 bytes (4.1 kB) copied, 0.0900633 s, 45.5 kB/s\n'; <rc> = 0
mailbox.HSMMonitor::INFO::2017-02-01 23:31:39,816::storage_mailbox::387::Storage.MailBox.HsmMailMonitor::(_sendMail) HSM_MailMonitor sending mail to SPM - ['/usr/bin/dd', 'of=/rhev/data-center/588237b8-0031-02f6-035d-000000000136/mastersd/dom_md/inbox', 'iflag=fullblock', 'oflag=direct', 'conv=notrunc', 'bs=512', 'seek=16']
mailbox.HSMMonitor::DEBUG::2017-02-01 23:31:39,816::storage_mailbox::86::Storage.Misc.excCmd::(_mboxExecCmd) /usr/bin/taskset --cpu-list 0-7 /usr/bin/dd of=/rhev/data-center/588237b8-0031-02f6-035d-000000000136/mastersd/dom_md/inbox iflag=fullblock oflag=direct conv=notrunc bs=512 seek=16 (cwd None)
mailbox.SPMMonitor::DEBUG::2017-02-01 23:31:39,844::storage_mailbox::733::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB) copied, 0.0290379 s, 35.3 MB/s\n'; <rc> = 0
periodic/7::INFO::2017-02-01 23:31:41,700::vm::1028::virt.vm::(extendDrivesIfNeeded) vmId=`932db7c7-4121-4cbe-ad8d-09e4e99b3cdd`::Requesting extension for volume 66615920-619f-4f52-ad5a-3f062c054094 on domain 922b5269-ab56-4c4d-838f-49d33427e2ab (apparent: 12884901888, capacity: 96636764160, allocated: 12617187328, physical: 12884901888)
...
mailbox.SPMMonitor::DEBUG::2017-02-01 23:31:42,181::storage_mailbox::733::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB) copied, 0.30031 s, 3.4 MB/s\n'; <rc> = 0
mailbox.SPMMonitor::ERROR::2017-02-01 23:31:42,182::storage_mailbox::619::Storage.MailBox.SpmMailMonitor::(_validateMailbox) SPM_MailMonitor: mailbox 2 checksum failed, not clearing mailbox, clearing newMail.
jsonrpc.Executor/6::DEBUG::2017-02-01 23:31:42,587::__init__::530::jsonrpc.JsonRpcServer::(_handle_request) Calling 'Host.getAllVmStats' in bridge with {}
jsonrpc.Executor/6::DEBUG::2017-02-01 23:31:42,589::__init__::555::jsonrpc.JsonRpcServer::(_handle_request) Return 'Host.getAllVmStats' in bridge with (suppressed)
jsonrpc.Executor/6::INFO::2017-02-01 23:31:42,590::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getAllVmStats succeeded in 0.00 seconds
libvirtEventLoop::INFO::2017-02-01 23:31:43,586::vm::4041::virt.vm::(onIOError) vmId=`932db7c7-4121-4cbe-ad8d-09e4e99b3cdd`::abnormal vm stop device virtio-disk0 error enospc
libvirtEventLoop::INFO::2017-02-01 23:31:43,586::vm::4877::virt.vm::(_logGuestCpuStatus) vmId=`932db7c7-4121-4cbe-ad8d-09e4e99b3cdd`::CPU stopped: onIOError
periodic/6::WARNING::2017-02-01 23:31:43,698::periodic::277::virt.periodic.VmDispatcher::(__call__) could not run <class 'vdsm.virt.periodic.DriveWatermarkMonitor'> on [u'932db7c7-4121-4cbe-ad8d-09e4e99b3cdd']
mailbox.HSMMonitor::DEBUG::2017-02-01 23:31:43,790::storage_mailbox::86::Storage.Misc.excCmd::(_mboxExecCmd) SUCCESS: <err> = '8+0 records in\n8+0 records out\n4096 bytes (4.1 kB) copied, 3.95001 s, 1.0 kB/s\n'; <rc> = 0

The disk of the VM is 90 GB in virtual size, thin provisioned.

Output of the commands inside the guest until Ctrl+C (I interrupted the loop as I received the VM paused error...):

[g.cecchi@ol65 ~]$ while true; do time dd if=/dev/zero bs=1024k count=1024 of=/home/g.cecchi/testfile; sleep 5; done
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 9.27353 s, 116 MB/s

real	0m9.353s
user	0m0.007s
sys	0m3.174s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 9.68965 s, 111 MB/s

real	0m9.929s
user	0m0.000s
sys	0m2.972s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 9.16642 s, 117 MB/s

real	0m9.412s
user	0m0.008s
sys	0m2.766s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 2.68209 s, 400 MB/s

real	0m2.931s
user	0m0.004s
sys	0m2.927s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 6.11601 s, 176 MB/s

real	0m9.158s
user	0m0.002s
sys	0m2.922s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 2.67673 s, 401 MB/s

real	0m2.921s
user	0m0.004s
sys	0m2.916s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.10537 s, 262 MB/s

real	0m6.885s
user	0m0.009s
sys	0m2.984s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 5.56343 s, 193 MB/s

real	0m7.866s
user	0m0.004s
sys	0m3.016s
^C

For the VM disk I get this info:

[root@ovmsrv06 vdsm]# qemu-img info /rhev/data-center/mnt/blockSD/922b5269-ab56-4c4d-838f-49d33427e2ab/images/32b0e94d-0c8e-4d3d-a4eb-448885297e93/66615920-619f-4f52-ad5a-3f062c054094
image: /rhev/data-center/mnt/blockSD/922b5269-ab56-4c4d-838f-49d33427e2ab/images/32b0e94d-0c8e-4d3d-a4eb-448885297e93/66615920-619f-4f52-ad5a-3f062c054094
file format: qcow2
virtual size: 90G (96636764160 bytes)
disk size: 0
cluster_size: 65536
Format specific information:
    compat: 0.10
    refcount bits: 16
[root@ovmsrv06 vdsm]#

Apparently disk size 0?????
In cat /rhev/data-center/mnt/blockSD/922b5269-ab56-4c4d-838f-49d33427e2ab/dom_md/metadata I get EOF DOMAIN=922b5269-ab56-4c4d-838f-49d33427e2ab CTIME=1485272737 FORMAT=COW DISKTYPE=2 LEGALITY=LEGAL SIZE=188743680 VOLTYPE=LEAF DESCRIPTION={"DiskAlias":"ol65_Disk1","DiskDescription":""} IMAGE=32b0e94d-0c8e-4d3d-a4eb-448885297e93 PUUID=00000000-0000-0000-0000-000000000000 MTIME=0 POOL_UUID= TYPE=SPARSE EOF

On Thu, Feb 2, 2017 at 1:11 AM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Wed, Feb 1, 2017 at 8:22 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
OK. In the meantime I have applied your suggested config and restarted the 2 nodes. Let me test and see if I find any problems running some I/O tests as well. Thanks in the meantime, Gianluca
Quick test without much success
Inside the guest I run this loop:

while true
do
  time dd if=/dev/zero bs=1024k count=1024 of=/home/g.cecchi/testfile
  sleep 5
done
I don't think this test is related to the issues you reported earlier. What you test here is how fast oVirt thin provisioning can extend your disk when writing zeros. We don't handle that very well: each extend needs about 4-6 seconds to complete before we refresh the LV on the host running the VM, and this is the *best* case. In the worst case it can take much longer. Also, what you tested here is how fast you can write to your VM buffer cache, since you are not using direct I/O. A better way to perform this test is:

time dd if=/dev/zero bs=1024k count=1024 of=/home/g.cecchi/testfile oflag=direct

This will give you the time to actually write data to storage. If you have a real issue with VMs pausing during writes when the VM disk has to be extended, you can enlarge the extend chunk, 1GiB by default. To use chunks of 2GiB, set:

[irs]
volume_utilization_percent = 50
volume_utilization_chunk_mb = 2048

This will extend the drive when free space is less than 1024MiB (volume_utilization_chunk_mb * (100 - volume_utilization_percent) / 100). If this is not enough, you can also use a lower volume_utilization_percent; for example, this will extend the disk in 2GiB chunks when free space is below 1536MiB:

[irs]
volume_utilization_percent = 25
volume_utilization_chunk_mb = 2048
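To make the chunk arithmetic above concrete, here is a minimal sketch of the threshold formula (the helper name threshold_mb is mine, not a vdsm function):

```shell
# Free-space threshold (in MiB) that triggers an extend, per the formula:
# volume_utilization_chunk_mb * (100 - volume_utilization_percent) / 100
threshold_mb() {
    local pct=$1 chunk=$2
    echo $(( chunk * (100 - pct) / 100 ))
}
threshold_mb 50 1024   # defaults: extend when free space drops under 512 MiB
threshold_mb 50 2048   # 2GiB chunks: threshold 1024 MiB
threshold_mb 25 2048   # lower percent: threshold 1536 MiB
```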
BTW: my home is inside / filesystem on guest that has space to accomodate 1Gb of the dd command: [g.cecchi@ol65 ~]$ df -h /home/g.cecchi/ Filesystem Size Used Avail Use% Mounted on /dev/mapper/vg_ol65-lv_root 20G 4.9G 14G 27% / [g.cecchi@ol65 ~]$
After about 7 rounds I get this in messages of the host where the VM is running:
Feb 1 23:31:39 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:39 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:44 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:45 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:45 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:47 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:50 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:50 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:56 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:57 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
This is interesting. We have seen these messages before, but could never detect the flow causing them. Are you sure you see this each time you extend your disk? If you can reproduce this, please file a bug.
Nothing on the other host. In the web admin events pane:

Feb 1, 2017 11:31:44 PM VM ol65 has been paused due to no Storage space error.
Feb 1, 2017 11:31:44 PM VM ol65 has been paused.
I stop the dd loop and after some seconds: Feb 1, 2017 11:32:32 PM VM ol65 has recovered from paused back to up
Multipath status for my device:
3600a0b8000299aa80000d08b55014119 dm-2 IBM ,1814      FAStT
size=4.0T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| |- 0:0:1:3 sdj 8:144 active undef running
| `- 2:0:1:3 sdp 8:240 active undef running
`-+- policy='service-time 0' prio=0 status=enabled
  |- 0:0:0:3 sdd 8:48  active undef running
  `- 2:0:0:3 sdk 8:160 active undef running
In engine.log
2017-02-01 23:22:01,449 INFO [org.ovirt.engine.core.bll.aaa.CreateUserSessionCommand] (default task-15) [530aee87] Running command: CreateUserSessionCommand internal: false.
2017-02-01 23:22:04,011 INFO [org.ovirt.engine.docs.utils.servlet.ContextSensitiveHelpMappingServlet] (default task-12) [] Context-sensitive help is not installed. Manual directory doesn't exist: /usr/share/ovirt-engine/manual
2017-02-01 23:31:43,936 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-10) [10c15e39] VM '932db7c7-4121-4cbe-ad8d-09e4e99b3cdd'(ol65) moved from 'Up' --> 'Paused'
2017-02-01 23:31:44,087 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-10) [10c15e39] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM ol65 has been paused.
2017-02-01 23:31:44,227 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-10) [10c15e39] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM ol65 has been paused due to no Storage space error.
2017-02-01 23:32:32,100 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler5) [1c02a371] VM '932db7c7-4121-4cbe-ad8d-09e4e99b3cdd'(ol65) moved from 'Paused' --> 'Up'
2017-02-01 23:32:32,259 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler5) [1c02a371] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM ol65 has recovered from paused back to up.
In vdsm.log of the host where the VM is running I see this:
mailbox.SPMMonitor::DEBUG::2017-02-01 23:31:37,729::storage_mailbox::733::Storage.Misc.excCmd::(_checkForMail) /usr/bin/taskset --cpu-list 0-7 dd if=/rhev/data-center/588237b8-0031-02f6-035d-000000000136/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=1024000 (cwd None)
mailbox.SPMMonitor::DEBUG::2017-02-01 23:31:37,764::storage_mailbox::733::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB) copied, 0.0061866 s, 166 MB/s\n'; <rc> = 0
periodic/8::INFO::2017-02-01 23:31:39,699::vm::1028::virt.vm::(extendDrivesIfNeeded) vmId=`932db7c7-4121-4cbe-ad8d-09e4e99b3cdd`::Requesting extension for volume 66615920-619f-4f52-ad5a-3f062c054094 on domain 922b5269-ab56-4c4d-838f-49d33427e2ab (apparent: 12884901888, capacity: 96636764160, allocated: 12355698688, physical: 12884901888)
periodic/8::DEBUG::2017-02-01 23:31:39,699::vm::1093::virt.vm::(__extendDriveVolume) vmId=`932db7c7-4121-4cbe-ad8d-09e4e99b3cdd`::Requesting an extension for the volume: {'newSize': 13958643712, 'domainID': u'922b5269-ab56-4c4d-838f-49d33427e2ab', 'name': u'vda', 'volumeID': u'66615920-619f-4f52-ad5a-3f062c054094', 'imageID': u'32b0e94d-0c8e-4d3d-a4eb-448885297e93', 'internal': False, 'poolID': u'588237b8-0031-02f6-035d-000000000136'}
periodic/8::DEBUG::2017-02-01 23:31:39,700::task::597::Storage.TaskManager.Task::(_updateState) Task=`d5169f55-8fa6-494e-bf3c-b077b922e06c`::moving from state init -> state preparing
periodic/8::INFO::2017-02-01 23:31:39,700::logUtils::49::dispatcher::(wrapper) Run and protect: sendExtendMsg(spUUID=u'588237b8-0031-02f6-035d-000000000136', volDict={'newSize': 13958643712, 'domainID': u'922b5269-ab56-4c4d-838f-49d33427e2ab', 'name': u'vda', 'volumeID': u'66615920-619f-4f52-ad5a-3f062c054094', 'imageID': u'32b0e94d-0c8e-4d3d-a4eb-448885297e93', 'internal': False, 'poolID': u'588237b8-0031-02f6-035d-000000000136'}, newSize=13958643712, callbackFunc=<bound method Vm.__afterVolumeExtension of <virt.vm.Vm object at 0x2feb5d0>>)
periodic/8::DEBUG::2017-02-01 23:31:39,701::storage_mailbox::125::Storage.SPM.Messages.Extend::(__init__) new extend msg created: domain: 922b5269-ab56-4c4d-838f-49d33427e2ab, volume: 66615920-619f-4f52-ad5a-3f062c054094
periodic/8::INFO::2017-02-01 23:31:39,701::logUtils::52::dispatcher::(wrapper) Run and protect: sendExtendMsg, Return response: None
periodic/8::DEBUG::2017-02-01 23:31:39,701::task::1193::Storage.TaskManager.Task::(prepare) Task=`d5169f55-8fa6-494e-bf3c-b077b922e06c`::finished: None
periodic/8::DEBUG::2017-02-01 23:31:39,701::task::597::Storage.TaskManager.Task::(_updateState) Task=`d5169f55-8fa6-494e-bf3c-b077b922e06c`::moving from state preparing -> state finished
periodic/8::DEBUG::2017-02-01 23:31:39,701::resourceManager::952::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
periodic/8::DEBUG::2017-02-01 23:31:39,701::resourceManager::989::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
periodic/8::DEBUG::2017-02-01 23:31:39,701::task::995::Storage.TaskManager.Task::(_decref) Task=`d5169f55-8fa6-494e-bf3c-b077b922e06c`::ref 0 aborting False
mailbox.HSMMonitor::DEBUG::2017-02-01 23:31:39,702::storage_mailbox::428::Storage.MailBox.HsmMailMonitor::(_handleMessage) HSM_MailMonitor - start: 64, end: 128, len: 4096, message(1/63): "1xtnd\xab\xe2'4\xd3I\x8f\x83MLV\xabiR+\x92\x94@\x05,\x06?Z\xadRO\x9fa Yaf000000000000340000000000000"
mailbox.HSMMonitor::DEBUG::2017-02-01 23:31:39,702::storage_mailbox::375::Storage.Misc.excCmd::(_checkForMail) /usr/bin/taskset --cpu-list 0-7 /usr/bin/dd if=/rhev/data-center/588237b8-0031-02f6-035d-000000000136/mastersd/dom_md/outbox iflag=direct,fullblock bs=512 count=8 skip=16 (cwd None)
mailbox.SPMMonitor::DEBUG::2017-02-01 23:31:39,790::storage_mailbox::733::Storage.Misc.excCmd::(_checkForMail) /usr/bin/taskset --cpu-list 0-7 dd if=/rhev/data-center/588237b8-0031-02f6-035d-000000000136/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=1024000 (cwd None)
mailbox.HSMMonitor::DEBUG::2017-02-01 23:31:39,815::storage_mailbox::375::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '8+0 records in\n8+0 records out\n4096 bytes (4.1 kB) copied, 0.0900633 s, 45.5 kB/s\n'; <rc> = 0
mailbox.HSMMonitor::INFO::2017-02-01 23:31:39,816::storage_mailbox::387::Storage.MailBox.HsmMailMonitor::(_sendMail) HSM_MailMonitor sending mail to SPM - ['/usr/bin/dd', 'of=/rhev/data-center/588237b8-0031-02f6-035d-000000000136/mastersd/dom_md/inbox', 'iflag=fullblock', 'oflag=direct', 'conv=notrunc', 'bs=512', 'seek=16']
mailbox.HSMMonitor::DEBUG::2017-02-01 23:31:39,816::storage_mailbox::86::Storage.Misc.excCmd::(_mboxExecCmd) /usr/bin/taskset --cpu-list 0-7 /usr/bin/dd of=/rhev/data-center/588237b8-0031-02f6-035d-000000000136/mastersd/dom_md/inbox iflag=fullblock oflag=direct conv=notrunc bs=512 seek=16 (cwd None)
mailbox.SPMMonitor::DEBUG::2017-02-01 23:31:39,844::storage_mailbox::733::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB) copied, 0.0290379 s, 35.3 MB/s\n'; <rc> = 0
periodic/7::INFO::2017-02-01 23:31:41,700::vm::1028::virt.vm::(extendDrivesIfNeeded) vmId=`932db7c7-4121-4cbe-ad8d-09e4e99b3cdd`::Requesting extension for volume 66615920-619f-4f52-ad5a-3f062c054094 on domain 922b5269-ab56-4c4d-838f-49d33427e2ab (apparent: 12884901888, capacity: 96636764160, allocated: 12617187328, physical: 12884901888)
...
mailbox.SPMMonitor::DEBUG::2017-02-01 23:31:42,181::storage_mailbox::733::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB) copied, 0.30031 s, 3.4 MB/s\n'; <rc> = 0
mailbox.SPMMonitor::ERROR::2017-02-01 23:31:42,182::storage_mailbox::619::Storage.MailBox.SpmMailMonitor::(_validateMailbox) SPM_MailMonitor: mailbox 2 checksum failed, not clearing mailbox, clearing newMail.
jsonrpc.Executor/6::DEBUG::2017-02-01 23:31:42,587::__init__::530::jsonrpc.JsonRpcServer::(_handle_request) Calling 'Host.getAllVmStats' in bridge with {}
jsonrpc.Executor/6::DEBUG::2017-02-01 23:31:42,589::__init__::555::jsonrpc.JsonRpcServer::(_handle_request) Return 'Host.getAllVmStats' in bridge with (suppressed)
jsonrpc.Executor/6::INFO::2017-02-01 23:31:42,590::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getAllVmStats succeeded in 0.00 seconds
libvirtEventLoop::INFO::2017-02-01 23:31:43,586::vm::4041::virt.vm::(onIOError) vmId=`932db7c7-4121-4cbe-ad8d-09e4e99b3cdd`::abnormal vm stop device virtio-disk0 error enospc
libvirtEventLoop::INFO::2017-02-01 23:31:43,586::vm::4877::virt.vm::(_logGuestCpuStatus) vmId=`932db7c7-4121-4cbe-ad8d-09e4e99b3cdd`::CPU stopped: onIOError
periodic/6::WARNING::2017-02-01 23:31:43,698::periodic::277::virt.periodic.VmDispatcher::(__call__) could not run <class 'vdsm.virt.periodic.DriveWatermarkMonitor'> on [u'932db7c7-4121-4cbe-ad8d-09e4e99b3cdd']
mailbox.HSMMonitor::DEBUG::2017-02-01 23:31:43,790::storage_mailbox::86::Storage.Misc.excCmd::(_mboxExecCmd) SUCCESS: <err> = '8+0 records in\n8+0 records out\n4096 bytes (4.1 kB) copied, 3.95001 s, 1.0 kB/s\n'; <rc> = 0
The disk of the VM is 90Gb in virtual size, thin provisioned
Output of the commands inside the guest until Ctrl+C (I interrupted the loop as I received the VM paused error...):
[g.cecchi@ol65 ~]$ while true; do time dd if=/dev/zero bs=1024k count=1024 of=/home/g.cecchi/testfile; sleep 5; done
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 9.27353 s, 116 MB/s

real 0m9.353s
user 0m0.007s
sys 0m3.174s

1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 9.68965 s, 111 MB/s

real 0m9.929s
user 0m0.000s
sys 0m2.972s

1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 9.16642 s, 117 MB/s

real 0m9.412s
user 0m0.008s
sys 0m2.766s

1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 2.68209 s, 400 MB/s

real 0m2.931s
user 0m0.004s
sys 0m2.927s

1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 6.11601 s, 176 MB/s

real 0m9.158s
user 0m0.002s
sys 0m2.922s

1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 2.67673 s, 401 MB/s

real 0m2.921s
user 0m0.004s
sys 0m2.916s

1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.10537 s, 262 MB/s

real 0m6.885s
user 0m0.009s
sys 0m2.984s

1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 5.56343 s, 193 MB/s

real 0m7.866s
user 0m0.004s
sys 0m3.016s
^C
For the VM disk I get this info:
[root@ovmsrv06 vdsm]# qemu-img info /rhev/data-center/mnt/blockSD/922b5269-ab56-4c4d-838f-49d33427e2ab/images/32b0e94d-0c8e-4d3d-a4eb-448885297e93/66615920-619f-4f52-ad5a-3f062c054094
image: /rhev/data-center/mnt/blockSD/922b5269-ab56-4c4d-838f-49d33427e2ab/images/32b0e94d-0c8e-4d3d-a4eb-448885297e93/66615920-619f-4f52-ad5a-3f062c054094
file format: qcow2
virtual size: 90G (96636764160 bytes)
disk size: 0
cluster_size: 65536
Format specific information:
    compat: 0.10
    refcount bits: 16
[root@ovmsrv06 vdsm]#
Apparently disk size 0?????
Qemu probably cannot report the disk size on block devices.
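For illustration: "disk size" in qemu-img info is the allocated host size (roughly the allocated blocks times 512), not the apparent size. A sparse file shows the same effect; a block device similarly exposes no allocation information, which is one plausible reason for the "disk size: 0" above. A small sketch (temp file path is arbitrary):

```shell
# A sparse file has a large apparent size but (nearly) zero allocated blocks,
# mirroring the "disk size: 0" reported by qemu-img above.
f="$(mktemp)"
truncate -s 1G "$f"    # 1 GiB apparent size, nothing actually written
echo "apparent:  $(stat -c %s "$f") bytes"
echo "allocated: $(( $(stat -c %b "$f") * 512 )) bytes"
rm -f "$f"
```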
In cat /rhev/data-center/mnt/blockSD/922b5269-ab56-4c4d-838f-49d33427e2ab/dom_md/metadata
I get

EOF
DOMAIN=922b5269-ab56-4c4d-838f-49d33427e2ab
CTIME=1485272737
FORMAT=COW
DISKTYPE=2
LEGALITY=LEGAL
SIZE=188743680
VOLTYPE=LEAF
DESCRIPTION={"DiskAlias":"ol65_Disk1","DiskDescription":""}
IMAGE=32b0e94d-0c8e-4d3d-a4eb-448885297e93
PUUID=00000000-0000-0000-0000-000000000000
MTIME=0
POOL_UUID=
TYPE=SPARSE
EOF

On Thu, Feb 2, 2017 at 10:48 AM, Nir Soffer <nsoffer@redhat.com> wrote:
On Thu, Feb 2, 2017 at 1:11 AM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Wed, Feb 1, 2017 at 8:22 PM, Gianluca Cecchi < gianluca.cecchi@gmail.com> wrote:
OK. In the meantime I have applied your suggested config and restarted the 2 nodes. Let me test and see if I find any problems running some I/O tests as well. Thanks in the meantime, Gianluca
Quick test without much success
Inside the guest I run this loop:

while true
do
  time dd if=/dev/zero bs=1024k count=1024 of=/home/g.cecchi/testfile
  sleep 5
done
I don't think this test is related to the issues you reported earlier.
I thought the same too, and all the related comments you wrote. I'm going to test the suggested modifications for chunks. In general, do you recommend thin provisioning at all on SAN storage? I decided to switch to preallocated for further tests and confirm. So I created a snapshot and then a clone of the VM, changing the allocation policy of the disk to preallocated. So far so good.

Feb 2, 2017 10:40:23 AM VM ol65preallocated creation has been completed.
Feb 2, 2017 10:24:15 AM VM ol65preallocated creation was initiated by admin@internal-authz.
Feb 2, 2017 10:22:31 AM Snapshot 'for cloning' creation for VM 'ol65' has been completed.
Feb 2, 2017 10:22:31 AM Snapshot 'for cloning' creation for VM 'ol65' was initiated by admin@internal-authz.

So the throughput seems OK for this storage type (the LUNs are on RAID5 made with SATA disks): 16 minutes to write 90GB is about 96MBytes/s, as expected.

What I see in messages during the cloning phase from 10:24 to 10:40:

Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:14 ovmsrv05 journal: vdsm root WARNING File: /rhev/data-center/588237b8-0031-02f6-035d-000000000136/922b5269-ab56-4c4d-838f-49d33427e2ab/images/9d1c977f-540d-436a-9d93-b1cb0816af2a/607dbf59-7d4d-4fc3-ae5f-e8824bf82648 already removed
Feb 2 10:24:14 ovmsrv05 multipathd: dm-15: remove map (uevent)
Feb 2 10:24:14 ovmsrv05 multipathd: dm-15: devmap not registered, can't remove
Feb 2 10:24:14 ovmsrv05 multipathd: dm-15: remove map (uevent)
Feb 2 10:24:17 ovmsrv05 kernel: blk_update_request: critical target error, dev dm-4, sector 44566529
Feb 2 10:24:17 ovmsrv05 kernel: dm-15: WRITE SAME failed. Manually zeroing.
Feb 2 10:40:07 ovmsrv05 kernel: scsi_verify_blk_ioctl: 16 callbacks suppressed
Feb 2 10:40:07 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:40:17 ovmsrv05 multipathd: dm-15: remove map (uevent)
Feb 2 10:40:17 ovmsrv05 multipathd: dm-15: devmap not registered, can't remove
Feb 2 10:40:17 ovmsrv05 multipathd: dm-15: remove map (uevent)
Feb 2 10:40:22 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
After about 7 rounds I get this in messages of the host where the VM is running:
Feb 1 23:31:39 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:39 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:44 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:45 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:45 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:47 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:50 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:50 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:56 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:57 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
This is interesting. We have seen these messages before, but could never detect the flow causing them. Are you sure you see this each time you extend your disk?
If you can reproduce this, please file a bug.
OK, see also above the message registered during the clone phase. Gianluca

On Thu, Feb 2, 2017 at 12:04 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Thu, Feb 2, 2017 at 10:48 AM, Nir Soffer <nsoffer@redhat.com> wrote:
On Thu, Feb 2, 2017 at 1:11 AM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Wed, Feb 1, 2017 at 8:22 PM, Gianluca Cecchi < gianluca.cecchi@gmail.com> wrote:
OK. In the meantime I have applied your suggested config and restarted the 2 nodes. Let me test and see if I find any problems running some I/O tests as well. Thanks in the meantime, Gianluca
Quick test without much success
Inside the guest I run this loop while true do time dd if=/dev/zero bs=1024k count=1024 of=/home/g.cecchi/testfile
A single 'dd' rarely saturates high-performance storage. There are better utilities for testing ('fio', 'vdbench' and 'ddpt', for example). It's also testing a very theoretical scenario: you rarely write zeros, and you rarely write this much sequential IO with a fixed block size. So it's almost 'hero numbers'.
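For example, a roughly equivalent sequential-write job with fio might look like the sketch below (job name, target path and size are illustrative; guarded so it degrades gracefully on a host without fio or without direct I/O support on the target filesystem):

```shell
# Sequential 1MiB writes with direct I/O, the fio analogue of the dd loop above.
if command -v fio >/dev/null 2>&1; then
    fio --name=seqwrite --filename="$(mktemp)" \
        --rw=write --bs=1M --size=64M --direct=1 --ioengine=psync \
        || echo "direct I/O not supported on this filesystem"
else
    echo "fio not installed"
fi
```

fio also reports latency percentiles, which matter more than peak bandwidth when the question is whether writes stall during LV extends.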
sleep 5
done
I don't think this test is related to the issues you reported earlier.
I thought the same too, and all related comments you wrote. I'm going to test the suggested modifications for chunks. In general do you recommend thin provisioning at all on SAN storage?
Depends on your SAN. On a thin-provisioned one (with potentially inline dedup and compression, such as XtremIO, Pure, Nimble and others) I don't see great value in thin provisioning.
I decided to switch to preallocated for further tests and confirm. So I created a snapshot and then a clone of the VM, changing the allocation policy of the disk to preallocated. So far so good.
Feb 2, 2017 10:40:23 AM VM ol65preallocated creation has been completed.
Feb 2, 2017 10:24:15 AM VM ol65preallocated creation was initiated by admin@internal-authz.
Feb 2, 2017 10:22:31 AM Snapshot 'for cloning' creation for VM 'ol65' has been completed.
Feb 2, 2017 10:22:31 AM Snapshot 'for cloning' creation for VM 'ol65' was initiated by admin@internal-authz.
So the throughput seems OK for this storage type (the LUNs are on RAID5 made with SATA disks): 16 minutes to write 90GB is about 96MBytes/s, as expected.
What is your expectation? Is it FC, iSCSI? How many paths? What is the IO scheduler in the VM? Is it using virtio-blk or virtio-SCSI? Y.
What I see in messages during the cloning phase from 10:24 to 10:40:
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:14 ovmsrv05 journal: vdsm root WARNING File: /rhev/data-center/588237b8-0031-02f6-035d-000000000136/922b5269-ab56-4c4d-838f-49d33427e2ab/images/9d1c977f-540d-436a-9d93-b1cb0816af2a/607dbf59-7d4d-4fc3-ae5f-e8824bf82648 already removed
Feb 2 10:24:14 ovmsrv05 multipathd: dm-15: remove map (uevent)
Feb 2 10:24:14 ovmsrv05 multipathd: dm-15: devmap not registered, can't remove
Feb 2 10:24:14 ovmsrv05 multipathd: dm-15: remove map (uevent)
Feb 2 10:24:17 ovmsrv05 kernel: blk_update_request: critical target error, dev dm-4, sector 44566529
Feb 2 10:24:17 ovmsrv05 kernel: dm-15: WRITE SAME failed. Manually zeroing.
Feb 2 10:40:07 ovmsrv05 kernel: scsi_verify_blk_ioctl: 16 callbacks suppressed
Feb 2 10:40:07 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:40:17 ovmsrv05 multipathd: dm-15: remove map (uevent)
Feb 2 10:40:17 ovmsrv05 multipathd: dm-15: devmap not registered, can't remove
Feb 2 10:40:17 ovmsrv05 multipathd: dm-15: remove map (uevent)
Feb 2 10:40:22 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
After about 7 rounds I get this in messages of the host where the VM is running:
Feb 1 23:31:39 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:39 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:44 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:45 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:45 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:47 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:50 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:50 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:56 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:57 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
This is interesting. We have seen these messages before, but could never detect the flow causing them. Are you sure you see this each time you extend your disk?
If you can reproduce this, please file a bug.
OK, see also above the message registered during the clone phase. Gianluca
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users

On Thu, Feb 2, 2017 at 12:09 PM, Yaniv Kaul <ykaul@redhat.com> wrote:
I decided to switch to preallocated for further tests and confirm. So I created a snapshot and then a clone of the VM, changing the allocation policy of the disk to preallocated. So far so good.
Feb 2, 2017 10:40:23 AM VM ol65preallocated creation has been completed.
Feb 2, 2017 10:24:15 AM VM ol65preallocated creation was initiated by admin@internal-authz.
Feb 2, 2017 10:22:31 AM Snapshot 'for cloning' creation for VM 'ol65' has been completed.
Feb 2, 2017 10:22:31 AM Snapshot 'for cloning' creation for VM 'ol65' was initiated by admin@internal-authz.
So the throughput seems OK for this storage type (the LUNs are on RAID5 made with SATA disks): 16 minutes to write 90GB is about 96MBytes/s, as expected.
What is your expectation? Is it FC, iSCSI? How many paths? What is the IO scheduler in the VM? Is it using virtio-blk or virtio-SCSI? Y.
Peak bandwidth no more than 140 MBytes/s, based on storage capabilities, but I don't need to do a crude performance test; I need stability. Each host has a mezzanine dual-port HBA (4 Gbit); each HBA is connected to a different FC-switch, and the multipath connection has 2 active paths (one for each HBA). I confirm that with the preallocated disk of the cloned VM I indeed don't have the previous problems. The same loop executed about 66 times in a 10-minute interval without any problem registered on the hosts: no message at all in /var/log/messages of either host, and my storage domain is not compromised. The question about thin provisioning and SAN LUNs (i.e. with LVM-based disks) remains important. In my opinion I shouldn't have to care about the kind of I/O done inside a VM, and in any case it shouldn't interfere with my storage domain, bringing down my hosts/VMs completely. In theory there could be an application inside a VM that generates something similar to my loop and so would cause problems. For sure I can then notify the VM owner about his/her workload, but it should not compromise my virtual infrastructure. I could have an RDBMS inside a VM and a user that creates a big datafile, and that would imply many extend operations if the disk is thin provisioned.... What about the [irs] values? Where are they located, in vdsm.conf? What are the defaults for volume_utilization_percent and volume_utilization_chunk_mb? Did they change from 3.6 to 4.0 to 4.1? What should I do after changing them to make them active? Thanks in advance, Gianluca

On Thu, Feb 2, 2017 at 1:34 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Thu, Feb 2, 2017 at 12:09 PM, Yaniv Kaul <ykaul@redhat.com> wrote:
I decided to switch to preallocated for further tests and confirm. So I created a snapshot and then a clone of the VM, changing the allocation policy of the disk to preallocated. So far so good.
Feb 2, 2017 10:40:23 AM VM ol65preallocated creation has been completed.
Feb 2, 2017 10:24:15 AM VM ol65preallocated creation was initiated by admin@internal-authz.
Feb 2, 2017 10:22:31 AM Snapshot 'for cloning' creation for VM 'ol65' has been completed.
Feb 2, 2017 10:22:31 AM Snapshot 'for cloning' creation for VM 'ol65' was initiated by admin@internal-authz.
So the throughput seems OK for this storage type (the LUNs are on RAID5 made with SATA disks): 16 minutes to write 90GB is about 96MBytes/s, as expected.
What is your expectation? Is it FC, iSCSI? How many paths? What is the IO scheduler in the VM? Is it using virtio-blk or virtio-SCSI? Y.
Peak bandwidth no more than 140 MBytes/s, based on storage capabilities, but I don't need to do a crude performance test; I need stability. Each host has a mezzanine dual-port HBA (4 Gbit); each HBA is connected to a different FC-switch, and the multipath connection has 2 active paths (one for each HBA).
I confirm that with the preallocated disk of the cloned VM I indeed don't have the previous problems. The same loop executed about 66 times in a 10-minute interval without any problem registered on the hosts: no message at all in /var/log/messages of either host, and my storage domain is not compromised. The question about thin provisioning and SAN LUNs (i.e. with LVM-based disks) remains important. In my opinion I shouldn't have to care about the kind of I/O done inside a VM, and in any case it shouldn't interfere with my storage domain, bringing down my hosts/VMs completely. In theory there could be an application inside a VM that generates something similar to my loop and so would cause problems. For sure I can then notify the VM owner about his/her workload, but it should not compromise my virtual infrastructure. I could have an RDBMS inside a VM and a user that creates a big datafile, and that would imply many extend operations if the disk is thin provisioned....
What about [irs] values? Where are they located, in vdsm.conf?
Yes but you should not modify them in vdsm.conf.
What are defaults for volume_utilization_percent and volume_utilization_chunk_mb? Did they change from 3.6 to 4.0 to 4.1?
No, the defaults did not change in the last 3.5 years. In 4.0 we introduced dropin support, and this is the recommended way to perform configuration changes. To change these values, create a file at /etc/vdsm/vdsm.conf.d/50_my.conf. The name of the file does not matter: vdsm will read all files in the vdsm.conf.d directory, sort them by name (this is why you should use the 50_ prefix), and apply the changes to the configuration. In this file you put the sections and options you need, like:

[irs]
volume_utilization_percent = 25
volume_utilization_chunk_mb = 2048
What should I do after changing them to make them active?
Restart vdsm. Using this method, you can provision the same file on all hosts using standard provisioning tools. It is not recommended to modify /etc/vdsm/vdsm.conf: if you do, you will have to manually merge changes from vdsm.conf.rpmnew after upgrading vdsm. Nir
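Nir's drop-in approach above can be sketched as a small provisioning script. This is a sketch using the paths and option values quoted in the thread; CONF_DIR defaults to a local directory so it can be tried without root, whereas on a real host it would be /etc/vdsm/vdsm.conf.d:

```shell
# Sketch: write a vdsm config drop-in, then restart vdsm to apply it.
# CONF_DIR is parameterized so this can be tried outside a real host.
CONF_DIR="${CONF_DIR:-./vdsm.conf.d}"
mkdir -p "$CONF_DIR"
cat > "$CONF_DIR/50_my.conf" <<'EOF'
[irs]
volume_utilization_percent = 25
volume_utilization_chunk_mb = 2048
EOF
# On a real host (as root), apply the change with:
#   systemctl restart vdsmd
```

Because the drop-in is a single self-contained file, the same file can be pushed to every host by any standard provisioning tool, which is exactly the point of the mechanism.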

On Thu, Feb 2, 2017 at 12:48 PM, Nir Soffer <nsoffer@redhat.com> wrote:
What about [irs] values? Where are they located, in vdsm.conf?
Yes but you should not modify them in vdsm.conf.
What are defaults for volume_utilization_percent and volume_utilization_chunk_mb? Did they change from 3.6 to 4.0 to 4.1?
No, the defaults did not change in the last 3.5 years.
OK. In a previous message of this thread you wrote that the default volume_utilization_chunk_mb is 1024. What about the default for volume_utilization_percent? 50?
What should I do after changing them to make them active?
Restart vdsm
With the host in maintenance mode, or can it be active? Thanks

On Thu, Feb 2, 2017 at 4:16 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Thu, Feb 2, 2017 at 12:48 PM, Nir Soffer <nsoffer@redhat.com> wrote:
What about [irs] values? Where are they located, in vdsm.conf?
Yes but you should not modify them in vdsm.conf.
What are defaults for volume_utilization_percent and volume_utilization_chunk_mb? Did they change from 3.6 to 4.0 to 4.1?
No, the defaults did not change in the last 3.5 years.
OK. In a previous message of this thread you wrote that the default volume_utilization_chunk_mb is 1024. What about the default for volume_utilization_percent? 50?
Yes, this should be a commented-out option in /etc/vdsm/vdsm.conf. If you don't have a vdsm.conf file, or the file is empty, you can generate a new file like this:

python /usr/lib64/python2.7/site-packages/vdsm/config.py > vdsm.conf.example

What should I do after changing them to make them active?
Restart vdsm
With the host in maintenance mode, or can it be active?
Thanks

On Thu, Feb 2, 2017 at 3:30 PM, Nir Soffer <nsoffer@redhat.com> wrote:
If you don't have a vdsm.conf file, or the file is empty, you can generate a new file like this:
python /usr/lib64/python2.7/site-packages/vdsm/config.py > vdsm.conf.example
Thanks. It seems that the package python-libs-2.7.5-48.el7.x86_64 actually uses the /usr/lib path instead of /usr/lib64... What worked:

python /usr/lib/python2.7/site-packages/vdsm/config.py > vdsm.conf.example
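As the lib vs lib64 confusion shows, the install path is better asked from Python itself than guessed. A small sketch of that technique, demonstrated with a stdlib module (on an oVirt host you would pass vdsm.config instead; note the thread's hosts run python2, while this uses python3 for illustration):

```shell
# Print the directory a Python module is installed in, instead of
# hardcoding /usr/lib vs /usr/lib64.
module_dir() {
    python3 -c 'import importlib, os, sys
m = importlib.import_module(sys.argv[1])
print(os.path.dirname(m.__file__))' "$1"
}

module_dir json   # stdlib example; on an oVirt host: module_dir vdsm.config
```

This sidesteps the distribution-specific choice of /usr/lib versus /usr/lib64 entirely.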
What should I do after changing them to make them active?
Restart vdsm
With the host in maintenance mode, or can it be active?
Thanks
Can you confirm that the host can be active when I restart the vdsmd service? Thanks Gianluca

On Thu, Feb 2, 2017 at 4:42 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Thu, Feb 2, 2017 at 3:30 PM, Nir Soffer <nsoffer@redhat.com> wrote:
If you don't have a vdsm.conf file, or the file is empty, you can generate a new file like this:
python /usr/lib64/python2.7/site-packages/vdsm/config.py > vdsm.conf.example
Thanks. It seems that the package python-libs-2.7.5-48.el7.x86_64 actually uses the /usr/lib path instead of /usr/lib64... What worked:
python /usr/lib/python2.7/site-packages/vdsm/config.py > vdsm.conf.example
Indeed, my mistake, we moved to lib several versions ago.
What should I do after changing them to make them active?
Restart vdsm
With the host in maintenance mode, or can it be active?
Thanks
Can you confirm that the host can be active when I restart the vdsmd service?
Sure. This may abort a storage operation if one is running when you restart vdsm, but vdsm is designed so you can restart or kill it safely. For example, if you abort a disk copy in the middle, the operation will fail and the destination disk will be deleted. If you want to avoid such issues, you can put the host into maintenance, but this requires migrating VMs to other hosts. Nir

On Thu, Feb 2, 2017 at 3:51 PM, Nir Soffer <nsoffer@redhat.com> wrote:
Can you confirm that the host can be active when I restart the vdsmd service?
Sure. This may abort a storage operation if one is running when you restart vdsm, but vdsm is designed so you can restart or kill it safely.
For example, if you abort a disk copy in the middle, the operation will fail and the destination disk will be deleted.
If you want to avoid such issues, you can put the host into maintenance, but this requires migrating VMs to other hosts.
Nir
OK. Created 50_thin_block_extension_rules.conf under /etc/vdsm/vdsm.conf.d and restarted vdsmd.

One last (probably the last... ;-) question: is it expected that if I restart vdsmd on the host that is the SPM, the SPM role is shifted to another node? When restarting vdsmd on the host that is not the SPM I didn't get any message in the web admin GUI, and the restart of vdsmd itself was very fast. Instead, on the host holding the SPM the command took several seconds and I got these events:

Feb 2, 2017 4:01:23 PM Host ovmsrv05 power management was verified successfully.
Feb 2, 2017 4:01:23 PM Status of host ovmsrv05 was set to Up.
Feb 2, 2017 4:01:19 PM Executing power management status on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Feb 2, 2017 4:01:18 PM Storage Pool Manager runs on Host ovmsrv06 (Address: ovmsrv06.datacenter.polimi.it).
Feb 2, 2017 4:01:13 PM VDSM ovmsrv05 command failed: Recovering from crash or Initializing
Feb 2, 2017 4:01:11 PM Host ovmsrv05 is initializing. Message: Recovering from crash or Initializing
Feb 2, 2017 4:01:11 PM VDSM ovmsrv05 command failed: Recovering from crash or Initializing
Feb 2, 2017 4:01:11 PM Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host ovmsrv05, Error: Recovering from crash or Initializing).
Feb 2, 2017 4:01:11 PM VDSM ovmsrv05 command failed: Recovering from crash or Initializing
Feb 2, 2017 4:01:05 PM Host ovmsrv05 is not responding. It will stay in Connecting state for a grace period of 80 seconds and after that an attempt to fence the host will be issued.
Feb 2, 2017 4:01:05 PM Host ovmsrv05 is not responding. It will stay in Connecting state for a grace period of 80 seconds and after that an attempt to fence the host will be issued.
Feb 2, 2017 4:01:05 PM VDSM ovmsrv05 command failed: Connection reset by peer

Thanks, Gianluca

On Thu, Feb 2, 2017 at 6:05 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Thu, Feb 2, 2017 at 3:51 PM, Nir Soffer <nsoffer@redhat.com> wrote:
Can you confirm that the host can be active when I restart vdsmd service?
Sure. This may abort a storage operation if one is running when you restart vdsm, but vdsm is designed so you can restart or kill it safely.
For example, if you abort a disk copy in the middle, the operation will fail and the destination disk will be deleted.
If you want to avoid such issue, you can put a host to maintenance, but this requires migration of vms to other hosts.
Nir
OK. Created 50_thin_block_extension_rules.conf under /etc/vdsm/vdsm.conf.d and restarted vdsmd.
One last (probably the last... ;-) question: is it expected that if I restart vdsmd on the host that is the SPM, the SPM is shifted to another node?
Yes, the engine will move the SPM to another host when the SPM fails, unless you have disabled the SPM role for every other host (see the host > SPM tab).
Because when restarting vdsmd on the host that is not the SPM I didn't get any message in the web admin GUI, and the restart of vdsmd itself was very fast. Instead, on the host with the SPM the command took several seconds and I got these events:
It is expected that restarting the SPM is slower, but we need to see vdsm logs to understand why.

Feb 2, 2017 4:01:23 PM Host ovmsrv05 power management was verified successfully.
Feb 2, 2017 4:01:23 PM Status of host ovmsrv05 was set to Up.
Feb 2, 2017 4:01:19 PM Executing power management status on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Feb 2, 2017 4:01:18 PM Storage Pool Manager runs on Host ovmsrv06 (Address: ovmsrv06.datacenter.polimi.it).
Feb 2, 2017 4:01:13 PM VDSM ovmsrv05 command failed: Recovering from crash or Initializing
Feb 2, 2017 4:01:11 PM Host ovmsrv05 is initializing. Message: Recovering from crash or Initializing
Feb 2, 2017 4:01:11 PM VDSM ovmsrv05 command failed: Recovering from crash or Initializing
Feb 2, 2017 4:01:11 PM Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host ovmsrv05, Error: Recovering from crash or Initializing).
Feb 2, 2017 4:01:11 PM VDSM ovmsrv05 command failed: Recovering from crash or Initializing
Feb 2, 2017 4:01:05 PM Host ovmsrv05 is not responding. It will stay in Connecting state for a grace period of 80 seconds and after that an attempt to fence the host will be issued.
Feb 2, 2017 4:01:05 PM Host ovmsrv05 is not responding. It will stay in Connecting state for a grace period of 80 seconds and after that an attempt to fence the host will be issued.
Feb 2, 2017 4:01:05 PM VDSM ovmsrv05 command failed: Connection reset by peer

It looks like the engine discovered that the SPM was down, and reconnected. It is expected that changes in the SPM status are detected early and that the engine tries to recover the SPM; the SPM role is critical in oVirt. Are you sure you did not get any message when restarting the other host? I would expect the engine to detect and report a restart of any host. If you can reproduce this (restarting vdsm not being detected by the engine and not reported in the engine event log), please file a bug. Nir

On Thu, Feb 2, 2017 at 12:04 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Thu, Feb 2, 2017 at 10:48 AM, Nir Soffer <nsoffer@redhat.com> wrote:
On Thu, Feb 2, 2017 at 1:11 AM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Wed, Feb 1, 2017 at 8:22 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
OK. In the meantime I have applied your suggested config and restarted the 2 nodes. Let me test and see if I find any problems, running also some I/O tests. Thanks in the meantime, Gianluca
Quick test without much success
Inside the guest I run this loop:

while true
do
    time dd if=/dev/zero bs=1024k count=1024 of=/home/g.cecchi/testfile
    sleep 5
done
I don't think this test is related to the issues you reported earlier.
I thought the same, and agree with all the related comments you wrote. I'm going to test the suggested modifications for the chunks. In general, do you recommend thin provisioning at all on SAN storage?
Only if your storage does not support thin provisioning, or you need snapshot support. If you don't need these features, using raw will be much more reliable and faster. Even if you use raw, you can still perform live storage migration: we create a snapshot using qcow2 format, copy the base raw volume to another storage, and finally delete the snapshot on the destination storage. In the future (oVirt 5?) we would like to use only smart storage thin provisioning and snapshot support.
I decided to switch to preallocated for further tests, and I confirm. I created a snapshot and then a clone of the VM, changing the allocation policy of the disk to preallocated. So far so good.
Feb 2, 2017 10:40:23 AM VM ol65preallocated creation has been completed.
Feb 2, 2017 10:24:15 AM VM ol65preallocated creation was initiated by admin@internal-authz.
Feb 2, 2017 10:22:31 AM Snapshot 'for cloning' creation for VM 'ol65' has been completed.
Feb 2, 2017 10:22:31 AM Snapshot 'for cloning' creation for VM 'ol65' was initiated by admin@internal-authz.
So the throughput seems OK for this storage type (the LUNs are on RAID5 made with SATA disks): 16 minutes to write 90 GB is about 96 MBytes/s, as expected.
What I see in messages during the cloning phase, from 10:24 to 10:40:
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:13 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:24:14 ovmsrv05 journal: vdsm root WARNING File: /rhev/data-center/588237b8-0031-02f6-035d-000000000136/922b5269-ab56-4c4d-838f-49d33427e2ab/images/9d1c977f-540d-436a-9d93-b1cb0816af2a/607dbf59-7d4d-4fc3-ae5f-e8824bf82648 already removed
Feb 2 10:24:14 ovmsrv05 multipathd: dm-15: remove map (uevent)
Feb 2 10:24:14 ovmsrv05 multipathd: dm-15: devmap not registered, can't remove
Feb 2 10:24:14 ovmsrv05 multipathd: dm-15: remove map (uevent)
Feb 2 10:24:17 ovmsrv05 kernel: blk_update_request: critical target error, dev dm-4, sector 44566529
Feb 2 10:24:17 ovmsrv05 kernel: dm-15: WRITE SAME failed. Manually zeroing.
Feb 2 10:40:07 ovmsrv05 kernel: scsi_verify_blk_ioctl: 16 callbacks suppressed
Feb 2 10:40:07 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 2 10:40:17 ovmsrv05 multipathd: dm-15: remove map (uevent)
Feb 2 10:40:17 ovmsrv05 multipathd: dm-15: devmap not registered, can't remove
Feb 2 10:40:17 ovmsrv05 multipathd: dm-15: remove map (uevent)
Feb 2 10:40:22 ovmsrv05 kernel: dd: sending ioctl 80306d02 to a partition!
Let's file a bug to investigate these "kernel: dd: sending ioctl 80306d02 to a partition!" messages. Please attach the vdsm log from the machine emitting these messages.
After about 7 rounds I get this in messages of the host where the VM is running:
Feb 1 23:31:39 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:39 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:44 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:45 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:45 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:47 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:50 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:50 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:56 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:57 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
This is interesting: we have seen these messages before, but could never pin down the flow causing them. Are you sure you see this each time you extend your disk?
If you can reproduce this, please file a bug.
OK, see also above the message registered during the clone phase. Gianluca

On Thu, Feb 2, 2017 at 10:48 AM, Nir Soffer <nsoffer@redhat.com> wrote:
After about 7 rounds I get this in messages of the host where the VM is running:
Feb 1 23:31:39 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:39 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:44 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:45 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:45 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:47 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:50 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:50 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:56 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 23:31:57 ovmsrv06 kernel: dd: sending ioctl 80306d02 to a partition!
This is interesting: we have seen these messages before, but could never pin down the flow causing them. Are you sure you see this each time you extend your disk?
If you can reproduce this, please file a bug.
I got the same on 4.1.1 while testing on iSCSI, so I opened a Bugzilla now... let's see if we can find anything useful: https://bugzilla.redhat.com/show_bug.cgi?id=1438809 Gianluca

On Wed, Feb 01, 2017 at 09:39:45AM +0200, Nir Soffer wrote:
On Tue, Jan 31, 2017 at 6:09 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Tue, Jan 31, 2017 at 3:23 PM, Nathanaël Blanchet <blanchet@abes.fr> wrote:
Exactly the same issue over here with an FC EMC storage domain...
I'm trying to mitigate this by inserting a timeout for my SAN devices, but I'm not sure of its effectiveness, as the CentOS 7 behavior of "multipathd -k" and then "show config" seems different from CentOS 6.x. In fact my attempt at multipath.conf is this:
There was a significant change in how multipath deals with merging device configurations between RHEL6 and RHEL7. The short answer is: as long as you copy the entire existing configuration and just change what you want changed (like you did), you can ignore the change. Also, multipath doesn't care if you quote numbers.

If you want to verify that no_path_retry is being set as intended, you can run:

# multipath -r -v3 | grep no_path_retry

to reload your multipath devices with verbosity turned up. You should see lines like:

Feb 02 09:38:30 | mpatha: no_path_retry = 12 (controller setting)

That will tell you what no_path_retry is set to. The configuration Nir suggested at the end of this email looks good to me.

Now, here's the long answer: multipath allows you to merge device configurations. This means that as long as you put in the "vendor" and "product" strings, you only need to set the other values that you care about. On RHEL6, this would work:

device {
        vendor "IBM"
        product "^1814"
        no_path_retry 12
}

And it would create a configuration exactly the same as the builtin config for this device, except with no_path_retry set to 12. However, this wasn't as easy for users as it was supposed to be. Specifically, users would often add their device's vendor and product information, as well as whatever they wanted changed, and then be surprised when multipath didn't retain all the information from the builtin configuration as advertised. This is because they used the actual vendor and product strings for their device, but the builtin device configuration's vendor and product strings were regexes. In RHEL6, multipath only merged configurations if the vendor and product strings matched exactly. So users would try:

device {
        vendor "IBM"
        product "1814 FASt"
        no_path_retry 12
}

and it wouldn't work as expected, since the product strings didn't match.
To fix this, when RHEL7 checks whether a user configuration should be merged with a builtin configuration, all that is required is that the user configuration's vendor and product strings regex match the builtin. This means that the above configuration will work as expected in RHEL7. However the first configuration won't, because "^1814" doesn't regex match "^1814". Multipath would therefore treat it as a completely new configuration and not merge any values from the builtin one. You can re-enable the RHEL6 behaviour in RHEL7 by setting hw_str_match yes in the defaults section.

Now, because the builtin configurations can handle more than one device type per configuration (since they use regexes to match the vendor and product strings), multipath can't just remove the original builtin configuration when a user adds a new configuration that modifies it. Otherwise, devices that regex matched the builtin configuration's vendor and product strings but not the user configuration's would be left without any device configuration information. So multipath keeps the original builtin configuration as well as the new one. However, when it's time to assign a device configuration to a device, multipath looks through the device configuration list backwards and uses the first match. This means it will always use the user configuration instead of the builtin one (since new configurations are added to the end of the list).

Like I said before, if you add all the values you want set in your configuration, instead of relying on them being merged from the builtin configuration, then you don't need to worry about any of this. -Ben
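The matching rules Ben describes can be demonstrated with grep, treating the builtin's product pattern "^1814" as the regex. This is only an illustration of the regex semantics, not of multipath's actual code:

```shell
# RHEL7 merge check: the user's product string must regex-match the
# builtin's "^1814" pattern.
regex='^1814'

# The literal string "^1814" starts with '^', not '1', so it does NOT match:
if echo '^1814' | grep -qE "$regex"; then echo "merge"; else echo "no merge"; fi

# A real product string like "1814 FASt" does match, so it would merge:
if echo '1814 FASt' | grep -qE "$regex"; then echo "merge"; else echo "no merge"; fi
```

This is exactly why a user config that copies the builtin's regex verbatim merges on RHEL7 (regex against regex-as-string fails, but string equality was never required) yet fails to merge on RHEL6 only when the strings differ.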
# VDSM REVISION 1.3
# VDSM PRIVATE
defaults {
    polling_interval 5
    no_path_retry fail
    user_friendly_names no
    flush_on_last_del yes
    fast_io_fail_tmo 5
    dev_loss_tmo 30
    max_fds 4096
}
# Remove devices entries when overrides section is available.
devices {
    device {
        # These settings overrides built-in devices settings. It does not apply
        # to devices without built-in settings (these use the settings in the
        # "defaults" section), or to devices defined in the "devices" section.
        # Note: This is not available yet on Fedora 21. For more info see
        # https://bugzilla.redhat.com/1253799
        all_devs yes
        no_path_retry fail
    }
    device {
        vendor "IBM"
        product "^1814"
        product_blacklist "Universal Xport"
        path_grouping_policy "group_by_prio"
        path_checker "rdac"
        features "0"
        hardware_handler "1 rdac"
        prio "rdac"
        failback immediate
        rr_weight "uniform"
        no_path_retry "12"
Hi Gianluca,
This should be a number, not a string; maybe multipath has trouble parsing this and ignores your value?
    }
}
So I put exactly the default device config for my IBM/1814 device, but with no_path_retry set to 12.
Why 12?
This will do 12 retries, 5 seconds each, when no path is available. That will block lvm commands for 60 seconds when no path is available, blocking other stuff in vdsm; vdsm is not designed to handle this.
I recommend value of 4.
But note that this is not related to the fact that your devices are not initialized properly after boot.
In CentOS 6.x, when you do something like this, "show config" gives you the modified entry only for your device section. Instead, in CentOS 7.3 it seems I get the default one for IBM/1814 anyway, and also the customized one at the end of the output...
Maybe your device configuration does not match exactly the builtin config.
Two facts:
- Before, I could reproduce the problem if I selected Maintenance, then Power Mgmt --> Restart (tried 3 times with the same behavior).
- Instead, if I executed separate steps (Maintenance, Power Mgmt --> Stop, wait a moment, Power Mgmt --> Start), I didn't get problems (tried only one time...).
Maybe waiting a moment helps the storage/switches to clean up properly after a server is shut down?
Does your power management trigger a proper shutdown? I would avoid using it for normal shutdown.
With this "new" multipath config (to be confirmed whether it is in effect, how?) I don't get the VM-paused problem even with the Restart option of Power Mgmt. In the active host's messages I see these lines when the other host reboots:
Jan 31 16:50:01 ovmsrv06 systemd: Started Session 705 of user root.
Jan 31 16:50:01 ovmsrv06 systemd: Starting Session 705 of user root.
Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sde - rdac checker reports path is up
Jan 31 16:53:47 ovmsrv06 multipathd: 8:64: reinstated
Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load table [0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]
Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdo - rdac checker reports path is ghost
Jan 31 16:53:47 ovmsrv06 multipathd: 8:224: reinstated
Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdk - rdac checker reports path is up
Jan 31 16:53:47 ovmsrv06 multipathd: 8:160: reinstated
Jan 31 16:53:47 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, queueing MODE_SELECT command
Jan 31 16:53:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdq - rdac checker reports path is ghost
Jan 31 16:53:47 ovmsrv06 multipathd: 65:0: reinstated
Jan 31 16:53:48 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, MODE_SELECT returned with sense 05/91/36
Jan 31 16:53:48 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, queueing MODE_SELECT command
Jan 31 16:53:49 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, MODE_SELECT returned with sense 05/91/36
Jan 31 16:53:49 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, queueing MODE_SELECT command
Jan 31 16:53:49 ovmsrv06 kernel: sd 0:0:1:4: rdac: array Z1_DS4700, ctlr 1, MODE_SELECT completed
Jan 31 16:53:49 ovmsrv06 kernel: sd 2:0:1:4: rdac: array Z1_DS4700, ctlr 1, queueing MODE_SELECT command
Jan 31 16:53:49 ovmsrv06 kernel: sd 2:0:1:4: rdac: array Z1_DS4700, ctlr 1, MODE_SELECT completed
Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sde - rdac checker reports path is ghost
Jan 31 16:53:52 ovmsrv06 multipathd: 8:64: reinstated
Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load table [0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]
Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdo - rdac checker reports path is up
Jan 31 16:53:52 ovmsrv06 multipathd: 8:224: reinstated
Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdk - rdac checker reports path is ghost
Jan 31 16:53:52 ovmsrv06 multipathd: 8:160: reinstated
Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: sdq - rdac checker reports path is up
Jan 31 16:53:52 ovmsrv06 multipathd: 65:0: reinstated
But they are not related to the multipath device dedicated to the oVirt storage domain in this case... What makes me optimistic is the difference in these lines:
Before I got:

Jan 31 10:27:47 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load table [0 41943040 multipath 0 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]

Now I get:

Jan 31 16:53:52 ovmsrv06 multipathd: 3600a0b8000299aa80000d08955014098: load table [0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1]

i.e. "multipath 0 1 rdac" vs "multipath 1 queue_if_no_path 1 rdac"
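The two table lines differ only in the feature field, so the state can be checked mechanically by looking for queue_if_no_path in the loaded table. A minimal sketch against the quoted strings; on a live host you would feed it lines from the output of dmsetup table:

```shell
# Report whether a device-mapper multipath table has unlimited queueing enabled.
has_unlimited_queueing() {
    case "$1" in
        *queue_if_no_path*) echo "unlimited queueing" ;;
        *)                  echo "no unlimited queueing" ;;
    esac
}

# The two tables quoted in the thread:
before='0 41943040 multipath 0 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1'
after='0 41943040 multipath 1 queue_if_no_path 1 rdac 2 1 service-time 0 2 1 8:224 1 65:0 1 service-time 0 2 1 8:64 1 8:160 1'

has_unlimited_queueing "$before"
has_unlimited_queueing "$after"
```

As Nir notes just below, seeing queue_if_no_path here is the bad case for oVirt, since it means I/O can block indefinitely when all paths are down.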
This is not expected: multipath is using unlimited queueing, which is the worst setup for oVirt.
Maybe this is the result of using "12" instead of 12?
Anyway, looking in multipath source, this is the default configuration for your device:
/* DS3950 / DS4200 / DS4700 / DS5020 */
{
        .vendor        = "IBM",
        .product       = "^1814",
        .bl_product    = "Universal Xport",
        .pgpolicy      = GROUP_BY_PRIO,
        .checker_name  = RDAC,
        .features      = "2 pg_init_retries 50",
        .hwhandler     = "1 rdac",
        .prio_name     = PRIO_RDAC,
        .pgfailback    = -FAILBACK_IMMEDIATE,
        .no_path_retry = 30,
},
and this is the commit that updated this (and other rdac devices): http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commit;h=c1ed393b...
So I would try this configuration:
device {
    vendor "IBM"
    product "^1814"

    # defaults from "multipathd show config"
    product_blacklist "Universal Xport"
    path_grouping_policy "group_by_prio"
    path_checker "rdac"
    hardware_handler "1 rdac"
    prio "rdac"
    failback immediate
    rr_weight "uniform"

    # Based on multipath commit c1ed393b91acace284901f16954ba5c1c0d943c9
    features "2 pg_init_retries 50"

    # Default is 30 seconds, ovirt recommended value is 4 to avoid
    # blocking in vdsm. This gives 20 seconds (4 * polling_interval)
    # grace time when no path is available.
    no_path_retry 4
}
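The 20-second figure in the comment follows from no_path_retry being counted in polling intervals; a quick check with the values used in this thread:

```shell
# Grace time with all paths down = no_path_retry * polling_interval.
polling_interval=5   # seconds, from the defaults section above
no_path_retry=4      # oVirt-recommended retry count
grace=$(( no_path_retry * polling_interval ))
echo "${grace}s of queueing before I/O errors are returned"
```

The same arithmetic shows why the earlier no_path_retry of 12 was problematic: 12 * 5 = 60 seconds of blocked lvm commands, which vdsm is not designed to tolerate.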
Ben, do you have any other ideas on debugging this issue or improving the multipath configuration?
Nir

On Thu, Feb 2, 2017 at 10:53 PM, Benjamin Marzinski <bmarzins@redhat.com> wrote:
I'm trying to mitigate this by inserting a timeout for my SAN devices, but I'm not sure of its effectiveness, as the CentOS 7 behavior of "multipathd -k" and then "show config" seems different from CentOS 6.x. In fact my attempt at multipath.conf is this:
There was a significant change in how multipath deals with merging device configurations between RHEL6 and RHEL7. The short answer is, as long as you copy the entire existing configuration, and just change what you want changed (like you did), you can ignore the change. Also, multipath doesn't care if you quote numbers.
If you want to verify that no_path_retry is being set as intended, you can run:
# multipath -r -v3 | grep no_path_retry
Hi Benjamin, thank you very much for the explanations, especially the long one ;-) I tried and confirmed that I have no_path_retry = 4 as expected.

The regex matching is only for merging, correct? So in your example, if in RHEL 7 I put this:

device {
        vendor "IBM"
        product "^1814"
        no_path_retry 12
}

it would not match for merging, but it would match for applying to my device (because it is put at the end of the config, which is read backwards). And it would apply only the no_path_retry setting, while all the other ones would not be picked from the builtin configuration for the device, but from the general defaults. So for example it would set path_checker not this way:

path_checker "rdac"

but this way:

path_checker "directio"

which is the default... correct?

On Fri, Feb 03, 2017 at 12:31:49AM +0100, Gianluca Cecchi wrote:
On Thu, Feb 2, 2017 at 10:53 PM, Benjamin Marzinski <[1]bmarzins@redhat.com> wrote:
I'm trying to mitigate inserting a timeout for my SAN devices but I'm not sure of its effectiveness as CentOS 7 behavior of "multipathd -k" and then "show config" seems different from CentOS 6.x. In fact my attempt for multipath.conf is this
There was a significant change in how multipath deals with merging device configurations between RHEL6 and RHEL7. The short answer is, as long as you copy the entire existing configuration, and just change what you want changed (like you did), you can ignore the change. Also, multipath doesn't care if you quote numbers.
If you want to verify that no_path_retry is being set as intended, you can run:
# multipath -r -v3 | grep no_path_retry
Hi Benjamin, thank you very much for the explanations, especially the long one ;-) I tried and confirmed that I have no_path_retry = 4 as expected. The regex matching is only for merging, correct?
No. Both RHEL6 and RHEL7 use regex matching to determine which device configuration to use with your device, otherwise product "^1814" would never match any device, since there is no array with a literal product string of "^1814". RHEL7 also uses the same regex matching to determine which builtin device configuration a user-supplied device configuration should modify. RHEL6 uses string matching for this.
So in your example, if in RHEL 7 I put this:

device {
        vendor "IBM"
        product "^1814"
        no_path_retry 12
}

it would not match for merging, but it would match for applying to my device (because it is put at the end of the config, which is read backwards).
Correct. The confusing point is that in the merging case, "^1814" in the user-supplied configuration is being treated as a string that needs to regex match the regular expression "^1814" in the builtin configuration. These don't match. For matching the device configuration to the device, "^1814" in the user-supplied configuration is treated as a regular expression that needs to regex match the actual product string of the device.
And it would apply only the no_path_retry setting, while all the other ones would not be picked from the builtin configuration for the device, but from the general defaults. So for example it would set path_checker not this way: path_checker "rdac" but this way: path_checker "directio", which is the default... correct?
exactly. -Ben

I'm also seeing this error using a Dell MD3800i array. The multipath errors shown in our logs are different, however:

Feb 1 15:11:58 ovirt-node-production2 kernel: dd: sending ioctl 80306d02 to a partition!
Feb 1 15:21:01 ovirt-node-production2 multipathd: dm-31: remove map (uevent)
Feb 1 15:21:01 ovirt-node-production2 multipathd: dm-31: devmap not registered, can't remove
Feb 1 15:21:01 ovirt-node-production2 multipathd: dm-31: remove map (uevent)

The dd error seems to happen every time the SPM runs a test. On 01/31/2017 09:23 AM, Nathanaël Blanchet wrote:
Exactly the same issue over here with an FC EMC storage domain...
On 31/01/2017 at 15:20, Gianluca Cecchi wrote:
Hello, my test environment is composed of 2 old HP BL685c G1 blades (ovmsrv05 and ovmsrv06), connected through FC switches in a SAN to an old IBM DS4700 storage array. Apart from being old, they all seem OK from a hardware point of view. I have configured oVirt 4.0.6 and an FCP storage domain. The hosts are plain CentOS 7.3 servers, fully updated. It is not a hosted-engine environment: the manager is a VM outside of the cluster. I have configured power management on both hosts and it works well.
At the moment I have only one test VM, and it is doing practically nothing.
Starting point: ovmsrv05 is in maintenance (for about 2 days) and the VM is running on ovmsrv06. I updated the qemu-kvm package on ovmsrv05 and then restarted it from the web admin GUI: Power Mgmt --> Restart
Sequence of events in the pane, including the problem in the subject:

Jan 31, 2017 10:29:43 AM Host ovmsrv05 power management was verified successfully.
Jan 31, 2017 10:29:43 AM Status of host ovmsrv05 was set to Up.
Jan 31, 2017 10:29:38 AM Executing power management status on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Jan 31, 2017 10:29:29 AM Activation of host ovmsrv05 initiated by admin@internal-authz.
Jan 31, 2017 10:28:05 AM VM ol65 has recovered from paused back to up.
Jan 31, 2017 10:27:55 AM VM ol65 has been paused due to storage I/O problem.
Jan 31, 2017 10:27:55 AM VM ol65 has been paused.
Jan 31, 2017 10:25:52 AM Host ovmsrv05 was restarted by admin@internal-authz.
Jan 31, 2017 10:25:52 AM Host ovmsrv05 was started by admin@internal-authz.
Jan 31, 2017 10:25:52 AM Power management start of Host ovmsrv05 succeeded.
Jan 31, 2017 10:25:50 AM Executing power management status on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Jan 31, 2017 10:25:37 AM Executing power management start on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Jan 31, 2017 10:25:37 AM Power management start of Host ovmsrv05 initiated.
Jan 31, 2017 10:25:37 AM Auto fence for host ovmsrv05 was started.
Jan 31, 2017 10:25:37 AM All VMs' status on Non Responsive Host ovmsrv05 were changed to 'Down' by admin@internal-authz
Jan 31, 2017 10:25:36 AM Host ovmsrv05 was stopped by admin@internal-authz.
Jan 31, 2017 10:25:36 AM Power management stop of Host ovmsrv05 succeeded.
Jan 31, 2017 10:25:34 AM Executing power management status on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Jan 31, 2017 10:25:15 AM Executing power management stop on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Jan 31, 2017 10:25:15 AM Power management stop of Host ovmsrv05 initiated.
Jan 31, 2017 10:25:12 AM Executing power management status on Host ovmsrv05 using Proxy Host ovmsrv06 and Fence Agent ilo:10.4.192.212.
Watching the timestamps, the culprit seems to be the reboot of ovmsrv05, which detects some LUNs in "owned" state and others in "unowned". Full messages of both hosts here:
https://drive.google.com/file/d/0BwoPbcrMv8mvekZQT1pjc0NMRlU/view?usp=sharing
and
https://drive.google.com/file/d/0BwoPbcrMv8mvcjBCYVdFZWdXTms/view?usp=sharing
At this time there are 4 LUNs globally seen by the two hosts but only 1 of them is currently configured as the only storage domain in oVirt cluster.
[root@ovmsrv05 ~]# multipath -l | grep ^36
3600a0b8000299aa80000d08b55014119 dm-5 IBM ,1814 FAStT
3600a0b80002999020000cd3c5501458f dm-3 IBM ,1814 FAStT
3600a0b80002999020000ccf855011198 dm-2 IBM ,1814 FAStT
3600a0b8000299aa80000d08955014098 dm-4 IBM ,1814 FAStT
The configured one:

[root@ovmsrv05 ~]# multipath -l 3600a0b8000299aa80000d08b55014119
3600a0b8000299aa80000d08b55014119 dm-5 IBM ,1814 FAStT
size=4.0T features='0' hwhandler='1 rdac' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| |- 0:0:1:3 sdl 8:176 active undef running
| `- 2:0:1:3 sdp 8:240 active undef running
`-+- policy='service-time 0' prio=0 status=enabled
  |- 0:0:0:3 sdd 8:48  active undef running
  `- 2:0:0:3 sdi 8:128 active undef running
In the messages of the booting node, around the time of the problem registered by the storage:

[root@ovmsrv05 ~]# grep owned /var/log/messages
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:0:1: rdac: LUN 1 (RDAC) (owned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:0:2: rdac: LUN 2 (RDAC) (owned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:0:3: rdac: LUN 3 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:0:1: rdac: LUN 1 (RDAC) (owned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:0:4: rdac: LUN 4 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:0:2: rdac: LUN 2 (RDAC) (owned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:1:1: rdac: LUN 1 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:0:3: rdac: LUN 3 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:0:4: rdac: LUN 4 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:1:2: rdac: LUN 2 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:1:1: rdac: LUN 1 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:1:3: rdac: LUN 3 (RDAC) (owned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:1:2: rdac: LUN 2 (RDAC) (unowned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 0:0:1:4: rdac: LUN 4 (RDAC) (owned)
Jan 31 10:27:38 ovmsrv05 kernel: scsi 2:0:1:3: rdac: LUN 3 (RDAC) (owned)
Jan 31 10:27:39 ovmsrv05 kernel: scsi 2:0:1:4: rdac: LUN 4 (RDAC) (owned)
I don't know the exact meaning of owned/unowned in the output above. Possibly it detects the 0:0:1:3 and 2:0:1:3 paths (those of the active group) as "owned", and this could have created problems for the active node?
On the active node, strangely, I don't lose all the paths, but the VM has been paused anyway:
[root@ovmsrv06 log]# grep "remaining active path" /var/log/messages
Jan 31 10:27:48 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 3
Jan 31 10:27:49 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 2
Jan 31 10:27:56 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 3
Jan 31 10:27:56 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 2
Jan 31 10:27:56 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 1
Jan 31 10:27:57 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 2
Jan 31 10:28:01 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 3
Jan 31 10:28:01 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 4
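One way to see how close the map came to losing all paths is to scan these multipathd lines for the lowest path count per WWID. An illustrative sketch (abbreviated log excerpt; not a real monitoring tool):

```python
import re

LOG = """\
Jan 31 10:27:48 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 3
Jan 31 10:27:56 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 1
Jan 31 10:28:01 ovmsrv06 multipathd: 3600a0b8000299aa80000d08b55014119: remaining active paths: 4
"""

PAT = re.compile(r"multipathd: (\S+): remaining active paths: (\d+)")

def minimum_paths(log):
    """Return the lowest 'remaining active paths' count seen per WWID."""
    low = {}
    for m in PAT.finditer(log):
        wwid, n = m.group(1), int(m.group(2))
        low[wwid] = min(low.get(wwid, n), n)
    return low

print(minimum_paths(LOG))
# {'3600a0b8000299aa80000d08b55014119': 1}
```

Here the map briefly dropped to a single active path, which is consistent with the VM being paused even though not all paths were lost at once.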
I'm not an expert of this storage array in particular, and of the rdac hardware handler in general.
What I see is that multipath.conf on both nodes contains:
# VDSM REVISION 1.3
defaults {
    polling_interval 5
    no_path_retry fail
    user_friendly_names no
    flush_on_last_del yes
    fast_io_fail_tmo 5
    dev_loss_tmo 30
    max_fds 4096
}
devices {
    device {
        # These settings override built-in device settings. They do not apply
        # to devices without built-in settings (these use the settings in the
        # "defaults" section), or to devices defined in the "devices" section.
        # Note: This is not available yet on Fedora 21. For more info see
        # https://bugzilla.redhat.com/1253799
        all_devs yes
        no_path_retry fail
    }
}
Beginning of /proc/scsi/scsi:
[root@ovmsrv06 ~]# cat /proc/scsi/scsi
Attached devices:
Host: scsi1 Channel: 01 Id: 00 Lun: 00
  Vendor: HP Model: LOGICAL VOLUME Rev: 1.86
  Type: Direct-Access ANSI SCSI revision: 05
Host: scsi0 Channel: 00 Id: 00 Lun: 01
  Vendor: IBM Model: 1814 FAStT Rev: 0916
  Type: Direct-Access ANSI SCSI revision: 05
...
To see the default config acquired for this storage:
multipathd -k
> show config
I can see:
device {
    vendor "IBM"
    product "^1814"
    product_blacklist "Universal Xport"
    path_grouping_policy "group_by_prio"
    path_checker "rdac"
    features "0"
    hardware_handler "1 rdac"
    prio "rdac"
    failback immediate
    rr_weight "uniform"
    no_path_retry "fail"
}
and
defaults {
    verbosity 2
    polling_interval 5
    max_polling_interval 20
    reassign_maps "yes"
    multipath_dir "/lib64/multipath"
    path_selector "service-time 0"
    path_grouping_policy "failover"
    uid_attribute "ID_SERIAL"
    prio "const"
    prio_args ""
    features "0"
    path_checker "directio"
    alias_prefix "mpath"
    failback "manual"
    rr_min_io 1000
    rr_min_io_rq 1
    max_fds 4096
    rr_weight "uniform"
    no_path_retry "fail"
    queue_without_daemon "no"
    flush_on_last_del "yes"
    user_friendly_names "no"
    fast_io_fail_tmo 5
    dev_loss_tmo 30
    bindings_file "/etc/multipath/bindings"
    wwids_file /etc/multipath/wwids
    log_checker_err always
    find_multipaths no
    retain_attached_hw_handler no
    detect_prio no
    hw_str_match no
    force_sync no
    deferred_remove no
    ignore_new_boot_devs no
    skip_kpartx no
    config_dir "/etc/multipath/conf.d"
    delay_watch_checks no
    delay_wait_checks no
    retrigger_tries 3
    retrigger_delay 10
    missing_uev_wait_timeout 30
    new_bindings_in_boot no
}
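The precedence discussed earlier in the thread (a matching device{} section wins; any option it leaves unset falls back to defaults{}) can be modeled as a simple dict merge. An illustrative sketch only, not the actual multipath-tools logic; the option subsets below are taken from the output above.

```python
# Subset of the defaults{} and the IBM "^1814" device{} sections shown above.
DEFAULTS = {"path_checker": "directio", "no_path_retry": "fail", "failback": "manual"}
IBM_1814 = {"path_checker": "rdac", "failback": "immediate"}

def effective(defaults, device_section):
    # Start from defaults, then let the matching device section override.
    merged = dict(defaults)
    merged.update(device_section)
    return merged

cfg = effective(DEFAULTS, IBM_1814)
print(cfg["path_checker"])   # rdac  -- from the device section
print(cfg["no_path_retry"])  # fail  -- falls back to defaults
```

This mirrors the exchange above: a user device section that fails to merge with the builtin entry would still apply, but everything it leaves unset comes from defaults{} (e.g. path_checker "directio"), not from the builtin device entry.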
Any hint on how to tune multipath.conf so that a server powering on doesn't create problems for running VMs?
Thanks in advance, Gianluca
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
-- Nathanaël Blanchet
Supervision réseau Pôle Infrastrutures Informatiques 227 avenue Professeur-Jean-Louis-Viala 34193 MONTPELLIER CEDEX 5 Tél. 33 (0)4 67 54 84 55 Fax 33 (0)4 67 54 84 14 blanchet@abes.fr
participants (6)
- Benjamin Marzinski
- Gianluca Cecchi
- Michael Watters
- Nathanaël Blanchet
- Nir Soffer
- Yaniv Kaul