
On 21.05.2015 02:48, Chris Jones - BookIt.com Systems Administrator wrote:
Another issue may be that the settings for COMPELNT/Compellent Vol are wrong; the settings we ship are missing a lot of the settings that exist in the built-in configuration, and this may have a bad effect. If your devices match this, I would try this multipath configuration instead of the one vdsm configures.
device {
    vendor "COMPELNT"
    product "Compellent Vol"
    path_grouping_policy "multibus"
    path_checker "tur"
    features "0"
    hardware_handler "0"
    prio "const"
    failback "immediate"
    rr_weight "uniform"
    no_path_retry fail
}
I wish I could. We're using the CentOS 7 ovirt-node-iso. The multipath.conf is less than ideal, but when I tried updating it, oVirt instantly overwrote it. To be clear, yes, I know changes do not survive reboots, and yes, I know about persist, but oVirt changes the file while the node is running. Live! Persist won't help there.
I have this issue also. I am thinking about opening a BZ ;)
I also tried building a CentOS 7 "thick client" where I set up CentOS 7 first, added the oVirt repo, then let the engine provision it. Same problem with multipath.conf being overwritten with the default oVirt setup.
So I tried to be slick about it. I made the multipath.conf immutable. That prevented the engine from being able to activate the node. It would fail on a vds command that gets the node's capabilities; part of what that command does is read and then overwrite multipath.conf.
How do I safely update multipath.conf?
In the second line of your multipath.conf, add:
# RHEV PRIVATE
Then host deploy will ignore the file and never change it.
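As a rough illustration of the marker placement described above, here is a minimal Python sketch that checks whether the second line of a multipath.conf is the "# RHEV PRIVATE" marker. The function name and the sample file contents are made up for illustration; only the marker string and its position come from this thread.

```python
PRIVATE_MARKER = "# RHEV PRIVATE"  # marker string as quoted in this thread


def has_private_marker(conf_text):
    """Return True if the second line of the config text is the marker."""
    lines = conf_text.splitlines()
    return len(lines) >= 2 and lines[1].strip() == PRIVATE_MARKER


# Hypothetical file contents, for illustration only.
sample = "# local multipath configuration\n# RHEV PRIVATE\ndefaults {\n}\n"
print(has_private_marker(sample))
```

On a host you would pass in the contents of /etc/multipath.conf instead of the sample string.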
To verify that your devices match this, you can check the devices' vendor and product strings in the output of "multipath -ll". I would like to see the output of this command.
multipath -ll (default setup) can be seen here. http://paste.linux-help.org/view/430c7538
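For anyone who wants to do the vendor/product check mechanically, here is a small Python sketch that pulls the vendor and product strings out of "multipath -ll" text. It assumes each map header line looks like "<name> dm-N VENDOR,PRODUCT", optionally with a "(<wwid>)" after the name; the sample text below is invented for illustration and is not the output linked above.

```python
import re

# Assumed header shape: "<name> [(<wwid>)] dm-N VENDOR,PRODUCT"
HEADER_RE = re.compile(r"^\S+(?: \(\S+\))? +dm-\d+ +([^,]+),(.*)$")


def vendor_product_pairs(multipath_ll_output):
    """Collect the (vendor, product) pairs found on map header lines."""
    pairs = set()
    for line in multipath_ll_output.splitlines():
        match = HEADER_RE.match(line)
        if match:
            pairs.add((match.group(1).strip(), match.group(2).strip()))
    return pairs


# Invented sample, for illustration only.
sample = """\
36000d31000000000000000000000001a dm-2 COMPELNT,Compellent Vol
size=1.0T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 7:0:0:1 sdb 8:16 active ready running
  `- 8:0:0:1 sdc 8:32 active ready running
"""
print(vendor_product_pairs(sample))
```

If the pair ("COMPELNT", "Compellent Vol") shows up, the device section suggested earlier in the thread should apply to those devices.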
Another platform issue is the bad default value of node.session.timeo.replacement_timeout, which is set to 120 seconds. This setting means that the SCSI layer will wait 120 seconds for I/O to complete on one path before failing the I/O request. So one bad path can cause a 120-second delay, even though the request could have been completed over another path.
Multipath tries to set this value to 5 seconds, but the value reverts to the default 120 seconds after a device has trouble. There is an open bug about this, which we hope to get fixed in RHEL/CentOS 7.2. https://bugzilla.redhat.com/1139038
This issue together with "no_path_retry queue" is a very bad mix for ovirt.
You can fix this timeout by setting:
# /etc/iscsi/iscsid.conf
node.session.timeo.replacement_timeout = 5
I'll see if that's possible with persist. Will this change survive node upgrades?
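In case it helps with scripting the iscsid.conf change, here is a minimal Python sketch that rewrites the timeout in the text of the config file. The function name is hypothetical, and the sketch only edits a string; reading and writing /etc/iscsi/iscsid.conf, and making the result persistent on the node, is left out.

```python
import re

KEY = "node.session.timeo.replacement_timeout"


def set_replacement_timeout(conf_text, seconds=5):
    """Return conf_text with the replacement timeout set to `seconds`."""
    new_line = f"{KEY} = {seconds}"
    # Match only uncommented assignments of the key, one line at a time.
    pattern = re.compile(rf"^[ \t]*{re.escape(KEY)}[ \t]*=.*$", re.MULTILINE)
    if pattern.search(conf_text):
        return pattern.sub(new_line, conf_text)
    # Key not present: append it, keeping the file newline-terminated.
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + new_line + "\n"


# Example with the default value mentioned in this thread.
print(set_replacement_timeout("node.session.timeo.replacement_timeout = 120\n"))
```

Commented-out lines are left alone, so a "#node.session.timeo.replacement_timeout = ..." line would stay as it is and the setting would be appended instead.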
Thanks for the reply and the suggestions.
--
Daniel Helgenberger
m box bewegtbild GmbH