High latency on storage domains and sanlock renewal error


On Tue, May 2, 2017 at 11:09 PM, Stefano Bovina <bovy89@gmail.com> wrote:
Hi, the engine logs show high latency on storage domains: "Storage domain <xxxx> experienced a high latency of 19.2814 seconds from ...... This may cause performance and functional issues."
Looking at the host logs, I also found these locking errors:
2017-05-02 20:52:13+0200 33883 [10098]: s1 renewal error -202 delta_length 10 last_success 33853
2017-05-02 20:52:19+0200 33889 [10098]: 6a386652 aio collect 0 0x7f1fb80008c0:0x7f1fb80008d0:0x7f1fbe9fb000 result 1048576:0 other free
2017-05-02 21:08:51+0200 34880 [10098]: 6a386652 aio timeout 0 0x7f1fb80008c0:0x7f1fb80008d0:0x7f1fbe4f2000 ioto 10 to_count 24
2017-05-02 21:08:51+0200 34880 [10098]: s1 delta_renew read rv -202 offset 0 /dev/6a386652-629d-4045-835b-21d2f5c104aa/ids
2017-05-02 21:08:51+0200 34880 [10098]: s1 renewal error -202 delta_length 10 last_success 34850
2017-05-02 21:08:53+0200 34883 [10098]: 6a386652 aio collect 0 0x7f1fb80008c0:0x7f1fb80008d0:0x7f1fbe4f2000 result 1048576:0 other free
2017-05-02 21:30:40+0200 36189 [10098]: 6a386652 aio timeout 0 0x7f1fb80008c0:0x7f1fb80008d0:0x7f1fbe9fb000 ioto 10 to_count 25
2017-05-02 21:30:40+0200 36189 [10098]: s1 delta_renew read rv -202 offset 0 /dev/6a386652-629d-4045-835b-21d2f5c104aa/ids
2017-05-02 21:30:40+0200 36189 [10098]: s1 renewal error -202 delta_length 10 last_success 36159
2017-05-02 21:30:45+0200 36195 [10098]: 6a386652 aio collect 0 0x7f1fb80008c0:0x7f1fb80008d0:0x7f1fbe9fb000 result 1048576:0 other free
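If it helps to quantify the outages: assuming (as sanlock's log format suggests) that the third field of each line is a monotonic timestamp in seconds and that last_success is the timestamp of the last good renewal, a small awk sketch can show how long each lease went unrenewed. The sample lines below are pasted from the log above; on a host you would feed it /var/log/sanlock.log instead.

```shell
# Sketch (assumption: field 3 is a monotonic timestamp in seconds,
# last_success is the timestamp of the last successful renewal).
# Prints, per renewal error, how long the lease went unrenewed.
awk '/renewal error/ {
    for (i = 1; i <= NF; i++) if ($i == "last_success") last = $(i + 1)
    print $1, $2, ($3 - last) "s since last successful renewal"
}' <<'EOF'
2017-05-02 20:52:13+0200 33883 [10098]: s1 renewal error -202 delta_length 10 last_success 33853
2017-05-02 21:08:51+0200 34880 [10098]: s1 renewal error -202 delta_length 10 last_success 34850
EOF
```

For these samples the gap is 30 s each time, i.e. roughly three missed 10-second renewal intervals in a row.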
and these vdsm errors too:
Thread-22::ERROR::2017-05-02 21:53:48,147::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain f8f21d6c-2425-45c4-aded-4cb9b53ebd96
Thread-22::ERROR::2017-05-02 21:53:48,148::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain f8f21d6c-2425-45c4-aded-4cb9b53ebd96
The engine, instead, is showing these errors:
2017-05-02 21:40:38,089 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-96) Command SpmStatusVDSCommand(HostName = <myhost.example.com>, HostId = dcc0275a-b011-4e33-bb95-366ffb0697b3, storagePoolId = 715d1ba2-eabe-48db-9aea-c28c30359808) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to SpmStatusVDS, error = (-202, 'Sanlock resource read failure', 'Sanlock exception'), code = 100
2017-05-02 21:41:08,431 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-53) [6e0d5ebf] Failed in SpmStatusVDS method
2017-05-02 21:41:08,443 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-53) [6e0d5ebf] Command SpmStatusVDSCommand(HostName = <myhost.example.com>, HostId = 7991933e-5f30-48cd-88bf-b0b525613384, storagePoolId = 4bd73239-22d0-4c44-ab8c-17adcd580309) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to SpmStatusVDS, error = (-202, 'Sanlock resource read failure', 'Sanlock exception'), code = 100
2017-05-02 21:41:31,975 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-61) [2a54a1b2] Failed in SpmStatusVDS method
2017-05-02 21:41:31,987 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-61) [2a54a1b2] Command SpmStatusVDSCommand(HostName = <myhost.example.com>, HostId = dcc0275a-b011-4e33-bb95-366ffb0697b3, storagePoolId = 715d1ba2-eabe-48db-9aea-c28c30359808) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to SpmStatusVDS, error = (-202, 'Sanlock resource read failure', 'Sanlock exception'), code = 100
I'm using Fibre Channel or FCoE connectivity; the storage array's technical support has analyzed the array (as well as the switch and OS configurations), but nothing was found.
Is this on a specific host, or on multiple hosts? Is that FC or FCoE? Anything in the host's /var/log/messages?
Any advice?
Thanks
Installation info:
ovirt-release35-006-1.noarch
This is a very old release; regardless of this issue, I suggest upgrading. Y.
libgovirt-0.3.3-1.el7_2.1.x86_64
vdsm-4.16.30-0.el7.centos.x86_64
vdsm-xmlrpc-4.16.30-0.el7.centos.noarch
vdsm-yajsonrpc-4.16.30-0.el7.centos.noarch
vdsm-jsonrpc-4.16.30-0.el7.centos.noarch
vdsm-python-zombiereaper-4.16.30-0.el7.centos.noarch
vdsm-python-4.16.30-0.el7.centos.noarch
vdsm-cli-4.16.30-0.el7.centos.noarch
qemu-kvm-ev-2.3.0-29.1.el7.x86_64
qemu-kvm-common-ev-2.3.0-29.1.el7.x86_64
qemu-kvm-tools-ev-2.3.0-29.1.el7.x86_64
libvirt-client-1.2.17-13.el7_2.3.x86_64
libvirt-daemon-driver-storage-1.2.17-13.el7_2.3.x86_64
libvirt-python-1.2.17-2.el7.x86_64
libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.3.x86_64
libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.3.x86_64
libvirt-lock-sanlock-1.2.17-13.el7_2.3.x86_64
libvirt-glib-0.1.9-1.el7.x86_64
libvirt-daemon-driver-network-1.2.17-13.el7_2.3.x86_64
libvirt-daemon-driver-lxc-1.2.17-13.el7_2.3.x86_64
libvirt-daemon-driver-interface-1.2.17-13.el7_2.3.x86_64
libvirt-1.2.17-13.el7_2.3.x86_64
libvirt-daemon-1.2.17-13.el7_2.3.x86_64
libvirt-daemon-config-network-1.2.17-13.el7_2.3.x86_64
libvirt-daemon-driver-secret-1.2.17-13.el7_2.3.x86_64
libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.3.x86_64
libvirt-daemon-kvm-1.2.17-13.el7_2.3.x86_64
libvirt-daemon-driver-qemu-1.2.17-13.el7_2.3.x86_64
------- vdsm.log (high latency)
Thread-21::DEBUG::2017-05-02 18:02:23,646::fileSD::261::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '0+1 records in\n0+1 records out\n331 bytes (331 B) copied, 0.000285529 s, 1.2 MB/s\n'; <rc> = 0
Thread-18::DEBUG::2017-05-02 18:02:24,335::blockSD::596::Storage.Misc.excCmd::(getReadDelay) /bin/dd if=/dev/2c501858-bf8d-49a5-a42b-bca341b47827/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
Thread-18::DEBUG::2017-05-02 18:02:24,343::blockSD::596::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.000489782 s, 8.4 MB/s\n'; <rc> = 0
Thread-19::DEBUG::2017-05-02 18:02:24,635::blockSD::596::Storage.Misc.excCmd::(getReadDelay) /bin/dd if=/dev/c5dc09c4-cd79-42d2-a5af-89953076f79e/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
Thread-19::DEBUG::2017-05-02 18:02:24,643::blockSD::596::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.000558622 s, 7.3 MB/s\n'; <rc> = 0
Thread-22::DEBUG::2017-05-02 18:02:24,722::blockSD::596::Storage.Misc.excCmd::(getReadDelay) /bin/dd if=/dev/f8f21d6c-2425-45c4-aded-4cb9b53ebd96/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
Thread-22::DEBUG::2017-05-02 18:02:24,729::blockSD::596::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.000632576 s, 6.5 MB/s\n'; <rc> = 0
JsonRpc (StompReactor)::DEBUG::2017-05-02 18:02:25,056::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2017-05-02 18:02:25,057::__init__::506::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-9372::DEBUG::2017-05-02 18:02:25,060::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
Thread-20::DEBUG::2017-05-02 18:02:25,895::blockSD::596::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 19.2814 s, 0.2 kB/s\n'; <rc> = 0
JsonRpc (StompReactor)::DEBUG::2017-05-02 18:02:28,075::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2017-05-02 18:02:28,076::__init__::506::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-9373::DEBUG::2017-05-02 18:02:28,078::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
JsonRpc (StompReactor)::DEBUG::2017-05-02 18:02:31,094::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2017-05-02 18:02:31,095::__init__::506::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-9374::DEBUG::2017-05-02 18:02:31,097::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
Thread-21::DEBUG::2017-05-02 18:02:33,652::fileSD::261::Storage.Misc.excCmd::(getReadDelay) /bin/dd if=/rhev/data-center/mnt/<myhost.example.com>:_ISO/f105bdc6-efdc-445c-b49e-aa38c91c2569/dom_md/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
Thread-21::DEBUG::2017-05-02 18:02:33,660::fileSD::261::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '0+1 records in\n0+1 records out\n331 bytes (331 B) copied, 0.000246594 s, 1.3 MB/s\n'; <rc> = 0
JsonRpc (StompReactor)::DEBUG::2017-05-02 18:02:34,112::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2017-05-02 18:02:34,114::__init__::506::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-9375::DEBUG::2017-05-02 18:02:34,117::task::595::Storage.TaskManager.Task::(_updateState) Task=`ea63814e-d25d-4436-8bb5-cfc67af6498d`::moving from state init -> state preparing
Thread-9375::INFO::2017-05-02 18:02:34,117::logUtils::44::dispatcher::(wrapper) Run and protect: repoStats(options=None)
Thread-9375::INFO::2017-05-02 18:02:34,118::logUtils::47::dispatcher::(wrapper) Run and protect: repoStats, Return response: {'2c501858-bf8d-49a5-a42b-bca341b47827': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000489782', 'lastCheck': '9.8', 'valid': True}, 'f8f21d6c-2425-45c4-aded-4cb9b53ebd96': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000632576', 'lastCheck': '9.4', 'valid': True}, '6a386652-629d-4045-835b-21d2f5c104aa': {'code': 0, 'version': 3, 'acquired': True, 'delay': '19.2814', 'lastCheck': '8.2', 'valid': True}, 'f105bdc6-efdc-445c-b49e-aa38c91c2569': {'code': 0, 'version': 0, 'acquired': True, 'delay': '0.000246594', 'lastCheck': '0.5', 'valid': True}, 'c5dc09c4-cd79-42d2-a5af-89953076f79e': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000558622', 'lastCheck': '9.5', 'valid': True}}
Thread-9375::DEBUG::2017-05-02 18:02:34,118::task::1191::Storage.TaskManager.Task::(prepare) Task=`ea63814e-d25d-4436-8bb5-cfc67af6498d`::finished: {'2c501858-bf8d-49a5-a42b-bca341b47827': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000489782', 'lastCheck': '9.8', 'valid': True}, 'f8f21d6c-2425-45c4-aded-4cb9b53ebd96': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000632576', 'lastCheck': '9.4', 'valid': True}, '6a386652-629d-4045-835b-21d2f5c104aa': {'code': 0, 'version': 3, 'acquired': True, 'delay': '19.2814', 'lastCheck': '8.2', 'valid': True}, 'f105bdc6-efdc-445c-b49e-aa38c91c2569': {'code': 0, 'version': 0, 'acquired': True, 'delay': '0.000246594', 'lastCheck': '0.5', 'valid': True}, 'c5dc09c4-cd79-42d2-a5af-89953076f79e': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000558622', 'lastCheck': '9.5', 'valid': True}}
Thread-9375::DEBUG::2017-05-02 18:02:34,119::task::595::Storage.TaskManager.Task::(_updateState) Task=`ea63814e-d25d-4436-8bb5-cfc67af6498d`::moving from state preparing -> state finished
Thread-9375::DEBUG::2017-05-02 18:02:34,119::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-9375::DEBUG::2017-05-02 18:02:34,119::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-9375::DEBUG::2017-05-02 18:02:34,119::task::993::Storage.TaskManager.Task::(_decref) Task=`ea63814e-d25d-4436-8bb5-cfc67af6498d`::ref 0 aborting False
Thread-9375::DEBUG::2017-05-02 18:02:34,122::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
JsonRpc (StompReactor)::DEBUG::2017-05-02 18:02:34,128::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2017-05-02 18:02:34,129::__init__::506::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
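For ad-hoc checking outside vdsm, the getReadDelay probe visible in the log above (a 4 KiB direct-I/O dd read of the domain metadata) can be reproduced and timed by hand. This is just a sketch: the scratch file stands in for the real device so the snippet is self-contained; on a host you would point dev at /dev/<sd_uuid>/metadata instead, and some filesystems reject O_DIRECT, hence the fallback.

```shell
# Stand-in for /dev/<sd_uuid>/metadata; on a real host, point this at the LV.
dev=$(mktemp)
dd if=/dev/zero of="$dev" bs=4096 count=1 2>/dev/null

# Time the same 4 KiB read vdsm issues (O_DIRECT where the filesystem allows it).
t0=$(date +%s%N)
dd if="$dev" iflag=direct of=/dev/null bs=4096 count=1 2>/dev/null \
  || dd if="$dev" of=/dev/null bs=4096 count=1 2>/dev/null
t1=$(date +%s%N)
echo "read delay: $(( (t1 - t0) / 1000000 )) ms"
rm -f "$dev"
```

Running this in a loop against the slow domain (6a386652) while watching the array would show whether the 19-second stalls happen outside vdsm too.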
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users

Hi, thanks for the advice. The upgrade is already scheduled, but I would like to fix this issue before proceeding with such a big upgrade (unless the upgrade itself fixes the problem).

The problem occurs on all hypervisors. We have 2 clusters (both connected to the same storage system):
- the old one with FC
- the new one with FCoE

With dmesg -T and by looking at /var/log/messages, we found several problems like these:

1)
[Wed May 3 10:40:11 2017] sd 12:0:0:3: Parameters changed
[Wed May 3 10:40:11 2017] sd 12:0:1:3: Parameters changed
[Wed May 3 10:40:11 2017] sd 12:0:1:1: Parameters changed
[Wed May 3 10:40:12 2017] sd 13:0:0:1: Parameters changed
[Wed May 3 10:40:12 2017] sd 13:0:0:3: Parameters changed
[Wed May 3 10:40:12 2017] sd 13:0:1:3: Parameters changed
[Wed May 3 12:39:32 2017] device-mapper: multipath: Failing path 65:144.
[Wed May 3 12:39:37 2017] sd 13:0:1:2: alua: port group 01 state A preferred supports tolUsNA

2)
[Wed May 3 17:08:17 2017] perf interrupt took too long (2590 > 2500), lowering kernel.perf_event_max_sample_rate to 50000

3)
[Wed May 3 19:16:21 2017] bnx2fc: els 0x5: tgt not ready
[Wed May 3 19:16:21 2017] bnx2fc: Relogin to the tgt

4)
sd 13:0:1:0: [sdx] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
sd 13:0:1:0: [sdx] CDB: Read(16) 88 00 00 00 00 00 00 58 08 00 00 00 04 00 00 00
blk_update_request: I/O error, dev sdx, sector 5769216
device-mapper: multipath: Failing path 65:112.
sd 13:0:1:0: alua: port group 01 state A preferred supports tolUsNA

5)
multipathd: 360060160a6213400cce46e40949de411: sdaa - emc_clariion_checker: Read error for WWN 60060160a6213400cce46e40949de411. Sense data are 0x0/0x0/0x0.
multipathd: checker failed path 65:160 in map 360060160a6213400cce46e40949de411
multipathd: 360060160a6213400cce46e40949de411: remaining active paths: 3
kernel: device-mapper: multipath: Failing path 65:160.
multipathd: 360060160a6213400cce46e40949de411: sdaa - emc_clariion_checker: Active path is healthy.
multipathd: 65:160: reinstated
multipathd: 360060160a6213400cce46e40949de411: remaining active paths: 4

6)
[Sat May 6 11:37:07 2017] megaraid_sas 0000:02:00.0: Firmware crash dump is not available

The multipath configuration is the following (recommended by EMC):

# RHEV REVISION 1.1
# RHEV PRIVATE

devices {
    device {
        vendor "DGC"
        product ".*"
        product_blacklist "LUNZ"
        path_grouping_policy group_by_prio
        path_selector "round-robin 0"
        path_checker emc_clariion
        features "1 queue_if_no_path"
        hardware_handler "1 alua"
        prio alua
        failback immediate
        rr_weight uniform
        no_path_retry 60
        rr_min_io 1
    }
}

Regards,
Stefano

2017-05-07 8:36 GMT+02:00 Yaniv Kaul <ykaul@redhat.com>:
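To see whether the path flaps line up with the sanlock timeouts, one option is to tally multipathd "checker failed path" events per multipath map. A minimal awk sketch, fed here with sample multipathd lines (the 65:112 line is a hypothetical extra event, added only so the count is visible) so it runs standalone; on a host you would pipe in /var/log/messages instead:

```shell
# Count multipathd path-failure events per multipath map (WWID).
# On a real host: grep multipathd /var/log/messages | awk '...'
awk '/checker failed path/ { fails[$NF]++ }
     END { for (map in fails) print map, fails[map], "failed-path events" }' <<'EOF'
multipathd: checker failed path 65:160 in map 360060160a6213400cce46e40949de411
multipathd: checker failed path 65:112 in map 360060160a6213400cce46e40949de411
multipathd: 360060160a6213400cce46e40949de411: remaining active paths: 3
EOF
```

If the failure timestamps cluster around the sanlock renewal errors, that would point at the fabric (zoning, FCoE DCB settings, array ports) rather than at oVirt itself.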
On Tue, May 2, 2017 at 11:09 PM, Stefano Bovina <bovy89@gmail.com> wrote:
Hi, the engine logs show high latency on storage domains: "Storage domain <xxxx> experienced a high latency of 19.2814 seconds from ...... This may cause performance and functional issues."
Looking at host logs, I found also these locking errors:
2017-05-02 20:52:13+0200 33883 [10098]: s1 renewal error -202 delta_length 10 last_success 33853 2017-05-02 20:52:19+0200 33889 [10098]: 6a386652 aio collect 0 0x7f1fb80008c0:0x7f1fb80008d0:0x7f1fbe9fb000 result 1048576:0 other free 2017-05-02 21:08:51+0200 34880 [10098]: 6a386652 aio timeout 0 0x7f1fb80008c0:0x7f1fb80008d0:0x7f1fbe4f2000 ioto 10 to_count 24 2017-05-02 21:08:51+0200 34880 [10098]: s1 delta_renew read rv -202 offset 0 /dev/6a386652-629d-4045-835b-21d2f5c104aa/ids 2017-05-02 21:08:51+0200 34880 [10098]: s1 renewal error -202 delta_length 10 last_success 34850 2017-05-02 21:08:53+0200 34883 [10098]: 6a386652 aio collect 0 0x7f1fb80008c0:0x7f1fb80008d0:0x7f1fbe4f2000 result 1048576:0 other free 2017-05-02 21:30:40+0200 36189 [10098]: 6a386652 aio timeout 0 0x7f1fb80008c0:0x7f1fb80008d0:0x7f1fbe9fb000 ioto 10 to_count 25 2017-05-02 21:30:40+0200 36189 [10098]: s1 delta_renew read rv -202 offset 0 /dev/6a386652-629d-4045-835b-21d2f5c104aa/ids 2017-05-02 21:30:40+0200 36189 [10098]: s1 renewal error -202 delta_length 10 last_success 36159 2017-05-02 21:30:45+0200 36195 [10098]: 6a386652 aio collect 0 0x7f1fb80008c0:0x7f1fb80008d0:0x7f1fbe9fb000 result 1048576:0 other free
and this vdsm errors too:
Thread-22::ERROR::2017-05-02 21:53:48,147::sdc::137::Storag e.StorageDomainCache::(_findDomain) looking for unfetched domain f8f21d6c-2425-45c4-aded-4cb9b53ebd96 Thread-22::ERROR::2017-05-02 21:53:48,148::sdc::154::Storag e.StorageDomainCache::(_findUnfetchedDomain) looking for domain f8f21d6c-2425-45c4-aded-4cb9b53ebd96
Engine instead is showing this errors:
2017-05-02 21:40:38,089 ERROR [org.ovirt.engine.core.vdsbrok er.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-96) Command SpmStatusVDSCommand(HostName = <myhost.example.com>, HostId = dcc0275a-b011-4e33-bb95-366ffb0697b3, storagePoolId = 715d1ba2-eabe-48db-9aea-c28c30359808) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to SpmStatusVDS, error = (-202, 'Sanlock resource read failure', 'Sanlock exception'), code = 100 2017-05-02 21:41:08,431 ERROR [org.ovirt.engine.core.vdsbrok er.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-53) [6e0d5ebf] Failed in SpmStatusVDS method 2017-05-02 21:41:08,443 ERROR [org.ovirt.engine.core.vdsbrok er.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-53) [6e0d5ebf] Command SpmStatusVDSCommand(HostName = <myhost.example.com>, HostId = 7991933e-5f30-48cd-88bf-b0b525613384, storagePoolId = 4bd73239-22d0-4c44-ab8c-17adcd580309) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to SpmStatusVDS, error = (-202, 'Sanlock resource read failure', 'Sanlock exception'), code = 100 2017-05-02 21:41:31,975 ERROR [org.ovirt.engine.core.vdsbrok er.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-61) [2a54a1b2] Failed in SpmStatusVDS method 2017-05-02 21:41:31,987 ERROR [org.ovirt.engine.core.vdsbrok er.vdsbroker.SpmStatusVDSCommand] (DefaultQuartzScheduler_Worker-61) [2a54a1b2] Command SpmStatusVDSCommand(HostName = <myhost.example.com>, HostId = dcc0275a-b011-4e33-bb95-366ffb0697b3, storagePoolId = 715d1ba2-eabe-48db-9aea-c28c30359808) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to SpmStatusVDS, error = (-202, 'Sanlock resource read failure', 'Sanlock exception'), code = 100
I'm using Fibre Channel or FCoE connectivity; storage array technical support has analyzed it (also switch and OS configurations), but nothing has been found.
Is this on a specific hosts, or multiple hosts? Is that FC or FCoE? Anything on the host's /var/log/messages?
Any advice?
Thanks
Installation info:
ovirt-release35-006-1.noarch
This is a very old release, I suggest, regardless of this issue, to upgrade. Y.
libgovirt-0.3.3-1.el7_2.1.x86_64 vdsm-4.16.30-0.el7.centos.x86_64 vdsm-xmlrpc-4.16.30-0.el7.centos.noarch vdsm-yajsonrpc-4.16.30-0.el7.centos.noarch vdsm-jsonrpc-4.16.30-0.el7.centos.noarch vdsm-python-zombiereaper-4.16.30-0.el7.centos.noarch vdsm-python-4.16.30-0.el7.centos.noarch vdsm-cli-4.16.30-0.el7.centos.noarch qemu-kvm-ev-2.3.0-29.1.el7.x86_64 qemu-kvm-common-ev-2.3.0-29.1.el7.x86_64 qemu-kvm-tools-ev-2.3.0-29.1.el7.x86_64 libvirt-client-1.2.17-13.el7_2.3.x86_64 libvirt-daemon-driver-storage-1.2.17-13.el7_2.3.x86_64 libvirt-python-1.2.17-2.el7.x86_64 libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.3.x86_64 libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.3.x86_64 libvirt-lock-sanlock-1.2.17-13.el7_2.3.x86_64 libvirt-glib-0.1.9-1.el7.x86_64 libvirt-daemon-driver-network-1.2.17-13.el7_2.3.x86_64 libvirt-daemon-driver-lxc-1.2.17-13.el7_2.3.x86_64 libvirt-daemon-driver-interface-1.2.17-13.el7_2.3.x86_64 libvirt-1.2.17-13.el7_2.3.x86_64 libvirt-daemon-1.2.17-13.el7_2.3.x86_64 libvirt-daemon-config-network-1.2.17-13.el7_2.3.x86_64 libvirt-daemon-driver-secret-1.2.17-13.el7_2.3.x86_64 libvirt-daemon-config-nwfilter-1.2.17-13.el7_2.3.x86_64 libvirt-daemon-kvm-1.2.17-13.el7_2.3.x86_64 libvirt-daemon-driver-qemu-1.2.17-13.el7_2.3.x86_64
------- vdsm.log (high latency)
Thread-21::DEBUG::2017-05-02 18:02:23,646::fileSD::261::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '0+1 records in\n0+1 records out\n331 bytes (331 B) copied, 0.000285529 s, 1.2 MB/s\n'; <rc> = 0
Thread-18::DEBUG::2017-05-02 18:02:24,335::blockSD::596::Storage.Misc.excCmd::(getReadDelay) /bin/dd if=/dev/2c501858-bf8d-49a5-a42b-bca341b47827/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
Thread-18::DEBUG::2017-05-02 18:02:24,343::blockSD::596::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.000489782 s, 8.4 MB/s\n'; <rc> = 0
Thread-19::DEBUG::2017-05-02 18:02:24,635::blockSD::596::Storage.Misc.excCmd::(getReadDelay) /bin/dd if=/dev/c5dc09c4-cd79-42d2-a5af-89953076f79e/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
Thread-19::DEBUG::2017-05-02 18:02:24,643::blockSD::596::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.000558622 s, 7.3 MB/s\n'; <rc> = 0
Thread-22::DEBUG::2017-05-02 18:02:24,722::blockSD::596::Storage.Misc.excCmd::(getReadDelay) /bin/dd if=/dev/f8f21d6c-2425-45c4-aded-4cb9b53ebd96/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
Thread-22::DEBUG::2017-05-02 18:02:24,729::blockSD::596::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.000632576 s, 6.5 MB/s\n'; <rc> = 0
JsonRpc (StompReactor)::DEBUG::2017-05-02 18:02:25,056::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2017-05-02 18:02:25,057::__init__::506::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-9372::DEBUG::2017-05-02 18:02:25,060::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
Thread-20::DEBUG::2017-05-02 18:02:25,895::blockSD::596::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 19.2814 s, 0.2 kB/s\n'; <rc> = 0
JsonRpc (StompReactor)::DEBUG::2017-05-02 18:02:28,075::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2017-05-02 18:02:28,076::__init__::506::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-9373::DEBUG::2017-05-02 18:02:28,078::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
JsonRpc (StompReactor)::DEBUG::2017-05-02 18:02:31,094::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2017-05-02 18:02:31,095::__init__::506::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-9374::DEBUG::2017-05-02 18:02:31,097::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
Thread-21::DEBUG::2017-05-02 18:02:33,652::fileSD::261::Storage.Misc.excCmd::(getReadDelay) /bin/dd if=/rhev/data-center/mnt/<myhost.example.com>:_ISO/f105bdc6-efdc-445c-b49e-aa38c91c2569/dom_md/metadata iflag=direct of=/dev/null bs=4096 count=1 (cwd None)
Thread-21::DEBUG::2017-05-02 18:02:33,660::fileSD::261::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '0+1 records in\n0+1 records out\n331 bytes (331 B) copied, 0.000246594 s, 1.3 MB/s\n'; <rc> = 0
JsonRpc (StompReactor)::DEBUG::2017-05-02 18:02:34,112::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2017-05-02 18:02:34,114::__init__::506::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-9375::DEBUG::2017-05-02 18:02:34,117::task::595::Storage.TaskManager.Task::(_updateState) Task=`ea63814e-d25d-4436-8bb5-cfc67af6498d`::moving from state init -> state preparing
Thread-9375::INFO::2017-05-02 18:02:34,117::logUtils::44::dispatcher::(wrapper) Run and protect: repoStats(options=None)
Thread-9375::INFO::2017-05-02 18:02:34,118::logUtils::47::dispatcher::(wrapper) Run and protect: repoStats, Return response: {'2c501858-bf8d-49a5-a42b-bca341b47827': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000489782', 'lastCheck': '9.8', 'valid': True}, 'f8f21d6c-2425-45c4-aded-4cb9b53ebd96': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000632576', 'lastCheck': '9.4', 'valid': True}, '6a386652-629d-4045-835b-21d2f5c104aa': {'code': 0, 'version': 3, 'acquired': True, 'delay': '19.2814', 'lastCheck': '8.2', 'valid': True}, 'f105bdc6-efdc-445c-b49e-aa38c91c2569': {'code': 0, 'version': 0, 'acquired': True, 'delay': '0.000246594', 'lastCheck': '0.5', 'valid': True}, 'c5dc09c4-cd79-42d2-a5af-89953076f79e': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000558622', 'lastCheck': '9.5', 'valid': True}}
Thread-9375::DEBUG::2017-05-02 18:02:34,118::task::1191::Storage.TaskManager.Task::(prepare) Task=`ea63814e-d25d-4436-8bb5-cfc67af6498d`::finished: {'2c501858-bf8d-49a5-a42b-bca341b47827': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000489782', 'lastCheck': '9.8', 'valid': True}, 'f8f21d6c-2425-45c4-aded-4cb9b53ebd96': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000632576', 'lastCheck': '9.4', 'valid': True}, '6a386652-629d-4045-835b-21d2f5c104aa': {'code': 0, 'version': 3, 'acquired': True, 'delay': '19.2814', 'lastCheck': '8.2', 'valid': True}, 'f105bdc6-efdc-445c-b49e-aa38c91c2569': {'code': 0, 'version': 0, 'acquired': True, 'delay': '0.000246594', 'lastCheck': '0.5', 'valid': True}, 'c5dc09c4-cd79-42d2-a5af-89953076f79e': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.000558622', 'lastCheck': '9.5', 'valid': True}}
Thread-9375::DEBUG::2017-05-02 18:02:34,119::task::595::Storage.TaskManager.Task::(_updateState) Task=`ea63814e-d25d-4436-8bb5-cfc67af6498d`::moving from state preparing -> state finished
Thread-9375::DEBUG::2017-05-02 18:02:34,119::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-9375::DEBUG::2017-05-02 18:02:34,119::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-9375::DEBUG::2017-05-02 18:02:34,119::task::993::Storage.TaskManager.Task::(_decref) Task=`ea63814e-d25d-4436-8bb5-cfc67af6498d`::ref 0 aborting False
Thread-9375::DEBUG::2017-05-02 18:02:34,122::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
JsonRpc (StompReactor)::DEBUG::2017-05-02 18:02:34,128::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2017-05-02 18:02:34,129::__init__::506::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-9376::DEBUG::2017-05-02 18:02:34,132::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
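The getReadDelay probe in the log above is just a timed 4 KiB O_DIRECT dd; it can be reproduced by hand with a sketch like the one below. The device path in the usage line is the metadata LV from this thread; substitute your own domain's.

```shell
# One-shot equivalent of vdsm's getReadDelay check: time a single 4 KiB
# direct read from the given device (or file) and print the elapsed
# milliseconds. A sketch, not vdsm's actual code path.
read_delay() {
    dev=$1
    start=$(date +%s%N)
    dd if="$dev" of=/dev/null bs=4096 count=1 iflag=direct 2>/dev/null
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}
```

Usage: `read_delay /dev/6a386652-629d-4045-835b-21d2f5c104aa/metadata` — anything in the thousands of milliseconds matches the 19.2814 s seen in the log.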
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users

On Sun, May 7, 2017 at 1:27 PM, Stefano Bovina <bovy89@gmail.com> wrote:
Sense data are 0x0/0x0/0x0
Interesting - first time I'm seeing 0/0/0. The 1st is usually 0x2 (see [1]), and then the rest [2], [3] make sense.
A Google search found another user with a CLARiiON with the exact same error [4], so I'm leaning toward a misconfiguration of multipathing/CLARiiON here.
Is your multipathing configuration working well for you? Are you sure it's an EL7 configuration? For example, I believe you should have rr_min_io_rq and not rr_min_io.
Y.
[1] http://www.t10.org/lists/2status.htm
[2] http://www.t10.org/lists/2sensekey.htm
[3] http://www.t10.org/lists/asc-num.htm
[4] http://www.linuxquestions.org/questions/centos-111/multipath-problems-4175544908/

Yes, this configuration is the one suggested by EMC for EL7. By the way, "The parameters rr_min_io vs. rr_min_io_rq mean the same thing but are used for device-mapper-multipath on differing kernel versions", and the rr_min_io_rq default value is 1 while the rr_min_io default value is 1000, so it should be fine.
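For comparison, a multipath.conf device stanza along the lines EMC's EL7 host-connectivity guides describe for CLARiiON/VNX ("DGC") arrays might look like the sketch below. The values here are illustrative assumptions, not a verified EMC recommendation; the authoritative settings are in the current revision of the EMC document.

```
devices {
    device {
        vendor                "DGC"
        product               ".*"
        path_grouping_policy  group_by_prio
        prio                  alua
        hardware_handler      "1 alua"
        path_checker          emc_clariion
        failback              immediate
        rr_min_io_rq          1      # EL7: request-based dm-multipath
        no_path_retry         60
    }
}
```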

On Mon, May 8, 2017 at 11:50 AM, Stefano Bovina <bovy89@gmail.com> wrote:
Yes, this configuration is the one suggested by EMC for EL7.
https://access.redhat.com/solutions/139193 suggests that for alua, the path checker needs to be different. Anyway, it is very likely that you have storage issues - they need to be resolved first, and I believe they have little to do with oVirt at the moment. Y.

Hi, a little update:
The command multipath -ll hangs when executed on the host while the problem occurs (nothing logged in /var/log/messages or dmesg).
I tested latency with ioping: ioping /dev/6a386652-629d-4045-835b-21d2f5c104aa/metadata
Usually it returns "time=15.6 ms"; sometimes it returns "time=19 s" (yes, seconds).
Systems are up to date and I tried both path_checkers (emc_clariion and directio), without results. (https://access.redhat.com/solutions/139193 refers to Rev A31 of the EMC document; the latest is A42 and suggests emc_clariion.)
Any idea or suggestion?
Thanks,
Stefano
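Since the 19 s stalls are intermittent, a timestamped probe loop makes them easier to correlate with the sanlock renewal errors (sanlock logs a renewal error once a read exceeds its io timeout, "ioto 10" in the log above). A sketch; the device path and sample count are illustrative.

```shell
# Probe the device once per second and print a timestamped latency in ms,
# so spikes can be lined up against /var/log/sanlock.log. Sketch only.
watch_latency() {
    dev=$1; n=${2:-60}
    i=0
    while [ "$i" -lt "$n" ]; do
        t0=$(date +%s%N)
        dd if="$dev" of=/dev/null bs=4096 count=1 iflag=direct 2>/dev/null
        t1=$(date +%s%N)
        printf '%s %s ms\n' "$(date '+%F %T')" $(( (t1 - t0) / 1000000 ))
        i=$((i + 1))
        sleep 1
    done
}
```

Usage: `watch_latency /dev/6a386652-629d-4045-835b-21d2f5c104aa/metadata 300 > latency.log`.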

Sorry to jump in, but what kernel version are you using? I had a similar issue with kernel 4.10/4.11.

Hi,
2.6.32-696.1.1.el6.x86_64
3.10.0-514.10.2.el7.x86_64
I tried the ioping test from different groups of servers using multipath, members of different storage groups (different LUN, different RAID, etc.), and every one of them reports the latency. I tried the same test (ioping) on a server with PowerPath instead of multipath, with a dedicated RAID group, and there ioping does not report the latency.
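When comparing the multipath hosts against the PowerPath one, the maximum and tail latency matter more than the mean, since the stalls are rare. A small sketch for summarizing a file of per-probe latencies in milliseconds, one value per line (e.g. collected with ioping or a dd loop):

```shell
# Print count, median, and max of a file of latency samples (ms, one per
# line). Max is where the intermittent multi-second stalls show up.
latency_summary() {
    sort -n "$1" | awk '{ v[NR] = $1 } END {
        if (NR) printf "n=%d median=%s max=%s\n", NR, v[int((NR + 1) / 2)], v[NR]
    }'
}
```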

Can you please give the output of:
multipath -ll
and
iscsiadm -m session -P3
JP

It's FC/FCoE. This is with the configuration suggested by EMC/Red Hat:

360060160a62134002818778f949de411 dm-5 DGC,VRAID
size=11T features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 1:0:1:2 sdr 65:16 active ready running
| `- 2:0:1:2 sdy 65:128 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:0:2 sdd 8:48 active ready running
  `- 2:0:0:2 sdk 8:160 active ready running
360060160a6213400e622de69949de411 dm-2 DGC,VRAID
size=6.0T features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 1:0:1:0 sdp 8:240 active ready running
| `- 2:0:1:0 sdw 65:96 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:0:0 sdb 8:16 active ready running
  `- 2:0:0:0 sdi 8:128 active ready running
360060160a6213400cce46e40949de411 dm-4 DGC,VRAID
size=560G features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 1:0:1:3 sds 65:32 active ready running
| `- 2:0:1:3 sdz 65:144 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:0:3 sde 8:64 active ready running
  `- 2:0:0:3 sdl 8:176 active ready running
360060160a6213400c4b39e80949de411 dm-3 DGC,VRAID
size=500G features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 1:0:1:1 sdq 65:0 active ready running
| `- 2:0:1:1 sdx 65:112 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:0:1 sdc 8:32 active ready running
  `- 2:0:0:1 sdj 8:144 active ready running
360060160a6213400fa2d31acbbfce511 dm-8 DGC,RAID 5
size=5.4T features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 1:0:0:6 sdh 8:112 active ready running
| `- 2:0:0:6 sdo 8:224 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:1:6 sdv 65:80 active ready running
  `- 2:0:1:6 sdac 65:192 active ready running
360060160a621340040652b7582f5e511 dm-7 DGC,RAID 5
size=3.6T features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 1:0:0:4 sdf 8:80 active ready running
| `- 2:0:0:4 sdm 8:192 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:1:4 sdt 65:48 active ready running
  `- 2:0:1:4 sdaa 65:160 active ready running
360060160a621340064b1034cbbfce511 dm-6 DGC,RAID 5
size=1.0T features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 1:0:1:5 sdu 65:64 active ready running
| `- 2:0:1:5 sdab 65:176 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:0:5 sdg 8:96 active ready running
  `- 2:0:0:5 sdn 8:208 active ready running

This is with the oVirt default conf:

360060160a6213400848e60af82f5e511 dm-3 DGC ,RAID 5
size=3.6T features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 12:0:0:4 sdj 8:144 active ready running
| `- 13:0:1:4 sdd 8:48 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 12:0:1:4 sdf 8:80 active ready running
  `- 13:0:0:4 sdh 8:112 active ready running
360060160a6213400005e425b6b10e611 dm-2 DGC ,RAID 10
size=4.2T features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 12:0:1:0 sde 8:64 active ready running
| `- 13:0:0:0 sdg 8:96 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 13:0:1:0 sdc 8:32 active ready running
  `- 12:0:0:0 sdi 8:128 active ready running
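A quick sanity check on saved `multipath -ll` output like the above is to flag any path line that is not "active ready"; in the maps shown, every LUN should report all four paths active ready. A sketch tied to this output layout:

```shell
# Print every path line from saved `multipath -ll` output that is not in
# "active ready" state. Path lines are recognized by the H:C:T:L tuple
# followed by an sdX device name, as in the output above.
unhealthy_paths() {
    grep -E '[0-9]+:[0-9]+:[0-9]+:[0-9]+ +sd' "$1" | grep -v 'active ready'
}
```

An empty result (exit status 1 from the final grep) means all paths are active ready.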
participants (3)
- Juan Pablo
- Stefano Bovina
- Yaniv Kaul