oVirt 4.3.5 potential issue with NFS storage

Dear oVirt,
This is my third oVirt platform in the company, but it is the first time I am seeing the following logs:
“2019-08-07 16:00:16,099Z INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-51) [1b85e637] Lock freed to object 'EngineLock:{exclusiveLocks='[2350ee82-94ed-4f90-9366-451e0104d1d6=PROVIDER]', sharedLocks=''}'
2019-08-07 16:00:25,618Z WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (EE-ManagedThreadFactory-engine-Thread-37723) [] domain 'bda97276-a399-448f-9113-017972f6b55a:ovirt_production' in problem 'PROBLEMATIC'. vds: 'ovirt-sj-05.ictv.com'
2019-08-07 16:00:40,630Z INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (EE-ManagedThreadFactory-engine-Thread-37735) [] Domain 'bda97276-a399-448f-9113-017972f6b55a:ovirt_production' recovered from problem. vds: 'ovirt-sj-05.ictv.com'
2019-08-07 16:00:40,652Z INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (EE-ManagedThreadFactory-engine-Thread-37737) [] Domain 'bda97276-a399-448f-9113-017972f6b55a:ovirt_production' recovered from problem. vds: 'ovirt-sj-01.ictv.com'
2019-08-07 16:00:40,652Z INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (EE-ManagedThreadFactory-engine-Thread-37737) [] Domain 'bda97276-a399-448f-9113-017972f6b55a:ovirt_production' has recovered from problem. No active host in the DC is reporting it as problematic, so clearing the domain recovery timer.”
Can you help me understand why this is being reported?
This setup is:
5 hosts, 3 in HA
Self-hosted engine
Version 4.3.5
NFS-based NetApp storage, version 4.1
“10.210.13.64:/ovirt_hosted_engine on /rhev/data-center/mnt/10.210.13.64:_ovirt__hosted__engine type nfs4 (rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=10.210.11.14,local_lock=none,addr=10.210.13.64)
10.210.13.64:/ovirt_production on /rhev/data-center/mnt/10.210.13.64:_ovirt__production type nfs4 (rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=10.210.11.14,local_lock=none,addr=10.210.13.64)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=9878396k,mode=700)”
The first mount is the dedicated SHE storage.
The second mount, “ovirt_production”, is for the other VM guests.
Kindly awaiting your reply.
Marko Vrgotic
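The engine log lines in this post already show how long the domain was flagged on ovirt-sj-05. The following is a minimal sketch, not part of the original thread, that pairs each 'PROBLEMATIC' warning with the matching 'recovered from problem' line for the same domain and host, and reports the gap. The regular expressions are assumptions based only on the lines quoted here; other engine versions may format these messages differently.

```python
import re
from datetime import datetime

# Patterns derived from the engine.log lines quoted in this thread (assumption).
PROBLEM = re.compile(
    r"^(\S+ \S+) WARN .*domain '([^']+)' in problem 'PROBLEMATIC'\. vds: '([^']+)'"
)
RECOVERED = re.compile(
    r"^(\S+ \S+) INFO .*Domain '([^']+)' recovered from problem\. vds: '([^']+)'"
)


def parse_ts(text):
    """Parse an engine.log timestamp such as '2019-08-07 16:00:25,618Z'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%fZ")


def outage_durations(lines):
    """Return [(host, seconds)] for each problem/recovery pair found."""
    pending = {}
    durations = []
    for line in lines:
        m = PROBLEM.search(line)
        if m:
            pending[(m[2], m[3])] = parse_ts(m[1])
            continue
        m = RECOVERED.search(line)
        if m and (m[2], m[3]) in pending:
            start = pending.pop((m[2], m[3]))
            durations.append((m[3], (parse_ts(m[1]) - start).total_seconds()))
    return durations


DOMAIN = "bda97276-a399-448f-9113-017972f6b55a:ovirt_production"
PREFIX = "[org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] "
sample = [
    "2019-08-07 16:00:25,618Z WARN " + PREFIX
    + "(EE-ManagedThreadFactory-engine-Thread-37723) [] domain '" + DOMAIN
    + "' in problem 'PROBLEMATIC'. vds: 'ovirt-sj-05.ictv.com'",
    "2019-08-07 16:00:40,630Z INFO " + PREFIX
    + "(EE-ManagedThreadFactory-engine-Thread-37735) [] Domain '" + DOMAIN
    + "' recovered from problem. vds: 'ovirt-sj-05.ictv.com'",
    "2019-08-07 16:00:40,652Z INFO " + PREFIX
    + "(EE-ManagedThreadFactory-engine-Thread-37737) [] Domain '" + DOMAIN
    + "' recovered from problem. vds: 'ovirt-sj-01.ictv.com'",
]

durations = outage_durations(sample)
for host, seconds in durations:
    print(f"{host}: flagged for {seconds:.3f} s")
```

Recovery lines with no preceding warning for the same host (ovirt-sj-01 in the sample) are ignored, since their start time is unknown.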

Log lines from VDSM:
“[root@ovirt-sj-05 ~]# tail -f /var/log/vdsm/vdsm.log | grep WARN
2019-08-07 09:40:03,556-0700 WARN (check/loop) [storage.check] Checker u'/rhev/data-center/mnt/10.210.13.64:_ovirt__production/bda97276-a399-448f-9113-017972f6b55a/dom_md/metadata' is blocked for 20.00 seconds (check:282)
2019-08-07 09:40:47,132-0700 WARN (monitor/bda9727) [storage.Monitor] Host id for domain bda97276-a399-448f-9113-017972f6b55a was released (id: 5) (monitor:445)
2019-08-07 09:44:53,564-0700 WARN (check/loop) [storage.check] Checker u'/rhev/data-center/mnt/10.210.13.64:_ovirt__production/bda97276-a399-448f-9113-017972f6b55a/dom_md/metadata' is blocked for 20.00 seconds (check:282)
2019-08-07 09:46:38,604-0700 WARN (monitor/bda9727) [storage.Monitor] Host id for domain bda97276-a399-448f-9113-017972f6b55a was released (id: 5) (monitor:445)”
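To see how often the checker stalls, and on which path, the "is blocked for N seconds" warnings can be grouped by path. This is a rough sketch rather than a VDSM tool: the regular expression is an assumption modelled on the two warnings quoted above, and `summarize_blocked` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Pattern modelled on the vdsm.log warnings quoted in this thread (assumption).
BLOCKED = re.compile(
    r"^(?P<ts>\S+ \S+) WARN .*\[storage\.check\] Checker u?'(?P<path>[^']+)'"
    r" is blocked for (?P<secs>[\d.]+) seconds"
)


def summarize_blocked(lines):
    """Return {path: [(timestamp, seconds), ...]} for blocked-checker warnings."""
    events = defaultdict(list)
    for line in lines:
        match = BLOCKED.search(line)
        if match:
            events[match["path"]].append((match["ts"], float(match["secs"])))
    return dict(events)


METADATA = (
    "/rhev/data-center/mnt/10.210.13.64:_ovirt__production/"
    "bda97276-a399-448f-9113-017972f6b55a/dom_md/metadata"
)
sample = [
    "2019-08-07 09:40:03,556-0700 WARN (check/loop) [storage.check] Checker u'"
    + METADATA + "' is blocked for 20.00 seconds (check:282)",
    "2019-08-07 09:40:47,132-0700 WARN (monitor/bda9727) [storage.Monitor] Host id "
    "for domain bda97276-a399-448f-9113-017972f6b55a was released (id: 5) (monitor:445)",
    "2019-08-07 09:44:53,564-0700 WARN (check/loop) [storage.check] Checker u'"
    + METADATA + "' is blocked for 20.00 seconds (check:282)",
]

result = summarize_blocked(sample)
for path, events in result.items():
    print(path)
    for timestamp, seconds in events:
        print(f"  {timestamp}: blocked for {seconds:.2f} s")
```

On a host, the same function could be fed the lines of /var/log/vdsm/vdsm.log instead of the inline sample.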

Another one that seems to be related:
2019-08-07 14:43:59,069-0700 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/10.210.13.64:_ovirt__production/6effda5e-1a0d-4312-bf93-d97fa9eb5aee/dom_md/metadata (monitor:499)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 497, in _pathChecked
    delay = result.delay()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/check.py", line 391, in delay
    raise exception.MiscFileReadException(self.path, self.rc, self.err)
MiscFileReadException: Internal file read failure: (u'/rhev/data-center/mnt/10.210.13.64:_ovirt__production/6effda5e-1a0d-4312-bf93-d97fa9eb5aee/dom_md/metadata', 1, 'Read timeout')
2019-08-07 14:43:59,116-0700 WARN (monitor/6effda5) [storage.Monitor] Host id for domain 6effda5e-1a0d-4312-bf93-d97fa9eb5aee was released (id: 1) (monitor:445)

This means VDSM lost connectivity to the storage, but it also looks like it recovered eventually.
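When a host loses and regains storage connectivity like this, the NFS timeout settings are worth a look. The sketch below is not from the thread; it parses the option string of the ovirt_production mount quoted in the original post and converts `timeo` to seconds. Per nfs(5), `timeo` is expressed in tenths of a second, and with `soft` the client returns an error to the caller after `retrans` retransmissions instead of retrying indefinitely. `parse_mount_options` is a hypothetical helper name.

```python
def parse_mount_options(options):
    """Split a comma-separated mount option string into a dict.

    Flags without a value (for example 'soft') map to True.
    """
    parsed = {}
    for item in options.strip("()").split(","):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


# Option string copied from the ovirt_production mount in the original post.
opts = parse_mount_options(
    "(rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,soft,"
    "nosharecache,proto=tcp,timeo=600,retrans=6,sec=sys,"
    "clientaddr=10.210.11.14,local_lock=none,addr=10.210.13.64)"
)

# nfs(5): timeo is given in deciseconds, so divide by 10 to get seconds.
timeo_seconds = int(opts["timeo"]) / 10
retrans = int(opts["retrans"])

print("soft mount:", opts.get("soft", False))
print("timeo:", timeo_seconds, "s")
print("retrans:", retrans)
```

These values only describe the client side; whether they are appropriate depends on the storage and network, which the thread does not establish.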

Hi,
Can you please clarify the flow you're doing?
Also, can you please attach the full vdsm and engine logs?
Regards,
Shani Leviim

Hey Shani,
Thank you for the reply.
Sure, I will attach the full logs asap.
What do you mean by “flow you are doing”?
Kindly awaiting your reply.
Marko Vrgotic

Log files from the oVirt engine and from VDSM on ovirt-sj-05 are attached.
They relate to the host named ovirt-sj-05.ictv.com.
Kindly awaiting your reply.
Kind regards,
Marko Vrgotic

Hi Marko,
It seems that there's a connectivity problem with host 10.210.13.64.
Can you please make sure the metadata under /rhev/data-center/mnt/10.210.13.64:_ovirt__production/6effda5e-1a0d-4312-bf93-d97fa9eb5aee/dom_md/metadata is accessible?
Regards,
Shani Leviim
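One simple way to check accessibility from a host is to time a small read of the metadata file. The following is a hedged sketch, not a VDSM command: `read_delay` is a hypothetical helper, and the 4096-byte read size is an assumption. A plain read like this can be served from the client cache, so a fast result does not prove the NFS server is reachable; a slow or failing read is the more informative outcome.

```python
import os
import time


def read_delay(path, size=4096):
    """Read up to `size` bytes from `path`; return (bytes_read, seconds_taken)."""
    start = time.monotonic()
    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.read(fd, size)
    finally:
        os.close(fd)
    return len(data), time.monotonic() - start


# Path taken from the message above; it only exists on the affected hosts.
METADATA = (
    "/rhev/data-center/mnt/10.210.13.64:_ovirt__production/"
    "6effda5e-1a0d-4312-bf93-d97fa9eb5aee/dom_md/metadata"
)

if os.path.exists(METADATA):
    nbytes, seconds = read_delay(METADATA)
    print(f"read {nbytes} bytes in {seconds:.3f} s")
else:
    print("metadata path not found on this machine")
```

If the mount is hung, the call itself may block for as long as the NFS client keeps retrying, which is itself a useful signal.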
_______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/ICRKHD3GXTPQEZ...
participants (3)
- Benny Zlotnik
- Shani Leviim
- Vrgotic, Marko