Hi,
Any ideas if or how I can recover the storage domain? I will need to destroy
it, as the ongoing SCSI scans are becoming an impediment.
Thank you and all the best,
Oliver
-----Original Message-----
From: Oliver Albl <oliver.albl(a)fabasoft.com>
Sent: Tuesday, 5 November 2019 11:20
To: users(a)ovirt.org
Subject: [ovirt-users] Re: Cannot activate/deactivate storage domain
On Mon, Nov 4, 2019 at 9:18 PM Albl, Oliver <Oliver.Albl(a)fabasoft.com> wrote:
What was the last change in the system? upgrade? network change? storage
change?
The last change was four weeks ago: an oVirt upgrade from 4.3.3 to 4.3.6.7
(including upgrading the CentOS hosts to 7.7 1908).
This is expected if some domain is not accessible on all hosts.
This means sanlock timed out renewing the lockspace.
If a host cannot access all storage domains in the DC, the system sets
it to non-operational, and will probably try to reconnect it later.
This means reading 4k from the start of the metadata LV took 9.6 seconds.
Something on the path to storage is bad (kernel, network, storage).
We have a 20-second grace time (4 retries, 5 seconds per retry) in multipath
when there are no active paths, before I/O fails, pausing the VM. We
also resume paused VMs when storage monitoring works again, so maybe
the VMs were paused and resumed.
However, for storage monitoring we have a strict 10-second timeout. If
reading from the metadata LV times out or fails, and does not operate
normally again after 5 minutes, the domain will become inactive.
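
To make the two thresholds concrete, here is a minimal Python sketch of this kind of check. It is not vdsm's actual monitoring code: the names `timed_read` and `DomainStatus` are made up for illustration, the read does not use direct I/O, and the timeout is not enforced (a blocked read would simply hang), so treat it only as a rough way to reason about the numbers above.

```python
import os
import time

READ_TIMEOUT = 10.0      # seconds; the strict monitoring timeout mentioned above
INACTIVE_AFTER = 5 * 60  # seconds of continuous failure before the domain goes inactive


def timed_read(path, size=4096):
    """Read `size` bytes from the start of `path` and return the elapsed seconds."""
    start = time.monotonic()
    fd = os.open(path, os.O_RDONLY)
    try:
        os.read(fd, size)
    finally:
        os.close(fd)
    return time.monotonic() - start


class DomainStatus:
    """Hypothetical tracker: active until checks have failed for 5 minutes."""

    def __init__(self):
        self.first_failure = None  # time of the first failure in the current streak

    def update(self, ok, now):
        """Record one check result at time `now` (seconds); return True if active."""
        if ok:
            self.first_failure = None
        elif self.first_failure is None:
            self.first_failure = now
        return self.first_failure is None or now - self.first_failure < INACTIVE_AFTER
```

A check would count as failed when `timed_read(path)` raises or returns more than `READ_TIMEOUT`, where `path` is the metadata LV of the domain in question. By that measure the 9.6-second read mentioned above was just under the limit.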
This can explain the read timeouts.
This looks like the right way to troubleshoot this.
We need vdsm logs to understand this failure.
This does not mean OVF is corrupted, only that we could not store new
data. The older data on the other OVFSTORE disk is probably fine.
Hopefully the system will not try to write to the other OVFSTORE disk,
overwriting the last good version.
This is normal, the first 2048 bytes are always zeroes. This area was
used for domain metadata in older versions.
Please share more details:
- output of "lsblk"
- output of "multipath -ll"
- output of "/usr/libexec/vdsm/fc-scan -v"
- output of "vgs -o +tags problem-domain-id"
- output of "lvs -o +tags problem-domain-id"
- contents of /etc/multipath.conf
- contents of /etc/multipath.conf.d/*.conf
- /var/log/messages since the issue started
- /var/log/vdsm/vdsm.log* since the issue started on one of the hosts
A bug is probably the best place to keep these logs and make it easy to
track.
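
If it helps, the command outputs requested above could be gathered in one go with something like the following Python sketch. The helper names `diagnostic_commands` and `collect` are invented for this example, and the domain ID has to be filled in by hand. The configuration files and logs from the list are not covered and still need to be copied separately.

```python
import subprocess
from pathlib import Path


def diagnostic_commands(domain_id):
    """Return the commands from the list above, keyed by an output file name."""
    return {
        "lsblk": ["lsblk"],
        "multipath-ll": ["multipath", "-ll"],
        "fc-scan": ["/usr/libexec/vdsm/fc-scan", "-v"],
        "vgs": ["vgs", "-o", "+tags", domain_id],
        "lvs": ["lvs", "-o", "+tags", domain_id],
    }


def collect(commands, outdir):
    """Run each command and save stdout and stderr to <outdir>/<name>.txt.

    Returns a dict of exit codes; None means the command could not be started.
    """
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    results = {}
    for name, argv in commands.items():
        try:
            proc = subprocess.run(
                argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
            )
            output, rc = proc.stdout, proc.returncode
        except OSError as e:
            # Command missing or not executable on this host.
            output, rc = ("could not run: %s\n" % e).encode(), None
        (outdir / (name + ".txt")).write_bytes(output)
        results[name] = rc
    return results
```

Run as root on one of the affected hosts, for example `collect(diagnostic_commands("problem-domain-id"), "/tmp/diag")`, with the real domain ID substituted and `/tmp/diag` being an arbitrary output directory. The resulting files can then be attached to the bug.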
Please see
https://bugzilla.redhat.com/show_bug.cgi?id=1768821
Thanks,
Nir
Thank you!
Oliver