
RequestError: failed to read metadata: [Errno 2] No such file or directory: '/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8'
ls -al /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
-rw-rw----. 1 vdsm kvm 1028096 Jan 12 09:59 /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
Is this due to the symlink problem you guys are referring to that was addressed in RC1 or something else?
No, this file is the symlink. It should point to somewhere inside /rhev/. In your case it is a regular file of roughly 1 MB instead, which is really interesting. Can you please stop all hosted engine tooling (ovirt-ha-agent, ovirt-ha-broker), move the file aside (the metadata file is not important while the services are stopped, but better safe than sorry) and then start the services again?
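Something along these lines should do it (a rough sketch using the path from your output; the backup destination is just an example, pick whatever suits you):

  # stop the hosted engine HA services on this host
  systemctl stop ovirt-ha-agent ovirt-ha-broker

  # this should normally be a symlink into /rhev/; in your case it is a regular file
  MD=/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
  file "$MD"

  # move it aside, keeping it as a backup (better safe than sorry)
  mv "$MD" /root/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8.bak

  # start the services again
  systemctl start ovirt-ha-broker ovirt-ha-agent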
Could there possibly be a permissions problem somewhere?
Maybe, but the file itself looks out of the ordinary. I wonder how it got there.

Best regards,
Martin Sivak

On Fri, Jan 12, 2018 at 3:09 PM, Jayme <jaymef@gmail.com> wrote:
Thanks for the help thus far. Storage could be related, but all other VMs on the same storage are running fine. The storage is mounted via NFS from within one of the three hosts; I realize this is not ideal. It was set up by a previous admin more as a proof of concept, and VMs were put on it that should not have been placed in a proof-of-concept environment; it was intended to be rebuilt with proper storage down the road.
So the storage is on HOST0, and the other hosts mount it via NFS:
cultivar0.grove.silverorange.com:/exports/data           4861742080 1039352832 3822389248 22% /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_data
cultivar0.grove.silverorange.com:/exports/iso            4861742080 1039352832 3822389248 22% /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_iso
cultivar0.grove.silverorange.com:/exports/import_export  4861742080 1039352832 3822389248 22% /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_import__export
cultivar0.grove.silverorange.com:/exports/hosted_engine  4861742080 1039352832 3822389248 22% /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_hosted__engine
Like I said, the VM data storage itself seems to be working ok, as all other VMs appear to be running.
I'm curious why the broker log says this file is not found when the path is correct and I can see the file there:
RequestError: failed to read metadata: [Errno 2] No such file or directory: '/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8'
ls -al /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
-rw-rw----. 1 vdsm kvm 1028096 Jan 12 09:59 /var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
Is this due to the symlink problem you guys are referring to that was addressed in RC1 or something else? Could there possibly be a permissions problem somewhere?
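If it would help, I can check whether the vdsm user can actually read that file the same way the broker tries to. A rough sketch of the check I have in mind (just a proposal, I have not run this yet):

  # read the first 4 KiB as the vdsm user to rule out a plain permissions problem
  sudo -u vdsm dd if=/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8 of=/dev/null bs=4096 count=1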
Given that all three hosts have 4.2 rpms installed and the hosted engine will not start, is it safe for me to update the hosts to the 4.2 RC1 rpms? Or perhaps install that repo and *only* update the oVirt HA packages? (I assume I cannot yet apply the same updates to the inaccessible hosted engine VM itself.)
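By "only update the oVirt HA packages" I mean roughly something like this on each host (a sketch, assuming the 4.2.1 RC1 / pre-release repo is already enabled; package names taken from Martin's reply):

  # update just the hosted-engine HA tooling, leaving the rest of the 4.2 stack as-is
  yum update ovirt-hosted-engine-ha ovirt-hosted-engine-setup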
I should also mention one more thing: I originally upgraded the engine VM first with the new rpms and then ran engine-setup. It failed because the cluster was not in global maintenance, so I set global maintenance and ran it again; that appeared to complete as intended, but the engine never came back up afterwards. Mentioning it just in case it has anything to do with what happened.
Thanks very much again, I very much appreciate the help!
- Jayme
On Fri, Jan 12, 2018 at 8:44 AM, Simone Tiraboschi <stirabos@redhat.com> wrote:
On Fri, Jan 12, 2018 at 11:11 AM, Martin Sivak <msivak@redhat.com> wrote:
Hi,
the hosted engine agent issue might be fixed by restarting ovirt-ha-broker or updating to the newest ovirt-hosted-engine-ha and -setup. We improved the handling of the missing symlink.
Available just in oVirt 4.2.1 RC1
All the other issues seem to point to some storage problem, I am afraid.
You said you started the VM; do you see it in virsh -r list?
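For example, on the host where you started it:

  # read-only libvirt connection; lists running and defined domains
  virsh -r list --all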
Best regards
Martin Sivak
On Thu, Jan 11, 2018 at 10:00 PM, Jayme <jaymef@gmail.com> wrote:
Please help, I'm really not sure what else to try at this point. Thank you for reading!
I'm still working on trying to get my hosted engine running after a botched upgrade to 4.2. Storage is NFS, mounted from within one of the hosts. Right now I have 3 CentOS 7 hosts that are fully updated with yum packages from oVirt 4.2; the engine was fully updated with yum packages as well and failed to come up after reboot. As of right now everything should be fully updated and on 4.2 rpms. I have global maintenance mode on and started hosted-engine on one of the three hosts; the status is currently: Engine status : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
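(For reference, the commands I used for that were along these lines:)

  hosted-engine --set-maintenance --mode=global
  hosted-engine --vm-start
  hosted-engine --vm-status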
This is what I get when trying to open the console with hosted-engine --console:
The engine VM is running on this host
error: failed to get domain 'HostedEngine'
error: Domain not found: no domain with matching name 'HostedEngine'
Here are logs from various sources when I start the VM on HOST3:
hosted-engine --vm-start
Command VM.getStats with args {'vmID': '4013c829-c9d7-4b72-90d5-6fe58137504c'} failed:
(code=1, message=Virtual machine does not exist: {'vmId': u'4013c829-c9d7-4b72-90d5-6fe58137504c'})
Jan 11 16:55:57 cultivar3 systemd-machined: New machine qemu-110-Cultivar.
Jan 11 16:55:57 cultivar3 systemd: Started Virtual Machine qemu-110-Cultivar.
Jan 11 16:55:57 cultivar3 systemd: Starting Virtual Machine qemu-110-Cultivar.
Jan 11 16:55:57 cultivar3 kvm: 3 guests now active
==> /var/log/vdsm/vdsm.log <==
File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
ret = func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2718, in getStorageDomainInfo
dom = self.validateSdUUID(sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 304, in validateSdUUID
sdDom.validate()
File "/usr/lib/python2.7/site-packages/vdsm/storage/fileSD.py", line 515, in validate
raise se.StorageDomainAccessError(self.sdUUID)
StorageDomainAccessError: Domain is either partially accessible or entirely inaccessible: (u'248f46f0-d793-4581-9810-c9d965e2f286',)
jsonrpc/2::ERROR::2018-01-11 16:55:16,144::dispatcher::82::storage.Dispatcher::(wrapper) FINISH getStorageDomainInfo error=Domain is either partially accessible or entirely inaccessible: (u'248f46f0-d793-4581-9810-c9d965e2f286',)
==> /var/log/libvirt/qemu/Cultivar.log <==
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name guest=Cultivar,debug-threads=on -S -object
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-108-Cultivar/master-key.aes -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off,dump-guest-core=off -cpu Conroe -m 8192 -realtime mlock=off -smp 2,maxcpus=16,sockets=16,cores=1,threads=1 -uuid 4013c829-c9d7-4b72-90d5-6fe58137504c -smbios 'type=1,manufacturer=oVirt,product=oVirt
Node,version=7-4.1708.el7.centos,serial=44454C4C-4300-1034-8035-CAC04F424331,uuid=4013c829-c9d7-4b72-90d5-6fe58137504c' -no-user-config -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-108-Cultivar/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2018-01-11T20:33:19,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-reboot -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
file=/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/c2dde892-f978-4dfc-a421-c8e04cf387f9/23aa0a66-fa6c-4967-a1e5-fbe47c0cd705,format=raw,if=none,id=drive-virtio-disk0,serial=c2dde892-f978-4dfc-a421-c8e04cf387f9,cache=none,werror=stop,rerror=stop,aio=threads -device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive if=none,id=drive-ide0-1-0,readonly=on -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=32 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=00:16:3e:7f:d6:83,bus=pci.0,addr=0x3 -chardev
socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.com.redhat.rhevm.vdsm,server,nowait -device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev
socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.org.qemu.guest_agent.0,server,nowait -device
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel2,name=vdagent -device
virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0 -chardev
socket,id=charchannel3,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.org.ovirt.hosted-engine-setup.0,server,nowait -device
virtserialport,bus=virtio-serial0.0,nr=4,chardev=charchannel3,id=channel3,name=org.ovirt.hosted-engine-setup.0 -chardev pty,id=charconsole0 -device virtconsole,chardev=charconsole0,id=console0 -spice
tls-port=5900,addr=0,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=default,seamless-migration=on -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x5 -msg timestamp=on
2018-01-11T20:33:19.699999Z qemu-kvm: -chardev pty,id=charconsole0: char device redirected to /dev/pts/2 (label charconsole0)
2018-01-11 20:38:11.640+0000: shutting down, reason=shutdown
2018-01-11 20:39:02.122+0000: starting up libvirt version: 3.2.0, package: 14.el7_4.7 (CentOS BuildSystem <http://bugs.centos.org>, 2018-01-04-19:31:34, c1bm.rdu2.centos.org), qemu version: 2.9.0(qemu-kvm-ev-2.9.0-16.el7_4.13.1), hostname: cultivar3
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name guest=Cultivar,debug-threads=on -S -object
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-109-Cultivar/master-key.aes -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off,dump-guest-core=off -cpu Conroe -m 8192 -realtime mlock=off -smp 2,maxcpus=16,sockets=16,cores=1,threads=1 -uuid 4013c829-c9d7-4b72-90d5-6fe58137504c -smbios 'type=1,manufacturer=oVirt,product=oVirt
Node,version=7-4.1708.el7.centos,serial=44454C4C-4300-1034-8035-CAC04F424331,uuid=4013c829-c9d7-4b72-90d5-6fe58137504c' -no-user-config -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-109-Cultivar/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2018-01-11T20:39:02,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-reboot -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
file=/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/c2dde892-f978-4dfc-a421-c8e04cf387f9/23aa0a66-fa6c-4967-a1e5-fbe47c0cd705,format=raw,if=none,id=drive-virtio-disk0,serial=c2dde892-f978-4dfc-a421-c8e04cf387f9,cache=none,werror=stop,rerror=stop,aio=threads -device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive if=none,id=drive-ide0-1-0,readonly=on -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=32 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=00:16:3e:7f:d6:83,bus=pci.0,addr=0x3 -chardev
socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.com.redhat.rhevm.vdsm,server,nowait -device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev
socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.org.qemu.guest_agent.0,server,nowait -device
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel2,name=vdagent -device
virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0 -chardev
socket,id=charchannel3,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.org.ovirt.hosted-engine-setup.0,server,nowait -device
virtserialport,bus=virtio-serial0.0,nr=4,chardev=charchannel3,id=channel3,name=org.ovirt.hosted-engine-setup.0 -chardev pty,id=charconsole0 -device virtconsole,chardev=charconsole0,id=console0 -spice
tls-port=5900,addr=0,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=default,seamless-migration=on -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x5 -msg timestamp=on
2018-01-11T20:39:02.380773Z qemu-kvm: -chardev pty,id=charconsole0: char device redirected to /dev/pts/2 (label charconsole0)
2018-01-11 20:53:11.407+0000: shutting down, reason=shutdown
2018-01-11 20:55:57.210+0000: starting up libvirt version: 3.2.0, package: 14.el7_4.7 (CentOS BuildSystem <http://bugs.centos.org>, 2018-01-04-19:31:34, c1bm.rdu2.centos.org), qemu version: 2.9.0(qemu-kvm-ev-2.9.0-16.el7_4.13.1), hostname: cultivar3.grove.silverorange.com
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name guest=Cultivar,debug-threads=on -S -object
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-110-Cultivar/master-key.aes -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off,dump-guest-core=off -cpu Conroe -m 8192 -realtime mlock=off -smp 2,maxcpus=16,sockets=16,cores=1,threads=1 -uuid 4013c829-c9d7-4b72-90d5-6fe58137504c -smbios 'type=1,manufacturer=oVirt,product=oVirt
Node,version=7-4.1708.el7.centos,serial=44454C4C-4300-1034-8035-CAC04F424331,uuid=4013c829-c9d7-4b72-90d5-6fe58137504c' -no-user-config -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-110-Cultivar/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2018-01-11T20:55:57,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-reboot -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
file=/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/c2dde892-f978-4dfc-a421-c8e04cf387f9/23aa0a66-fa6c-4967-a1e5-fbe47c0cd705,format=raw,if=none,id=drive-virtio-disk0,serial=c2dde892-f978-4dfc-a421-c8e04cf387f9,cache=none,werror=stop,rerror=stop,aio=threads -device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive if=none,id=drive-ide0-1-0,readonly=on -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=32 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=00:16:3e:7f:d6:83,bus=pci.0,addr=0x3 -chardev
socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.com.redhat.rhevm.vdsm,server,nowait -device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev
socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.org.qemu.guest_agent.0,server,nowait -device
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel2,name=vdagent -device
virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0 -chardev
socket,id=charchannel3,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.org.ovirt.hosted-engine-setup.0,server,nowait -device
virtserialport,bus=virtio-serial0.0,nr=4,chardev=charchannel3,id=channel3,name=org.ovirt.hosted-engine-setup.0 -chardev pty,id=charconsole0 -device virtconsole,chardev=charconsole0,id=console0 -spice
tls-port=5900,addr=0,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=default,seamless-migration=on -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x5 -msg timestamp=on
2018-01-11T20:55:57.468037Z qemu-kvm: -chardev pty,id=charconsole0: char device redirected to /dev/pts/2 (label charconsole0)
==> /var/log/ovirt-hosted-engine-ha/broker.log <==
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 151, in get_raw_stats
    f = os.open(path, direct_flag | os.O_RDONLY | os.O_SYNC)
OSError: [Errno 2] No such file or directory: '/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8'
StatusStorageThread::ERROR::2018-01-11 16:55:15,761::status_broker::92::ovirt_hosted_engine_ha.broker.status_broker.StatusBroker.Update::(run) Failed to read state.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/status_broker.py", line 88, in run
    self._storage_broker.get_raw_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 162, in get_raw_stats
    .format(str(e)))
RequestError: failed to read metadata: [Errno 2] No such file or directory: '/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/14a20941-1b84-4b82-be8f-ace38d7c037a/8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8'
==> /var/log/ovirt-hosted-engine-ha/agent.log <==
    result = refresh_method()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 519, in refresh_vm_conf
    content = self._get_file_content_from_shared_storage(VM)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 484, in _get_file_content_from_shared_storage
    config_volume_path = self._get_config_volume_path()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 188, in _get_config_volume_path
    conf_vol_uuid
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/heconflib.py", line 358, in get_volume_path
    root=envconst.SD_RUN_DIR,
RuntimeError: Path to volume 4838749f-216d-406b-b245-98d0343fcf7f not found in /run/vdsm/storag
==> /var/log/vdsm/vdsm.log <==
periodic/42::ERROR::2018-01-11 16:56:11,446::vmstats::260::virt.vmstats::(send_metrics) VM metrics collection failed
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line 197, in send_metrics
data[prefix + '.cpu.usage'] = stat['cpuUsage']
KeyError: 'cpuUsage'
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users