The lock is the issue.
- try running "sanlock client status" on all hosts (see the sketch below)
- also make sure you do not have some forgotten host still connected to the
lockspace without the HA daemons running (but with the VM still up on it)
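
A rough sketch of that check (host names are just placeholders for your
three hosts; run it from any machine with ssh access to them):

    for h in host0 host2 host3; do
        echo "== $h =="
        ssh root@"$h" sanlock client status
    done

Each host should print the lockspaces and resources (leases) it is
currently holding, so a host that still owns the hosted-engine lease
should show up there even if its HA daemons are stopped.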
I need to go to our presidential election now; I might check email
later tonight.
Martin
On Fri, Jan 12, 2018 at 4:59 PM, Jayme <jaymef@gmail.com> wrote:
> Here are the newest logs from me trying to start hosted vm:
>
> ==> /var/log/messages <==
> Jan 12 11:58:14 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered blocking
> state
> Jan 12 11:58:14 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered disabled
> state
> Jan 12 11:58:14 cultivar0 kernel: device vnet4 entered promiscuous mode
> Jan 12 11:58:14 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered blocking
> state
> Jan 12 11:58:14 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered
> forwarding state
> Jan 12 11:58:14 cultivar0 lldpad: recvfrom(Event interface): No buffer space
> available
> Jan 12 11:58:14 cultivar0 NetworkManager[1092]: <info> [1515772694.8715]
> manager: (vnet4): new Tun device
> (/org/freedesktop/NetworkManager/Devices/140)
> Jan 12 11:58:14 cultivar0 NetworkManager[1092]: <info> [1515772694.8795]
> device (vnet4): state change: unmanaged -> unavailable (reason
> 'connection-assumed') [10 20 41]
>
> ==> /var/log/libvirt/qemu/Cultivar.log <==
> 2018-01-12 15:58:14.879+0000: starting up libvirt version: 3.2.0, package:
> 14.el7_4.7 (CentOS BuildSystem <http://bugs.centos.org>,
> 2018-01-04-19:31:34, c1bm.rdu2.centos.org), qemu version:
> 2.9.0(qemu-kvm-ev-2.9.0-16.el7_4.13.1), hostname:
> cultivar0.grove.silverorange.com
> LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
> QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name
> guest=Cultivar,debug-threads=on -S -object
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-119-Cultivar/master-key.aes
> -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off,dump-guest-core=off -cpu
> Conroe -m 8192 -realtime mlock=off -smp
> 2,maxcpus=16,sockets=16,cores=1,threads=1 -uuid
> 4013c829-c9d7-4b72-90d5-6fe58137504c -smbios
> 'type=1,manufacturer=oVirt,product=oVirt
> Node,version=7-4.1708.el7.centos,serial=44454C4C-3300-1042-8031-B4C04F4B4831,uuid=4013c829-c9d7-4b72-90d5-6fe58137504c'
> -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-119-Cultivar/monitor.sock,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=2018-01-12T15:58:14,driftfix=slew -global
> kvm-pit.lost_tick_policy=delay -no-hpet -no-reboot -boot strict=on -device
> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
> virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
> file=/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/c2dde892-f978-4dfc-a421-c8e04cf387f9/23aa0a66-fa6c-4967-a1e5-fbe47c0cd705,format=raw,if=none,id=drive-virtio-disk0,serial=c2dde892-f978-4dfc-a421-c8e04cf387f9,cache=none,werror=stop,rerror=stop,aio=threads
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> -drive if=none,id=drive-ide0-1-0,readonly=on -device
> ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev
> tap,fd=35,id=hostnet0,vhost=on,vhostfd=38 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:16:3e:7f:d6:83,bus=pci.0,addr=0x3
> -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.com.redhat.rhevm.vdsm,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm
> -chardev
> socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.org.qemu.guest_agent.0,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
> -chardev spicevmc,id=charchannel2,name=vdagent -device
> virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0
> -chardev
> socket,id=charchannel3,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.org.ovirt.hosted-engine-setup.0,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=4,chardev=charchannel3,id=channel3,name=org.ovirt.hosted-engine-setup.0
> -chardev pty,id=charconsole0 -device
> virtconsole,chardev=charconsole0,id=console0 -spice
> tls-port=5904,addr=0,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=default,seamless-migration=on
> -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -object
> rng-random,id=objrng0,filename=/dev/urandom -device
> virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x5 -msg timestamp=on
>
> ==> /var/log/messages <==
> Jan 12 11:58:14 cultivar0 NetworkManager[1092]: <info> [1515772694.8807]
> device (vnet4): state change: unavailable -> disconnected (reason 'none')
> [20 30 0]
> Jan 12 11:58:14 cultivar0 systemd-machined: New machine qemu-119-Cultivar.
> Jan 12 11:58:14 cultivar0 systemd: Started Virtual Machine
> qemu-119-Cultivar.
> Jan 12 11:58:14 cultivar0 systemd: Starting Virtual Machine
> qemu-119-Cultivar.
>
> ==> /var/log/libvirt/qemu/Cultivar.log <==
> 2018-01-12T15:58:15.094002Z qemu-kvm: -chardev pty,id=charconsole0: char
> device redirected to /dev/pts/1 (label charconsole0)
>
> ==> /var/log/messages <==
> Jan 12 11:58:15 cultivar0 kvm: 5 guests now active
>
> ==> /var/log/libvirt/qemu/Cultivar.log <==
> 2018-01-12 15:58:15.217+0000: shutting down, reason=failed
>
> ==> /var/log/messages <==
> Jan 12 11:58:15 cultivar0 libvirtd: 2018-01-12 15:58:15.217+0000: 1908:
> error : virLockManagerSanlockAcquire:1041 : resource busy: Failed to acquire
> lock: Lease is held by another host
>
> ==> /var/log/libvirt/qemu/Cultivar.log <==
> 2018-01-12T15:58:15.219934Z qemu-kvm: terminating on signal 15 from pid 1773
> (/usr/sbin/libvirtd)
>
> ==> /var/log/messages <==
> Jan 12 11:58:15 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered disabled
> state
> Jan 12 11:58:15 cultivar0 kernel: device vnet4 left promiscuous mode
> Jan 12 11:58:15 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered disabled
> state
> Jan 12 11:58:15 cultivar0 NetworkManager[1092]: <info> [1515772695.2348]
> device (vnet4): state change: disconnected -> unmanaged (reason 'unmanaged')
> [30 10 3]
> Jan 12 11:58:15 cultivar0 NetworkManager[1092]: <info> [1515772695.2349]
> device (vnet4): released from master device ovirtmgmt
> Jan 12 11:58:15 cultivar0 kvm: 4 guests now active
> Jan 12 11:58:15 cultivar0 systemd-machined: Machine qemu-119-Cultivar
> terminated.
>
> ==> /var/log/vdsm/vdsm.log <==
> vm/4013c829::ERROR::2018-01-12
> 11:58:15,444::vm::914::virt.vm::(_startUnderlyingVm)
> (vmId='4013c829-c9d7-4b72-90d5-6fe58137504c') The vm start process failed
> Traceback (most recent call last):
> File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 843, in
> _startUnderlyingVm
> self._run()
> File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2721, in
> _run
> dom.createWithFlags(flags)
> File "/usr/lib/python2.7/site-packages/vdsm/ libvirtconnection.py", line
> 126, in wrapper
> ret = f(*args, **kwargs)
> File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 512, in
> wrapper
> return func(inst, *args, **kwargs)
> File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1069, in
> createWithFlags
> if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed',
> dom=self)
> libvirtError: resource busy: Failed to acquire lock: Lease is held by
> another host
> jsonrpc/6::ERROR::2018-01-12
> 11:58:16,421::__init__::611::jsonrpc.JsonRpcServer::(_handle_request)
> Internal server error
> Traceback (most recent call last):
> File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__. py", line 606,
> in _handle_request
> res = method(**params)
> File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 201, in
> _dynamicMethod
> result = fn(*methodArgs)
> File "<string>", line 2, in getAllVmIoTunePolicies
> File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in
> method
> ret = func(*args, **kwargs)
> File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1354, in
> getAllVmIoTunePolicies
> io_tune_policies_dict = self._cif.getAllVmIoTunePolicies()
> File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 524, in
> getAllVmIoTunePolicies
> 'current_values': v.getIoTune()}
> File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3481, in
> getIoTune
> result = self.getIoTuneResponse()
> File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3500, in
> getIoTuneResponse
> res = self._dom.blockIoTune(
> File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain. py", line 47,
> in __getattr__
> % self.vmid)
> NotConnectedError: VM '4013c829-c9d7-4b72-90d5-6fe58137504c' was not defined
> yet or was undefined
>
> ==> /var/log/messages <==
> Jan 12 11:58:16 cultivar0 journal: vdsm jsonrpc.JsonRpcServer ERROR Internal
> server error#012Traceback (most recent call last):#012 File
> "/usr/lib/python2.7/site-packages/yajsonrpc/__init__. py", line 606, in
> _handle_request#012 res = method(**params)#012 File
> "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 201, in
> _dynamicMethod#012 result = fn(*methodArgs)#012 File "<string>", line 2,
> in getAllVmIoTunePolicies#012 File
> "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in
> method#012 ret = func(*args, **kwargs)#012 File
> "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1354, in
> getAllVmIoTunePolicies#012 io_tune_policies_dict =
> self._cif.getAllVmIoTunePolicies()#012 File
> "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 524, in
> getAllVmIoTunePolicies#012 'current_values': v.getIoTune()}#012 File
> "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3481, in
> getIoTune#012 result = self.getIoTuneResponse()#012 File
> "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3500, in
> getIoTuneResponse#012 res = self._dom.blockIoTune(#012 File
> "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain. py", line 47, in
> __getattr__#012 % self.vmid)#012NotConnectedError: VM
> '4013c829-c9d7-4b72-90d5-6fe58137504c' was not defined yet or was undefined
>
> On Fri, Jan 12, 2018 at 11:55 AM, Jayme <jaymef@gmail.com> wrote:
>>
>> One other tidbit I noticed: there seem to be fewer errors if I start the
>> VM in paused mode:
>>
>> but still shows: Engine status : {"reason": "bad vm
>> status", "health": "bad", "vm": "up", "detail": "Paused"}
>>
>> ==> /var/log/messages <==
>> Jan 12 11:55:05 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered
>> blocking state
>> Jan 12 11:55:05 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered
>> disabled state
>> Jan 12 11:55:05 cultivar0 kernel: device vnet4 entered promiscuous mode
>> Jan 12 11:55:05 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered
>> blocking state
>> Jan 12 11:55:05 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered
>> forwarding state
>> Jan 12 11:55:05 cultivar0 lldpad: recvfrom(Event interface): No buffer
>> space available
>> Jan 12 11:55:05 cultivar0 NetworkManager[1092]: <info> [1515772505.3625]
>> manager: (vnet4): new Tun device
>> (/org/freedesktop/NetworkManager/Devices/139)
>>
>> ==> /var/log/libvirt/qemu/Cultivar.log <==
>> 2018-01-12 15:55:05.370+0000: starting up libvirt version: 3.2.0, package:
>> 14.el7_4.7 (CentOS BuildSystem <http://bugs.centos.org>,
>> 2018-01-04-19:31:34, c1bm.rdu2.centos.org), qemu version:
>> 2.9.0(qemu-kvm-ev-2.9.0-16.el7_4.13.1), hostname:
>> cultivar0.grove.silverorange.com
>> LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
>> QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name
>> guest=Cultivar,debug-threads=on -S -object
>> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-118-Cultivar/master-key.aes
>> -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off,dump-guest-core=off -cpu
>> Conroe -m 8192 -realtime mlock=off -smp
>> 2,maxcpus=16,sockets=16,cores=1,threads=1 -uuid
>> 4013c829-c9d7-4b72-90d5-6fe58137504c -smbios
>> 'type=1,manufacturer=oVirt,product=oVirt
>> Node,version=7-4.1708.el7.centos,serial=44454C4C-3300-1042-8031-B4C04F4B4831,uuid=4013c829-c9d7-4b72-90d5-6fe58137504c'
>> -no-user-config -nodefaults -chardev
>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-118-Cultivar/monitor.sock,server,nowait
>> -mon chardev=charmonitor,id=monitor,mode=control -rtc
>> base=2018-01-12T15:55:05,driftfix=slew -global
>> kvm-pit.lost_tick_policy=delay -no-hpet -no-reboot -boot strict=on -device
>> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
>> virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
>> file=/var/run/vdsm/storage/248f46f0-d793-4581-9810-c9d965e2f286/c2dde892-f978-4dfc-a421-c8e04cf387f9/23aa0a66-fa6c-4967-a1e5-fbe47c0cd705,format=raw,if=none,id=drive-virtio-disk0,serial=c2dde892-f978-4dfc-a421-c8e04cf387f9,cache=none,werror=stop,rerror=stop,aio=threads
>> -device
>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>> -drive if=none,id=drive-ide0-1-0,readonly=on -device
>> ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev
>> tap,fd=35,id=hostnet0,vhost=on,vhostfd=38 -device
>> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:16:3e:7f:d6:83,bus=pci.0,addr=0x3
>> -chardev
>> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.com.redhat.rhevm.vdsm,server,nowait
>> -device
>> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm
>> -chardev
>> socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.org.qemu.guest_agent.0,server,nowait
>> -device
>> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
>> -chardev spicevmc,id=charchannel2,name=vdagent -device
>> virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0
>> -chardev
>> socket,id=charchannel3,path=/var/lib/libvirt/qemu/channels/4013c829-c9d7-4b72-90d5-6fe58137504c.org.ovirt.hosted-engine-setup.0,server,nowait
>> -device
>> virtserialport,bus=virtio-serial0.0,nr=4,chardev=charchannel3,id=channel3,name=org.ovirt.hosted-engine-setup.0
>> -chardev pty,id=charconsole0 -device
>> virtconsole,chardev=charconsole0,id=console0 -spice
>> tls-port=5904,addr=0,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=default,seamless-migration=on
>> -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -object
>> rng-random,id=objrng0,filename=/dev/urandom -device
>> virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x5 -msg timestamp=on
>>
>> ==> /var/log/messages <==
>> Jan 12 11:55:05 cultivar0 NetworkManager[1092]: <info> [1515772505.3689]
>> device (vnet4): state change: unmanaged -> unavailable (reason
>> 'connection-assumed') [10 20 41]
>> Jan 12 11:55:05 cultivar0 NetworkManager[1092]: <info> [1515772505.3702]
>> device (vnet4): state change: unavailable -> disconnected (reason 'none')
>> [20 30 0]
>> Jan 12 11:55:05 cultivar0 systemd-machined: New machine qemu-118-Cultivar.
>> Jan 12 11:55:05 cultivar0 systemd: Started Virtual Machine
>> qemu-118-Cultivar.
>> Jan 12 11:55:05 cultivar0 systemd: Starting Virtual Machine
>> qemu-118-Cultivar.
>>
>> ==> /var/log/libvirt/qemu/Cultivar.log <==
>> 2018-01-12T15:55:05.586827Z qemu-kvm: -chardev pty,id=charconsole0: char
>> device redirected to /dev/pts/1 (label charconsole0)
>>
>> ==> /var/log/messages <==
>> Jan 12 11:55:05 cultivar0 kvm: 5 guests now active
>>
>> On Fri, Jan 12, 2018 at 11:36 AM, Jayme <jaymef@gmail.com> wrote:
>>>
>>> Yeah I am in global maintenance:
>>>
>>> state=GlobalMaintenance
>>>
>>> host0: {"reason": "vm not running on this host", "health": "bad", "vm":
>>> "down", "detail": "unknown"}
>>> host2: {"reason": "vm not running on this host", "health": "bad", "vm":
>>> "down", "detail": "unknown"}
>>> host3: {"reason": "vm not running on this host", "health": "bad", "vm":
>>> "down", "detail": "unknown"}
>>>
>>> I understand the lock is an issue; I'll make sure the VM is fully stopped
>>> on all three hosts before starting it, but I don't think that is the issue
>>> at hand either. What concerns me most is that the agent seems unable to
>>> read the metadata. I think that might be the heart of the problem, but I'm
>>> not sure what is causing it.
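>>>
>>> If it helps, a quick way to test the read path directly (the paths are the
>>> ones from my earlier mail, quoted below, i.e. the volume under the symlink
>>> the agent complains about; run as the vdsm user) is something like:
>>>
>>>     cd /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_hosted__engine/248f46f0-d793-4581-9810-c9d965e2f286/images/14a20941-1b84-4b82-be8f-ace38d7c037a
>>>     sudo -u vdsm dd if=8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8 of=/dev/null bs=1M
>>>
>>> If dd finishes without an I/O error, the volume itself is readable and the
>>> problem is more likely in the agent/broker layer than in the storage.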
>>>
>>> On Fri, Jan 12, 2018 at 11:33 AM, Martin Sivak <msivak@redhat.com> wrote:
>>>>
>>>> > On all three hosts I ran hosted-engine --vm-shutdown; hosted-engine
>>>> > --vm-poweroff
>>>>
>>>> Are you in global maintenance? I think you were in one of the previous
>>>> emails, but worth checking.
>>>>
>>>> > I started ovirt-ha-broker with systemctl as root user but it does
>>>> > appear to be running under vdsm:
>>>>
>>>> That is the correct behavior.
>>>>
>>>> > libvirtError: resource busy: Failed to acquire lock: Lease is held by
>>>> > another host
>>>>
>>>> sanlock seems to think the VM is running somewhere, and unless you are in
>>>> global maintenance it is possible that some other host tried to start the
>>>> VM as well (that is why I asked the first question here).
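>>>>
>>>> Something like this on each host would confirm both things (standard
>>>> hosted-engine and sanlock commands):
>>>>
>>>>     hosted-engine --vm-status                        # every host should report state=GlobalMaintenance
>>>>     hosted-engine --set-maintenance --mode=global    # set it (from any host) if it does not
>>>>     sanlock client status                            # lists the leases this host is still holding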
>>>>
>>>> Martin
>>>>
>>>> On Fri, Jan 12, 2018 at 4:28 PM, Jayme <jaymef@gmail.com> wrote:
>>>> > Martin,
>>>> >
>>>> > Thanks so much for sticking with me, this is driving me crazy! I really
>>>> > do appreciate it, thanks again.
>>>> >
>>>> > Let's go through this:
>>>> >
>>>> > HE VM is down - YES
>>>> >
>>>> >
>>>> > HE agent fails when opening metadata using the symlink - YES
>>>> >
>>>> >
>>>> > the symlink is there and readable by vdsm:kvm - it appears to be:
>>>> >
>>>> >
>>>> > lrwxrwxrwx. 1 vdsm kvm 159 Jan 10 21:20
>>>> > 14a20941-1b84-4b82-be8f-ace38d7c037a
>>>> > ->
>>>> >
>>>> > /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_hosted__engine/248f46f0-d793-4581-9810-c9d965e2f286/images/14a20941-1b84-4b82-be8f-ace38d7c037a
>>>> >
>>>> >
>>>> > And the files in the linked directory exist and have vdsm:kvm perms as
>>>> > well:
>>>> >
>>>> >
>>>> > # cd
>>>> >
>>>> > /rhev/data-center/mnt/cultivar0.grove.silverorange.com:_exports_hosted__engine/248f46f0-d793-4581-9810-c9d965e2f286/images/14a20941-1b84-4b82-be8f-ace38d7c037a
>>>> >
>>>> > [root@cultivar0 14a20941-1b84-4b82-be8f-ace38d7c037a]# ls -al
>>>> >
>>>> > total 2040
>>>> >
>>>> > drwxr-xr-x. 2 vdsm kvm 4096 Jan 12 10:51 .
>>>> >
>>>> > drwxr-xr-x. 8 vdsm kvm 4096 Feb 3 2016 ..
>>>> >
>>>> > -rw-rw----. 1 vdsm kvm 1028096 Jan 12 11:19
>>>> > 8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8
>>>> >
>>>> > -rw-rw----. 1 vdsm kvm 1048576 Feb 3 2016
>>>> > 8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8.lease
>>>> >
>>>> > -rw-r--r--. 1 vdsm kvm 283 Feb 3 2016
>>>> > 8582bdfc-ef54-47af-9f1e-f5b7ec1f1cf8.meta
>>>> >
>>>> >
>>>> > I started ovirt-ha-broker with systemctl as root user but it does
>>>> > appear to
>>>> > be running under vdsm:
>>>> >
>>>> >
>>>> > vdsm 16928 0.6 0.0 1618244 43328 ? Ssl 10:33 0:18
>>>> > /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker
>>>> >
>>>> >
>>>> >
>>>> > Here is something I tried:
>>>> >
>>>> >
>>>> > - On all three hosts I ran hosted-engine --vm-shutdown; hosted-engine
>>>> > --vm-poweroff
>>>> >
>>>> > - On HOST0 (cultivar0) I disconnected and reconnected storage using
>>>> > hosted-engine
>>>> >
>>>> > - Tried starting up the hosted VM on cultivar0 while tailing the logs:
>>>> >
>>>> >
>>>> > # hosted-engine --vm-start
>>>> >
>>>> > VM exists and is down, cleaning up and restarting
>>>> >
>>>> >
>>>> >
>>>> > ==> /var/log/vdsm/vdsm.log <==
>>>> >
>>>> > jsonrpc/2::ERROR::2018-01-12
>>>> > 11:27:27,194::vm::1766::virt.vm::(_getRunningVmStats)
>>>> > (vmId='4013c829-c9d7-4b72-90d5-6fe58137504c') Error fetching vm stats
>>>> >
>>>> > Traceback (most recent call last):
>>>> >
>>>> > File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1762,
>>>> > in
>>>> > _getRunningVmStats
>>>> >
>>>> > vm_sample.interval)
>>>> >
>>>> > File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py" , line
>>>> > 45, in
>>>> > produce
>>>> >
>>>> > networks(vm, stats, first_sample, last_sample, interval)
>>>> >
>>>> > File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py" , line
>>>> > 322, in
>>>> > networks
>>>> >
>>>> > if nic.name.startswith('hostdev'):
>>>> >
>>>> > AttributeError: name
>>>> >
>>>> > jsonrpc/3::ERROR::2018-01-12
>>>> > 11:27:27,221::__init__::611::jsonrpc.JsonRpcServer::(_handle_request)
>>>> > Internal server error
>>>> >
>>>> > Traceback (most recent call last):
>>>> >
>>>> > File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__. py", line
>>>> > 606,
>>>> > in _handle_request
>>>> >
>>>> > res = method(**params)
>>>> >
>>>> > File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line
>>>> > 201, in
>>>> > _dynamicMethod
>>>> >
>>>> > result = fn(*methodArgs)
>>>> >
>>>> > File "<string>", line 2, in getAllVmIoTunePolicies
>>>> >
>>>> > File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48,
>>>> > in
>>>> > method
>>>> >
>>>> > ret = func(*args, **kwargs)
>>>> >
>>>> > File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1354, in
>>>> > getAllVmIoTunePolicies
>>>> >
>>>> > io_tune_policies_dict = self._cif.getAllVmIoTunePolicies()
>>>> >
>>>> > File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 524,
>>>> > in
>>>> > getAllVmIoTunePolicies
>>>> >
>>>> > 'current_values': v.getIoTune()}
>>>> >
>>>> > File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3481,
>>>> > in
>>>> > getIoTune
>>>> >
>>>> > result = self.getIoTuneResponse()
>>>> >
>>>> > File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3500,
>>>> > in
>>>> > getIoTuneResponse
>>>> >
>>>> > res = self._dom.blockIoTune(
>>>> >
>>>> > File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain. py", line
>>>> > 47,
>>>> > in __getattr__
>>>> >
>>>> > % self.vmid)
>>>> >
>>>> > NotConnectedError: VM '4013c829-c9d7-4b72-90d5-6fe58137504c' was not
>>>> > defined
>>>> > yet or was undefined
>>>> >
>>>> >
>>>> > ==> /var/log/messages <==
>>>> >
>>>> > Jan 12 11:27:27 cultivar0 journal: vdsm jsonrpc.JsonRpcServer ERROR
>>>> > Internal
>>>> > server error#012Traceback (most recent call last):#012 File
>>>> > "/usr/lib/python2.7/site-packages/yajsonrpc/__init__. py", line 606, in
>>>> > _handle_request#012 res = method(**params)#012 File
>>>> > "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 201, in
>>>> > _dynamicMethod#012 result = fn(*methodArgs)#012 File "<string>",
>>>> > line 2,
>>>> > in getAllVmIoTunePolicies#012 File
>>>> > "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in
>>>> > method#012 ret = func(*args, **kwargs)#012 File
>>>> > "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1354, in
>>>> > getAllVmIoTunePolicies#012 io_tune_policies_dict =
>>>> > self._cif.getAllVmIoTunePolicies()#012 File
>>>> > "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 524, in
>>>> > getAllVmIoTunePolicies#012 'current_values': v.getIoTune()}#012
>>>> > File
>>>> > "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3481, in
>>>> > getIoTune#012 result = self.getIoTuneResponse()#012 File
>>>> > "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3500, in
>>>> > getIoTuneResponse#012 res = self._dom.blockIoTune(#012 File
>>>> > "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain. py", line 47, in
>>>> > __getattr__#012 % self.vmid)#012NotConnectedError: VM
>>>> > '4013c829-c9d7-4b72-90d5-6fe58137504c' was not defined yet or was
>>>> > undefined
>>>> >
>>>> > Jan 12 11:27:27 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered
>>>> > blocking
>>>> > state
>>>> >
>>>> > Jan 12 11:27:27 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered
>>>> > disabled
>>>> > state
>>>> >
>>>> > Jan 12 11:27:27 cultivar0 kernel: device vnet4 entered promiscuous
>>>> > mode
>>>> >
>>>> > Jan 12 11:27:27 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered
>>>> > blocking
>>>> > state
>>>> >
>>>> > Jan 12 11:27:27 cultivar0 kernel: ovirtmgmt: port 6(vnet4) entered
>>>> > forwarding state
>>>> >
>>>> > Jan 12 11:27:27 cultivar0 lldpad: recvfrom(Event interface): No buffer
>>>> > space
>>>> > available
>>>> >
>>>> > Jan 12 11:27:27 cultivar0 NetworkManager[1092]: <info>
>>>> > [1515770847.4264]
>>>> > manager: (vnet4): new Tun device
>>>> > (/org/freedesktop/NetworkManager/Devices/135)
>>>> >
>>>> > Jan 12 11:27:27 cultivar0 NetworkManager[1092]: <info>
>>>> > [1515770847.4342]
>>>> > device (vnet4): state change: unmanaged -> unavailable (reason
>>>> > 'connection-assumed') [10 20 41]
>>>> >
>>>> > Jan 12 11:27:27 cultivar0 NetworkManager[1092]: <info>
>>>> > [1515770847.4353]
>>>> > device (vnet4): state change: unavailable -> disconnected (reason
>>>> > 'none')
>>>> > [20 30 0]
>>>> >
>>>> >
>>>> > ==> /var/log/libvirt/qemu/Cultivar.log <==
>>>> >
>>>> > 2018-01-12 15:27:27.435+0000: starting up libvirt version: 3.2.0,
>>>> > package:
>>>> > 14.el7_4.7 (CentOS BuildSystem <http://bugs.centos.org>,
>>>> > 2018-01-04-19:31:34, c1bm.rdu2.centos.org), qemu version:
>>>> > 2.9.0(qemu-kvm-ev-2.9.0-16.el7_4.13.1), hostname:
>>>> > cultivar0.grove.silverorange.com
>>>> >
>>>> > LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
>>>> > QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name
>>>> > guest=Cultivar,debug-threads=on -S -object
>>>> >
>>>> > secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-114-Cultivar/master-key.aes
>>>> > -machine pc-i440fx-rhel7.3.0,accel=kvm,usb=off,dump-guest-core=off
>>>> > -cpu
>>>> > Conroe -m 8192 -realtime mlock=off -smp
>>>> > 2,maxcpus=16,sockets=16,cores=1,threads=1 -uuid
>>>> > 4013c829-c9d7-4b72-90d5-6fe58137504c -smbios