Re: Change IP Node and Manager
by Strahil
I think that you need to:
1. Put in maintenance
2. Unregister
3. Remove host
4. Change IP
5. Add host to oVirt again
Please, could someone more experienced share their thoughts as well?
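For illustration, a rough sketch of how steps 1 and 3 could be driven through the engine REST API (the engine URL, credentials and host UUID below are placeholders; the same actions are available from the Admin Portal):

ENGINE="https://engine.example.com/ovirt-engine/api"    # placeholder URL
AUTH="admin@internal:password"                           # placeholder credentials
HOST="00000000-0000-0000-0000-000000000000"              # host UUID from GET $ENGINE/hosts

# 1. Put the host into maintenance
curl -sk -u "$AUTH" -H 'Content-Type: application/xml' \
     -X POST -d '<action/>' "$ENGINE/hosts/$HOST/deactivate"

# 3. Remove the host from the cluster
curl -sk -u "$AUTH" -X DELETE "$ENGINE/hosts/$HOST"

# 4./5. Change the IP on the node itself (nmcli/ifcfg, /etc/hosts, DNS),
#       then re-add it from Compute > Hosts > New in the Admin Portal.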
Best Regards,
Strahil Nikolov
On Apr 15, 2019 22:49, "Sebastian Antunez N." <antunez.sebastian(a)gmail.com> wrote:
>
> Hello Guys
>
> I have 8 hosts with oVirt 4.1 and need to change the IP on all nodes.
>
> Is there a procedure to make the IP change? I have searched for information but cannot find a process to follow.
>
> Could someone help me understand how to change the IP on the nodes? I know that I must put the nodes in maintenance, but I do not know if I should change the manager first, add an additional IP and then re-add the nodes, etc.
>
> Thanks for the help
>
> Sebastian
Re: hosted engine does not start
by Strahil
Try with the VNC console: 'hosted-engine --add-console-password'
Then connect to the IP:port that the command reports and check what is going on.
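For illustration, a minimal sketch of that connection (the host name and port are placeholders; use the IP:port that the command prints):

hosted-engine --add-console-password                  # sets a temporary console password
remote-viewer vnc://kvm320.durchhalten.intern:5900    # 5900 is only an example port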
Maybe you will need a rescue DVD to mount all filesystems and then unmount them cleanly.
After that, just power it off and power it on normally.
If you can't use the custom engine config, use the XML definition from the VDSM log.
You will also need this alias, so you can use virsh freely (define/start/destroy):
alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'
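A minimal sketch of what that looks like, assuming the HostedEngine XML from vdsm.log has been saved to a file (the path is a placeholder):

alias virsh='virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf'
virsh define /root/hosted-engine.xml    # register the domain from the saved XML
virsh start HostedEngine                # boot the engine VM
virsh destroy HostedEngine              # hard power-off if it hangs again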
Best Regards,
Strahil Nikolov
On Apr 15, 2019 22:35, Stefan Wolf <shb256(a)gmail.com> wrote:
>
> Hello all,
>
> After a power loss the hosted engine won't start up anymore.
> I have the current oVirt installed.
> Storage is glusterfs and it is up and running.
>
> It is trying to start up the hosted engine but it does not work, and I can't see where the problem is.
>
> [root@kvm320 ~]# hosted-engine --vm-status
>
> --== Host 1 status ==--
>
> conf_on_shared_storage : True
> Status up-to-date : True
> Hostname : kvm380.durchhalten.intern
> Host ID : 1
> Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "Down"}
> Score : 1800
> stopped : False
> Local maintenance : False
> crc32 : 3ad6d0bd
> local_conf_timestamp : 14594
> Host timestamp : 14594
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=14594 (Mon Apr 15 21:25:12 2019)
>     host-id=1
>     score=1800
>     vm_conf_refresh_time=14594 (Mon Apr 15 21:25:12 2019)
>     conf_on_shared_storage=True
>     maintenance=False
>     state=GlobalMaintenance
>     stopped=False
>
> --== Host 2 status ==--
>
> conf_on_shared_storage : True
> Status up-to-date : True
> Hostname : kvm320.durchhalten.intern
> Host ID : 2
> Engine status : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
> Score : 0
> stopped : False
> Local maintenance : False
> crc32 : e7d4840d
> local_conf_timestamp : 21500
> Host timestamp : 21500
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=21500 (Mon Apr 15 21:25:22 2019)
>     host-id=2
>     score=0
>     vm_conf_refresh_time=21500 (Mon Apr 15 21:25:22 2019)
>     conf_on_shared_storage=True
>     maintenance=False
>     state=ReinitializeFSM
>     stopped=False
>
> --== Host 3 status ==--
>
> conf_on_shared_storage : True
> Status up-to-date : True
> Hostname : kvm360.durchhalten.intern
> Host ID : 3
> Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score : 1800
> stopped : False
> Local maintenance : False
> crc32 : cf9221cb
> local_conf_timestamp : 22121
> Host timestamp : 22120
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=22120 (Mon Apr 15 21:25:18 2019)
>     host-id=3
>     score=1800
>     vm_conf_refresh_time=22121 (Mon Apr 15 21:25:18 2019)
>     conf_on_shared_storage=True
>     maintenance=False
>     state=GlobalMaintenance
>     stopped=False
>
> [root@kvm320 ~]# virsh -r list
> Id Name Status
> ----------------------------------------------------
> 6 HostedEngine laufend
>
> [root@kvm320 ~]# hosted-engine --console
> The engine VM is running on this host
> Verbunden mit der Domain: HostedEngine
> Escape-Zeichen ist ^]
> Fehler: Interner Fehler: Zeichengerät <null> kann nicht gefunden warden
>
> In English it should be this:
>
> [root@mgmt~]# hosted-engine --console
> The engine VM is running on this host
> Connected to domain HostedEngine
> Escape character is ^]
> error: internal error: cannot find character device
>
> This is in the log:
>
> [root@kvm320 ~]# tail -f /var/log/ovirt-hosted-engine-ha/agent.log
> MainThread::INFO::2019-04-15 21:28:33,032::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 1800)
> MainThread::INFO::2019-04-15 21:28:43,050::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2019-04-15 21:28:43,165::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 1800)
> MainThread::INFO::2019-04-15 21:28:53,183::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2019-04-15 21:28:53,300::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 1800)
> MainThread::INFO::2019-04-15 21:29:03,317::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2019-04-15 21:29:03,434::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 1800)
> MainThread::INFO::2019-04-15 21:29:13,453::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2019-04-15 21:29:13,571::states::136::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Penalizing score by 1600 due to gateway status
> MainThread::INFO::2019-04-15 21:29:13,571::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 1800)
> MainThread::INFO::2019-04-15 21:29:22,589::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2019-04-15 21:29:22,712::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 1800)
>
> But it is not reachable over the network:
>
> [root@kvm320 ~]# ping 192.168.200.211
> PING 192.168.200.211 (192.168.200.211) 56(84) bytes of data.
> From 192.168.200.231 icmp_seq=1 Destination Host Unreachable
> From 192.168.200.231 icmp_seq=2 Destination Host Unreachable
> From 192.168.200.231 icmp_seq=3 Destination Host Unreachable
> From 192.168.200.231 icmp_seq=4 Destination Host Unreachable
>
> I tried to stop and start the VM again, but it didn't help.
>
> Maybe someone can give me some advice on how to get the hosted engine running again.
>
> Thx, Stefan
oVirt 4.3.2 - Cannot update Host via UI
by Strahil Nikolov
Hello guys,
I have the following issue after successfully updating my engine from 4.3.1 to 4.3.2 - I cannot update any host via the UI.
The event log shows the update starting, but there is no process running on the host, yum.log is not updated, and the engine log doesn't show anything meaningful.
Any hint where to look?
Thanks in advance.
Best Regards,
Strahil Nikolov
How to fix ovn apparent inconsistency?
by Gianluca Cecchi
Hello,
Passing from the old manually configured OVN to the current OVN in 4.3.1, it
seems I have some problems with OVN now.
I cannot assign an OVN network to a VM (powered on or off doesn't change anything).
When I add/edit a vNIC, the OVN networks are not among the possible choices.
The environment is composed of three hosts and one engine (external, on vSphere).
Over time the mgmt network has been configured on a network named ovirtmgmntZ2Z3.
On the engine it seems there are 2 switches for every defined OVN network
(ovn192 and ovn172).
Below is some command output, in case any inconsistency has remained that I
can purge.
Thanks in advance.
Gianluca
- On manager ovmgr1:
[root@ovmgr1 ~]# ovs-vsctl show
eae54ff9-b86c-4050-8241-46f44336ba94
ovs_version: "2.10.1"
[root@ovmgr1 ~]#
[root@ovmgr1 ~]# ovn-nbctl show
switch 32367d8a-460f-4447-b35a-abe9ea5187e0 (ovn192)
port affc5570-3e5a-439c-9fdf-d75d6810e3a3
addresses: ["00:1a:4a:17:01:73"]
port f639d541-2118-4c24-b478-b7a586eb170c
addresses: ["00:1a:4a:17:01:75"]
switch 6110649a-db2b-4de7-8fbc-601095cfe510 (ovn192)
switch 64c4c17f-cd67-4e29-939e-2b952495159f (ovn172)
port 32c348d9-12e9-4bcf-a43f-69338c887cfc
addresses: ["00:1a:4a:17:01:72 dynamic"]
port 3c77c2ea-de00-43f9-a5c5-9b3ffea5ec69
addresses: ["00:1a:4a:17:01:74 dynamic"]
switch 04501f6b-3977-4ba1-9ead-7096768d796d (ovn172)
port 0a2a47bc-ea0d-4f1d-8f49-ec903e519983
addresses: ["00:1a:4a:17:01:65 dynamic"]
port 8fc7bed4-7663-4903-922b-05e490c6a5a1
addresses: ["00:1a:4a:17:01:64 dynamic"]
port f2b64f89-b719-484c-ac02-2a1ac8eaacdb
addresses: ["00:1a:4a:17:01:59 dynamic"]
port f7389c88-1ea1-47c2-92fd-6beffb2e2190
addresses: ["00:1a:4a:17:01:58 dynamic"]
[root@ovmgr1 ~]#
- On host ov200 (10.4.192.32 on ovirtmgmntZ2Z3):
[root@ov200 ~]# ovs-vsctl show
ae0a1256-7250-46a2-a1b6-8f0ae6105c20
Bridge br-int
fail_mode: secure
Port br-int
Interface br-int
type: internal
Port "ovn-ddecf0-0"
Interface "ovn-ddecf0-0"
type: geneve
options: {csum="true", key=flow, remote_ip="10.4.192.33"}
Port "ovn-b8872a-0"
Interface "ovn-b8872a-0"
type: geneve
options: {csum="true", key=flow, remote_ip="10.4.192.34"}
ovs_version: "2.10.1"
[root@ov200 ~]#
- On host ov300 (10.4.192.33 on ovirtmgmntZ2Z3):
[root@ov300 ~]# ovs-vsctl show
f1a41e9c-16fb-4aa2-a386-2f366ade4d3c
Bridge br-int
fail_mode: secure
Port br-int
Interface br-int
type: internal
Port "ovn-b8872a-0"
Interface "ovn-b8872a-0"
type: geneve
options: {csum="true", key=flow, remote_ip="10.4.192.34"}
Port "ovn-1dce5b-0"
Interface "ovn-1dce5b-0"
type: geneve
options: {csum="true", key=flow, remote_ip="10.4.192.32"}
ovs_version: "2.10.1"
[root@ov300 ~]#
- On host ov301 (10.4.192.34 on ovirtmgmntZ2Z3):
[root@ov301 ~]# ovs-vsctl show
3a38c5bb-0abf-493d-a2e6-345af8aedfe3
Bridge br-int
fail_mode: secure
Port "ovn-1dce5b-0"
Interface "ovn-1dce5b-0"
type: geneve
options: {csum="true", key=flow, remote_ip="10.4.192.32"}
Port "ovn-ddecf0-0"
Interface "ovn-ddecf0-0"
type: geneve
options: {csum="true", key=flow, remote_ip="10.4.192.33"}
Port br-int
Interface br-int
type: internal
ovs_version: "2.10.1"
[root@ov301 ~]#
In web admin gui:
In network -> networks ->
- ovn192
Id: 8fd63a10-a2ba-4c56-a8e0-0bc8d70be8b5
VDSM Name: ovn192
External ID: 32367d8a-460f-4447-b35a-abe9ea5187e0
- ovn172
Id: 7546d5d3-a0e3-40d5-9d22-cf355da47d3a
VDSM Name: ovn172
External ID: 64c4c17f-cd67-4e29-939e-2b952495159f
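A hedged sketch of a possible cleanup, assuming the duplicates really are leftovers from the old manual setup: the switches whose UUIDs match the External IDs above (32367d8a... for ovn192, 64c4c17f... for ovn172) are the ones the engine knows about, so the other two look like candidates for removal. Review everything first, and be careful with the second ovn172 copy since it still carries ports:

ovn-nbctl show                     # re-check which switch holds which ports
ovn-nbctl list Logical_Switch      # full records, including external_ids
ovn-nbctl ls-del 6110649a-db2b-4de7-8fbc-601095cfe510   # the empty duplicate of ovn192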
oVirt 4.3.2.1-1.el7 Errors at VM boot
by Wood Peter
Hi all,
A few weeks ago I did a clean install of the latest oVirt-4.3.2 and
imported some VMs from oVirt-3. Three nodes running oVirt Node and oVirt
Engine installed on a separate system.
I noticed that sometimes some VMs will boot successfully but the Web UI
will still show "Powering UP" for days after the VM has been up. I can
power down the VM and power it back up, and it may update the Web UI status to
UP.
While debugging the above issue I noticed that some VMs will trigger errors
during boot. I can power on a VM on one node, see the errors below started
happening every 4-5 seconds, then power down the VM, errors stop, then
power up the VM on a different node without a problem. Another VM though
may trigger the errors on the same node.
Everything is very inconsistent. I can't find a pattern. I tried different
VMs, different nodes, and I'm getting mixed results. Hopefully the errors
will give some clue.
Here is what I'm seeing scrolling every 4-5 seconds:
-------------------------
On oVirt Node:
==> vdsm.log <==
2019-04-12 10:50:31,543-0700 ERROR (jsonrpc/3) [jsonrpc.JsonRpcServer]
Internal server error (__init__:350)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345,
in _handle_request
res = method(**params)
File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 194, in
_dynamicMethod
result = fn(*methodArgs)
File "<string>", line 2, in getAllVmStats
File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in
method
ret = func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1388, in
getAllVmStats
statsList = self._cif.getAllVmStats()
File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 567, in
getAllVmStats
return [v.getStats() for v in self.vmContainer.values()]
File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1766, in
getStats
oga_stats = self._getGuestStats()
File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1967, in
_getGuestStats
stats = self.guestAgent.getGuestInfo()
File "/usr/lib/python2.7/site-packages/vdsm/virt/guestagent.py", line
505, in getGuestInfo
del qga['appsList']
KeyError: 'appsList'
==> mom.log <==
2019-04-12 10:50:31,547 - mom.VdsmRpcBase - ERROR - Command
Host.getAllVmStats with args {} failed:
(code=-32603, message=Internal JSON-RPC error: {'reason': "'appsList'"})
----------------------
On oVirt Engine
2019-04-12 10:50:35,692-07 WARN
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-53) [] Unexpected return
value: Status [code=-32603, message=Internal JSON-RPC error: {'reason':
"'appsList'"}]
2019-04-12 10:50:35,693-07 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-53) [] Failed in
'GetAllVmStatsVDS' method
2019-04-12 10:50:35,693-07 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-53) [] Command
'GetAllVmStatsVDSCommand(HostName = sdod-ovnode-03,
VdsIdVDSCommandParametersBase:{hostId='12e38ad3-6327-4c94-8be4-88912d283729'})'
execution failed: VDSGenericException: VDSErrorException: Failed to
GetAllVmStatsVDS, error = Internal JSON-RPC error: {'reason':
"'appsList'"}, code = -32603
Thank you,
-- Peter
Change IP Node and Manager
by Sebastian Antunez N.
Hello Guys
I have 8 hosts with oVirt 4.1 and need to change the IP on all nodes.
Is there a procedure to make the IP change? I have searched for
information but cannot find a process to follow.
Could someone help me understand how to change the IP on the nodes? I know
that I must put the nodes in maintenance, but I do not know if I should
change the manager first, add an additional IP and then re-add the nodes, etc.
Thanks for the help
Sebastian
Poor I/O Performance (again...)
by Jim Kusznir
Hi all:
I've had I/O performance problems pretty much since the beginning of using
oVirt. I've applied several upgrades as time went on, but strangely, none
of them have alleviated the problem. VM disk I/O is still very slow to the
point that running VMs is often painful; it notably affects nearly all my
VMs, and makes me leery of starting any more. I'm currently running 12 VMs
and the hosted engine on the stack.
My configuration started out with 1Gbps networking and hyperconverged
gluster running on a single SSD on each node. It worked, but I/O was
painfully slow. I also started running out of space, so I added an SSHD on
each node, created another gluster volume, and moved VMs over to it. I
also ran that on a dedicated 1Gbps network. I had recurring disk failures
(seems that disks only lasted about 3-6 months; I warrantied all three at
least once, and some twice before giving up). I suspect the Dell PERC 6/i
was partly to blame; the raid card refused to see/acknowledge the disk, but
plugging it into a normal PC showed no signs of problems. In any case,
performance on that storage was notably bad, even though the gig-e
interface was rarely taxed.
I put in 10Gbps ethernet and moved all the storage onto that nonetheless,
as several people here said that 1Gbps just wasn't fast enough. Some
aspects improved a bit, but disk I/O is still slow. And I was still having
problems with the SSHD data gluster volume eating disks, so I bought a
dedicated NAS server (supermicro 12 disk dedicated FreeNAS NFS storage
system on 10Gbps ethernet). Set that up. I found that it was actually
FASTER than the SSD-based gluster volume, but still slow. Lately it's been
getting slower, too... I don't know why. The FreeNAS server reports network
loads around 4MB/s on its 10Gbe interface, so its not network constrained.
At 4MB/s, I'd sure hope the 12 spindle SAS interface wasn't constrained
either..... (and disk I/O operations on the NAS itself complete much
faster).
So, running a test on my NAS against an ISO file I haven't accessed in
months:
# dd
if=en_windows_server_2008_r2_standard_enterprise_datacenter_and_web_x64_dvd_x15-59754.iso
of=/dev/null bs=1024k count=500
500+0 records in
500+0 records out
524288000 bytes transferred in 2.459501 secs (213168465 bytes/sec)
Running it on one of my hosts:
root@unifi:/home/kusznir# time dd if=/dev/sda of=/dev/null bs=1024k
count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB, 500 MiB) copied, 7.21337 s, 72.7 MB/s
(I don't know if this is a true apples to apples comparison, as I don't
have a large file inside this VM's image). Even this is faster than I
often see.
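For a rougher but more comparable read test on the Linux side, something like the following might help rule the page cache in or out (the file path and sizes are placeholders):

dd if=/var/tmp/testfile.iso of=/dev/null bs=1M count=500 iflag=direct   # bypass the page cache
sync && echo 3 > /proc/sys/vm/drop_caches                               # or flush caches first...
dd if=/var/tmp/testfile.iso of=/dev/null bs=1M count=500                # ...and re-read through it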
I have a VoIP Phone server running as a VM. Voicemail and other recordings
usually fail due to IO issues opening and writing the files. Often, the
first 4 or so seconds of the recording is missed; sometimes the entire
thing just fails. I didn't use to have this problem, but it's definitely
been getting worse. I finally bit the bullet and ordered a physical server
dedicated for my VoIP System...But I still want to figure out why I'm
having all these IO problems. I read on the list of people running 30+
VMs...I feel that my IO can't take any more VMs with any semblance of
reliability. We have a Quickbooks server on here too (windows), and the
performance is abysmal; my CPA is charging me extra because of all the lost
staff time waiting on the system to respond and generate reports.....
I'm at my wits' end... I started with gluster on SSD with a 1Gbps network,
migrated to 10Gbps network, and now to dedicated high performance NAS box
over NFS, and still have performance issues.....I don't know how to
troubleshoot the issue any further, but I've never had these kinds of
issues when I was playing with other VM technologies. I'd like to get to
the point where I can resell virtual servers to customers, but I can't do
so with my current performance levels.
I'd greatly appreciate help troubleshooting this further.
--Jim
Re: Tuning Gluster Writes
by Strahil
Hi,
What are your dirty cache settings on the gluster servers?
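For reference, a quick sketch of dumping the knobs that question usually refers to, run on each gluster server (just a readout; tune only after seeing what is currently set):

sysctl vm.dirty_ratio vm.dirty_background_ratio \
       vm.dirty_bytes vm.dirty_background_bytes vm.dirty_expire_centisecs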
Best Regards,
Strahil Nikolov
On Apr 13, 2019 00:44, Alex McWhirter <alex(a)triadic.us> wrote:
>
> I have 8 machines acting as gluster servers. They each have 12 drives
> raid 50'd together (3 sets of 4 drives raid 5'd then 0'd together as
> one).
>
> They connect to the compute hosts and to each other over lacp'd 10GB
> connections split across two cisco nexus switched with VPC.
>
> Gluster has the following set.
>
> performance.write-behind-window-size: 4MB
> performance.flush-behind: on
> performance.stat-prefetch: on
> server.event-threads: 4
> client.event-threads: 8
> performance.io-thread-count: 32
> network.ping-timeout: 30
> cluster.granular-entry-heal: enable
> performance.strict-o-direct: on
> storage.owner-gid: 36
> storage.owner-uid: 36
> features.shard: on
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: off
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> auth.allow: *
> user.cifs: off
> transport.address-family: inet
> nfs.disable: off
> performance.client-io-threads: on
>
>
> I have the following sysctl values on gluster client and servers, using
> libgfapi, MTU 9K
>
> net.core.rmem_max = 134217728
> net.core.wmem_max = 134217728
> net.ipv4.tcp_rmem = 4096 87380 134217728
> net.ipv4.tcp_wmem = 4096 65536 134217728
> net.core.netdev_max_backlog = 300000
> net.ipv4.tcp_moderate_rcvbuf =1
> net.ipv4.tcp_no_metrics_save = 1
> net.ipv4.tcp_congestion_control=htcp
>
> reads with this setup are perfect, benchmarked in VM to be about 770MB/s
> sequential with disk access times of < 1ms. Writes on the other hand are
> all over the place. They peak around 320MB/s sequential write, which is
> what I expect, but it seems as if there is some blocking going on.
>
> During the write test I will hit 320MB/s briefly, then 0MB/s as disk
> access times shoot to over 3000ms, then back to 320MB/s. It averages out
> to about 110MB/s afterwards.
>
> Gluster version is 3.12.15 ovirt is 4.2.7.5
>
> Any ideas on what i could tune to eliminate or minimize that blocking?
> _______________________________________________
> Users mailing list -- users(a)ovirt.org
> To unsubscribe send an email to users-leave(a)ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/Z7F72BKYKAG...
oVirt 4.3.2 missing/wrong status of VM
by Strahil Nikolov
As I couldn't find the exact mail thread, I'm attaching my /usr/lib/python2.7/site-packages/vdsm/virt/guestagent.py which fixes the missing/wrong status of VMs.
You will need to restart vdsmd (I'm not sure how safe that is with running guests) for the fix to take effect.
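A rough sketch of applying the attached file (the attachment itself is not included in this digest, and the source path of the patched copy below is a placeholder; presumably the change guards the del qga['appsList'] line shown in the "Errors at VM boot" thread above, so a guest-agent reply without that key no longer raises KeyError):

cp /usr/lib/python2.7/site-packages/vdsm/virt/guestagent.py{,.orig}    # keep a backup
cp /path/to/patched/guestagent.py /usr/lib/python2.7/site-packages/vdsm/virt/guestagent.py
systemctl restart vdsmd    # ideally with the host in maintenance / no running guests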
Best Regards,
Strahil Nikolov
Second host fails to activate (hosted-engine)
by Ricardo Alonso
After installing the second host via the web GUI (4.3.2.1-1.el7), it fails to activate, saying that it wasn't possible to connect to the default storage pool (glusterfs). These are the logs:
vdsm.log
2019-04-09 15:54:07,409-0400 INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:58130 (protocoldetector:61)
2019-04-09 15:54:07,419-0400 INFO (Reactor thread) [ProtocolDetector.Detector] Detected protocol stomp from ::1:58130 (protocoldetector:125)
2019-04-09 15:54:07,419-0400 INFO (Reactor thread) [Broker.StompAdapter] Processing CONNECT request (stompserver:95)
2019-04-09 15:54:07,420-0400 INFO (JsonRpc (StompReactor)) [Broker.StompAdapter] Subscribe command received (stompserver:124)
2019-04-09 15:54:07,461-0400 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.ping2 succeeded in 0.00 seconds (__init__:312)
2019-04-09 15:54:07,466-0400 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.ping2 succeeded in 0.00 seconds (__init__:312)
2019-04-09 15:54:07,469-0400 INFO (jsonrpc/0) [vdsm.api] START getStorageDomainInfo(sdUUID=u'd99fb087-66d5-4adf-9c0c-80e60de17917', options=None) from=::1,58130, task_id=00c843c2-ab43-4813-9ded-29f6742c33b2 (api:48)
2019-04-09 15:54:07,484-0400 INFO (jsonrpc/0) [vdsm.api] FINISH getStorageDomainInfo error='VERSION' from=::1,58130, task_id=00c843c2-ab43-4813-9ded-29f6742c33b2 (api:52)
2019-04-09 15:54:07,484-0400 ERROR (jsonrpc/0) [storage.TaskManager.Task] (Task='00c843c2-ab43-4813-9ded-29f6742c33b2') Unexpected error (task:875)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
return fn(*args, **kargs)
File "<string>", line 2, in getStorageDomainInfo
File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
ret = func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2741, in getStorageDomainInfo
dom = self.validateSdUUID(sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 305, in validateSdUUID
sdDom = sdCache.produce(sdUUID=sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110, in produce
domain.getRealDomain()
File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain
return self._cache._realProduce(self._sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134, in _realProduce
domain = self._findDomain(sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151, in _findDomain
return findMethod(sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/glusterSD.py", line 56, in findDomain
return GlusterStorageDomain(GlusterStorageDomain.findDomainPath(sdUUID))
File "/usr/lib/python2.7/site-packages/vdsm/storage/fileSD.py", line 394, in __init__
manifest = self.manifestClass(domainPath)
File "/usr/lib/python2.7/site-packages/vdsm/storage/fileSD.py", line 179, in __init__
sd.StorageDomainManifest.__init__(self, sdUUID, domaindir, metadata)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 332, in __init__
self._domainLock = self._makeDomainLock()
File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 553, in _makeDomainLock
domVersion = self.getVersion()
File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 424, in getVersion
return self.getMetaParam(DMDK_VERSION)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 421, in getMetaParam
return self._metadata[key]
File "/usr/lib/python2.7/site-packages/vdsm/storage/persistent.py", line 91, in __getitem__
return dec(self._dict[key])
File "/usr/lib/python2.7/site-packages/vdsm/storage/persistent.py", line 202, in __getitem__
return self._metadata[key]
KeyError: 'VERSION'
2019-04-09 15:54:07,484-0400 INFO (jsonrpc/0) [storage.TaskManager.Task] (Task='00c843c2-ab43-4813-9ded-29f6742c33b2') aborting: Task is aborted: u"'VERSION'" - code 100 (task:1181)
2019-04-09 15:54:07,484-0400 ERROR (jsonrpc/0) [storage.Dispatcher] FINISH getStorageDomainInfo error='VERSION' (dispatcher:87)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/storage/dispatcher.py", line 74, in wrapper
result = ctask.prepare(func, *args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 108, in wrapper
return m(self, *a, **kw)
File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1189, in prepare
raise self.error
KeyError: 'VERSION'
2019-04-09 15:54:07,484-0400 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call StorageDomain.getInfo failed (error 350) in 0.01 seconds (__init__:312)
2019-04-09 15:54:07,502-0400 INFO (jsonrpc/3) [vdsm.api] START connectStorageServer(domType=7, spUUID=u'00000000-0000-0000-0000-000000000000', conList=[{u'id': u'e29cf818-5ee5-46e1-85c1-8aeefa33e95d', u'vfs_type': u'glusterfs', u'connection': u'poseidon:/engine', u'user': u'kvm'}], options=None) from=::1,58130, task_id=d71268fe-0088-44e1-99e8-7bcc868b3b2e (api:48)
2019-04-09 15:54:07,521-0400 INFO (jsonrpc/3) [vdsm.api] FINISH connectStorageServer return={'statuslist': [{'status': 0, 'id': u'e29cf818-5ee5-46e1-85c1-8aeefa33e95d'}]} from=::1,58130, task_id=d71268fe-0088-44e1-99e8-7bcc868b3b2e (api:54)
2019-04-09 15:54:07,521-0400 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call StoragePool.connectStorageServer succeeded in 0.02 seconds (__init__:312)
2019-04-09 15:54:07,533-0400 INFO (jsonrpc/4) [vdsm.api] START getStorageDomainStats(sdUUID=u'd99fb087-66d5-4adf-9c0c-80e60de17917', options=None) from=::1,58130, task_id=47c23532-80c9-4186-948b-be9ed264bfbf (api:48)
agent.log
MainThread::INFO::2019-04-09 16:07:23,647::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 2.3.1 started
MainThread::INFO::2019-04-09 16:07:23,699::hosted_engine::244::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: potential.o2pos.com.br
MainThread::INFO::2019-04-09 16:07:23,825::hosted_engine::524::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Initializing ha-broker connection
MainThread::INFO::2019-04-09 16:07:23,827::brokerlink::77::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor ping, options {'addr': '192.168.8.1'}
MainThread::ERROR::2019-04-09 16:07:23,828::hosted_engine::540::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Failed to start necessary monitors
MainThread::ERROR::2019-04-09 16:07:23,828::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
return action(he)
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
return he.start_monitoring()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 413, in start_monitoring
self._initialize_broker()
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 537, in _initialize_broker
m.get('options', {}))
File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 86, in start_monitor
).format(t=type, o=options, e=e)
RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'ping', options: {'addr': '192.168.8.1'}]
MainThread::ERROR::2019-04-09 16:07:23,829::agent::145::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
MainThread::INFO::2019-04-09 16:07:23,829::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
broker.log
MainThread::INFO::2019-04-09 16:08:00,892::broker::47::ovirt_hosted_engine_ha.broker.broker.Broker::(run) ovirt-hosted-engine-ha broker 2.3.1 started
MainThread::INFO::2019-04-09 16:08:00,892::monitor::40::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Searching for submonitors in /usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/submonitors
MainThread::INFO::2019-04-09 16:08:00,893::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2019-04-09 16:08:00,895::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2019-04-09 16:08:00,895::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2019-04-09 16:08:00,896::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2019-04-09 16:08:00,896::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-load
MainThread::INFO::2019-04-09 16:08:00,896::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2019-04-09 16:08:00,897::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor ping
MainThread::INFO::2019-04-09 16:08:00,897::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2019-04-09 16:08:00,897::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load
MainThread::INFO::2019-04-09 16:08:00,898::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor cpu-load-no-engine
MainThread::INFO::2019-04-09 16:08:00,898::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor engine-health
MainThread::INFO::2019-04-09 16:08:00,899::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-free
MainThread::INFO::2019-04-09 16:08:00,899::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mem-load
MainThread::INFO::2019-04-09 16:08:00,899::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor mgmt-bridge
MainThread::INFO::2019-04-09 16:08:00,900::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor ping
MainThread::INFO::2019-04-09 16:08:00,900::monitor::49::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Loaded submonitor storage-domain
MainThread::INFO::2019-04-09 16:08:00,900::monitor::50::ovirt_hosted_engine_ha.broker.monitor.Monitor::(_discover_submonitors) Finished loading submonitors
MainThread::INFO::2019-04-09 16:08:00,957::storage_backends::345::ovirt_hosted_engine_ha.lib.storage_backends::(connect) Connecting the storage
MainThread::INFO::2019-04-09 16:08:00,958::storage_server::349::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2019-04-09 16:08:00,993::storage_server::356::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2019-04-09 16:08:01,025::storage_server::413::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
MainThread::WARNING::2019-04-09 16:08:01,322::storage_broker::97::ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker::(__init__) Can't connect vdsm storage: Command Image.prepare with args {'imageID': 'e525f96e-ffa3-43a8-a368-d473f064944a', 'storagepoolID': '00000000-0000-0000-0000-000000000000', 'volumeID': '12c2075c-4796-4185-b7f3-ed9f366d95ef', 'storagedomainID': 'd99fb087-66d5-4adf-9c0c-80e60de17917'} failed:
(code=100, message='VERSION')
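For illustration, one hedged way to see what the vdsm traceback is complaining about: the KeyError means the storage-domain metadata read from the gluster mount has no VERSION key. The mount path below follows the usual glusterSD layout for the 'poseidon:/engine' volume and is an assumption:

cat /rhev/data-center/mnt/glusterSD/poseidon:_engine/d99fb087-66d5-4adf-9c0c-80e60de17917/dom_md/metadata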