
On Thu, 2016-08-18 at 12:34 +0200, Simone Tiraboschi wrote:
On Thu, Aug 18, 2016 at 12:11 PM, Carlos Rodrigues <cmar@eurotux.com> wrote:
On Thu, 2016-08-18 at 11:53 +0200, Simone Tiraboschi wrote:
On Thu, Aug 18, 2016 at 11:50 AM, Carlos Rodrigues <cmar@eurotux.com> wrote:
On Thu, 2016-08-18 at 11:42 +0200, Simone Tiraboschi wrote:
On Thu, Aug 18, 2016 at 11:25 AM, Carlos Rodrigues <cmar@eurotux.com> wrote:
On Thu, 2016-08-18 at 11:04 +0200, Simone Tiraboschi wrote:
> On Thu, Aug 18, 2016 at 10:36 AM, Carlos Rodrigues <cmar@eurotux.com> wrote:
> > On Thu, 2016-08-18 at 10:27 +0200, Simone Tiraboschi wrote:
> > > On Thu, Aug 18, 2016 at 10:22 AM, Carlos Rodrigues <cmar@eurotux.com> wrote:
> > > > On Thu, 2016-08-18 at 08:54 +0200, Simone Tiraboschi wrote:
> > > > > On Tue, Aug 16, 2016 at 12:53 PM, Carlos Rodrigues <cmar@eurotux.com> wrote:
> > > > > > On Sun, 2016-08-14 at 14:22 +0300, Roy Golan wrote:
> > > > > > > On 12 August 2016 at 20:23, Carlos Rodrigues <cmar@eurotux.com> wrote:
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > I have one cluster with two hosts with power management
> > > > > > > > correctly configured and one virtual machine with
> > > > > > > > HostedEngine over shared storage with Fibre Channel.
> > > > > > > >
> > > > > > > > When I shut down the network of the host running the
> > > > > > > > HostedEngine VM, should the HostedEngine VM migrate
> > > > > > > > automatically to another host?
> > > > > > >
> > > > > > > migrate on which network?
> > > > > > >
> > > > > > > > What is the expected behaviour on this HA scenario?
> > > > > > >
> > > > > > > After a few minutes your VM will be shut down by the High
> > > > > > > Availability agent, as it can't see the network, and started
> > > > > > > on another host.
> > > > > >
> > > > > > I'm testing this scenario. After shutting down the network, I
> > > > > > expected the agent to shut down the HA VM and start it on
> > > > > > another host, but after a couple of minutes nothing happens,
> > > > > > and on the host that still has network we get the following
> > > > > > messages:
> > > > > >
> > > > > > Aug 16 11:44:08 ied-blade11.install.eurotux.local ovirt-ha-agent[2779]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf
> > > > > >
> > > > > > I think the HA agent is trying to get the VM configuration
> > > > > > but somehow it can't get vm.conf to start the VM.
> > > > >
> > > > > No, this is a different issue.
> > > > > In 3.6 we added a feature to let the engine also manage the
> > > > > engine VM itself; ovirt-ha-agent will pick up the latest engine
> > > > > VM configuration from the OVF_STORE, which is managed by the
> > > > > engine.
> > > > > If something goes wrong, ovirt-ha-agent can fall back to the
> > > > > initial (bootstrap time) vm.conf. This will normally happen
> > > > > until you add your first regular storage domain and the engine
> > > > > imports the engine VM.
> > > >
> > > > But I already have my first storage domain and the engine
> > > > storage domain, and I have already imported the engine VM.
> > > >
> > > > I'm using version 4.0.
> > >
> > > This seems to be an issue; can you please share your
> > > /var/log/ovirt-hosted-engine-ha/agent.log ?
> >
> > I sent it as an attachment.
>
> Nothing strange here;
> do you see a couple of disks with alias OVF_STORE on the
> hosted-engine storage domain if you check it from the engine?
Do you mean a disk label? I don't have any:
[root@ied-blade11 ~]# ls /dev/disk/by-label/
ls: cannot access /dev/disk/by-label/: No such file or directory
No, I mean: go to the engine web UI, select the hosted-engine storage domain, and check the disks there.
No, the alias is virtio-disk0.
And that is the engine VM disk, so the question is why the engine has not yet created the OVF_STORE. Can you please share your engine.log from the engine VM?
It's attached.
The creation of the OVF_STORE disk failed, but it's not clear why:
2016-08-17 08:43:33,538 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.CreateOvfVolumeForStorageDomainCommand] (DefaultQuartzScheduler6) [6f1f1fd4] Ending command 'org.ovirt.engine.core.bll.storage.ovfstore.CreateOvfVolumeForStorageDomainCommand' with failure.
2016-08-17 08:43:33,540 ERROR [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (DefaultQuartzScheduler6) [6f1f1fd4] Ending command 'org.ovirt.engine.core.bll.storage.disk.AddDiskCommand' with failure.
2016-08-17 08:43:33,541 WARN [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (DefaultQuartzScheduler6) [6f1f1fd4] VmCommand::EndVmCommand: Vm is null - not performing endAction on Vm
2016-08-17 08:43:33,553 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [6f1f1fd4] Correlation ID: 6f1f1fd4, Call Stack: null, Custom Event ID: -1, Message: Add-Disk operation failed to complete.
2016-08-17 08:43:33,557 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [] Correlation ID: 19ac5bda, Call Stack: null, Custom Event ID: -1, Message: Failed to create OVF store disk for Storage Domain hosted_storage. OVF data won't be updated meanwhile for that domain.
2016-08-17 08:43:33,585 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler6) [5f5a8daf] Command 'ProcessOvfUpdateForStorageDomain' (id: '71aaaafe-7b9e-45e8-a40c-6d33bdf646a0') waiting on child command id: 'eb2e6f1a-c756-4ccd-85a1-60d97d6880de' type:'CreateOvfVolumeForStorageDomain' to complete
2016-08-17 08:43:33,595 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.CreateOvfVolumeForStorageDomainCommand] (DefaultQuartzScheduler6) [5d314e49] Ending command 'org.ovirt.engine.core.bll.storage.ovfstore.CreateOvfVolumeForStorageDomainCommand' with failure.
2016-08-17 08:43:33,596 ERROR [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (DefaultQuartzScheduler6) [5d314e49] Ending command 'org.ovirt.engine.core.bll.storage.disk.AddDiskCommand' with failure.
2016-08-17 08:43:33,596 WARN [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (DefaultQuartzScheduler6) [5d314e49] VmCommand::EndVmCommand: Vm is null - not performing endAction on Vm
2016-08-17 08:43:33,602 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [5d314e49] Correlation ID: 5d314e49, Call Stack: null, Custom Event ID: -1, Message: Add-Disk operation failed to complete.
2016-08-17 08:43:33,605 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [] Correlation ID: 5f5a8daf, Call Stack: null, Custom Event ID: -1, Message: Failed to create OVF store disk for Storage Domain hosted_storage. OVF data won't be updated meanwhile for that domain.
2016-08-17 08:43:36,460 INFO [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] (DefaultQuartzScheduler7) [5d314e49] HA reservation status for cluster 'Default' is 'OK'
2016-08-17 08:43:36,662 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler4) [5f5a8daf] Command 'ProcessOvfUpdateForStorageDomain' id: '71aaaafe-7b9e-45e8-a40c-6d33bdf646a0' child commands '[84959a4b-6a10-4d22-b37e-6c154e17a0da, eb2e6f1a-c756-4ccd-85a1-60d97d6880de]' executions were completed, status 'FAILED'
2016-08-17 08:43:37,691 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (DefaultQuartzScheduler6) [5f5a8daf] Ending command 'org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand' with failure.
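With several commands failing in the same second, it can help to group the ERROR and WARN entries by the correlation ID in the second pair of square brackets. The following is only a minimal sketch, assuming engine.log entries follow the layout of the excerpt above (timestamp, level, logger in brackets, thread in parentheses, correlation ID in brackets); the function name and the sample list are made up for illustration and are not part of oVirt.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the excerpt in this thread:
# 2016-08-17 08:43:33,538 ERROR [logger] (thread) [correlation-id] message
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<corr>[^\]]*)\]\s*"
    r"(?P<msg>.*)$"
)


def failures_by_correlation(lines, levels=("ERROR", "WARN")):
    """Group entries of the given levels by correlation ID.

    Entries whose correlation bracket is empty ("[]") are collected
    under the key "(none)". Lines that do not match the assumed layout
    (for example stack-trace continuation lines) are skipped.
    """
    grouped = defaultdict(list)
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match or match.group("level") not in levels:
            continue
        # Keep only the class name, e.g. "AddDiskCommand".
        short_logger = match.group("logger").strip().rsplit(".", 1)[-1]
        key = match.group("corr") or "(none)"
        grouped[key].append(
            (match.group("ts"), match.group("level"), short_logger, match.group("msg"))
        )
    return dict(grouped)


if __name__ == "__main__":
    # Two entries copied from the excerpt above, used as sample input.
    SAMPLE = [
        "2016-08-17 08:43:33,540 ERROR [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] "
        "(DefaultQuartzScheduler6) [6f1f1fd4] Ending command "
        "'org.ovirt.engine.core.bll.storage.disk.AddDiskCommand' with failure.",
        "2016-08-17 08:43:36,460 INFO [org.ovirt.engine.core.bll.scheduling.HaReservationHandling] "
        "(DefaultQuartzScheduler7) [5d314e49] HA reservation status for cluster 'Default' is 'OK'",
    ]
    for corr, entries in failures_by_correlation(SAMPLE).items():
        print(corr)
        for ts, level, logger, msg in entries:
            print(f"  {ts} {level:5} {logger}: {msg}")
```

To run it against a real file, pass an open file object instead of the sample list. Note that some audit-log entries carry an empty bracket and repeat the correlation ID inside the message text, so the "(none)" group is still worth reading.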
Can you please check vdsm logs for that time frame on the SPM host?
I have attached the vdsm logs from both hosts, but I think the SPM host during this time frame was ied-blade13.
It seems that you also have an issue in the SPM election procedure:
2016-08-17 18:04:31,053 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (DefaultQuartzScheduler1) [] SPM Init: could not find reported vds or not up - pool: 'Default' vds_spm_id: '2'
2016-08-17 18:04:31,076 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (DefaultQuartzScheduler1) [] SPM selection - vds seems as spm 'hosted_engine_2'
2016-08-17 18:04:31,076 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] (DefaultQuartzScheduler1) [] spm vds is non responsive, stopping spm selection.
2016-08-17 18:04:31,539 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmsStatisticsFetcher] (DefaultQuartzScheduler7) [] Fetched 1 VMs from VDS '06372186-572c-41ad-916f-7cbb0aba5302'
probably due to:

2016-08-17 18:02:33,569 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (DefaultQuartzScheduler6) [] Failure to refresh Vds runtime info: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues
2016-08-17 18:02:33,569 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (DefaultQuartzScheduler6) [] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues
These messages may have been caused by connection issues yesterday at 6 pm.
Can you please check whether the engine VM can correctly resolve and reach each host?
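This check runs from the engine VM towards the hosts, and it covers both name resolution and connectivity. A minimal sketch of such a check follows; it assumes the hosts accept TCP connections on port 54321 (the usual VDSM port, adjust it if your setup differs), and the function name and the commented host list are illustrative only.

```python
import socket

VDSM_PORT = 54321  # assumption: default VDSM port; change it if your hosts differ


def check_host(name, port=VDSM_PORT, timeout=3.0):
    """Resolve `name` and try a TCP connection to `port`.

    Returns a tuple (ip_or_None, reachable, detail):
    - ip_or_None is the first resolved address, or None if resolution failed
    - reachable is True only if the TCP connection was established
    - detail is a short human-readable explanation
    """
    try:
        infos = socket.getaddrinfo(name, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return None, False, f"resolution failed: {exc}"

    family, socktype, proto, _canonname, sockaddr = infos[0]
    ip = sockaddr[0]
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
    except OSError as exc:
        return ip, False, f"connect failed: {exc}"
    return ip, True, "ok"


# Illustrative usage, to be run on the engine VM (host names as they
# appear in the shell prompts in this thread):
#
# for host in ("ied-blade11", "ied-blade13"):
#     print(host, check_host(host))
```

A successful ping in the opposite direction (host to engine) does not prove that the engine can resolve the host names, which is why the sketch reports resolution and connection separately.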
Now I can reach the engine VM from both hosts:

[root@ied-blade11 ~]# ping ied-hosted-engine
PING ied-hosted-engine.install.eurotux.local (10.10.4.115) 56(84) bytes of data.
64 bytes from ied-hosted-engine.install.eurotux.local (10.10.4.115): icmp_seq=1 ttl=64 time=0.179 ms
64 bytes from ied-hosted-engine.install.eurotux.local (10.10.4.115): icmp_seq=2 ttl=64 time=0.141 ms
^C
--- ied-hosted-engine.install.eurotux.local ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.141/0.160/0.179/0.019 ms

[root@ied-blade13 ~]# ping ied-hosted-engine
PING ied-hosted-engine.install.eurotux.local (10.10.4.115) 56(84) bytes of data.
64 bytes from ied-hosted-engine.install.eurotux.local (10.10.4.115): icmp_seq=1 ttl=64 time=0.172 ms
64 bytes from ied-hosted-engine.install.eurotux.local (10.10.4.115): icmp_seq=2 ttl=64 time=0.169 ms
^C
--- ied-hosted-engine.install.eurotux.local ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.169/0.170/0.172/0.013 ms

I also get a critical low disk space message on the hosted_storage domain. I have 50G of disk and I created a VM with 40G. Do I need more space for the OVF_STORE? What are the minimum disk space requirements for deploying the engine VM?

Regards,
Carlos
-- 
Carlos Rodrigues
Senior Software Engineer
Eurotux Informática, S.A. | www.eurotux.com
(t) +351 253 680 300 (m) +351 911 926 110

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users