On Thu, Aug 18, 2016 at 12:11 PM, Carlos Rodrigues <cmar(a)eurotux.com> wrote:
On Thu, 2016-08-18 at 11:53 +0200, Simone Tiraboschi wrote:
>
>
> On Thu, Aug 18, 2016 at 11:50 AM, Carlos Rodrigues <cmar(a)eurotux.com>
> wrote:
> > On Thu, 2016-08-18 at 11:42 +0200, Simone Tiraboschi wrote:
> > > On Thu, Aug 18, 2016 at 11:25 AM, Carlos Rodrigues <cmar@eurotux.com> wrote:
> > > >
> > > > On Thu, 2016-08-18 at 11:04 +0200, Simone Tiraboschi wrote:
> > > > >
> > > > > On Thu, Aug 18, 2016 at 10:36 AM, Carlos Rodrigues <cmar@eurotux.com> wrote:
> > > > > >
> > > > > >
> > > > > > On Thu, 2016-08-18 at 10:27 +0200, Simone Tiraboschi wrote:
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Aug 18, 2016 at 10:22 AM, Carlos Rodrigues <cmar@eurotux.com> wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, 2016-08-18 at 08:54 +0200, Simone Tiraboschi wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, Aug 16, 2016 at 12:53 PM, Carlos Rodrigues <cmar@eurotux.com> wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Sun, 2016-08-14 at 14:22 +0300, Roy Golan wrote:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On 12 August 2016 at 20:23, Carlos Rodrigues <cmar@eurotux.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Hello,
> > > > > > > > > > > >
> > > > > > > > > > > > I have one cluster with two hosts, with power management
> > > > > > > > > > > > correctly configured, and one HostedEngine virtual machine
> > > > > > > > > > > > on shared Fibre Channel storage.
> > > > > > > > > > > >
> > > > > > > > > > > > When I shut down the network of the host running the
> > > > > > > > > > > > HostedEngine VM, should the HostedEngine VM migrate
> > > > > > > > > > > > automatically to another host?
> > > > > > > > > > > >
> > > > > > > > > > > migrate on which network?
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > What is the expected behaviour in this HA scenario?
> > > > > > > > > > >
> > > > > > > > > > > After a few minutes your VM will be shut down by the High
> > > > > > > > > > > Availability agent, as it can't see the network, and started
> > > > > > > > > > > on another host.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I'm testing this scenario. After shutting down the network, I
> > > > > > > > > > expected the agent to shut down the VM and start it on another
> > > > > > > > > > host, but after a couple of minutes nothing happens, and on the
> > > > > > > > > > host that still has network we are getting the following
> > > > > > > > > > messages:
> > > > > > > > > >
> > > > > > > > > > Aug 16 11:44:08 ied-blade11.install.eurotux.local ovirt-ha-agent[2779]:
> > > > > > > > > > ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR
> > > > > > > > > > Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf
> > > > > > > > > >
> > > > > > > > > > I think the HA agent is trying to get the VM configuration, but
> > > > > > > > > > somehow it can't get vm.conf to start the VM.
> > > > > > > > >
> > > > > > > > > No, this is a different issue.
> > > > > > > > > In 3.6 we added a feature to let the engine also manage the
> > > > > > > > > engine VM itself; ovirt-ha-agent will pick up the latest engine
> > > > > > > > > VM configuration from the OVF_STORE, which is managed by the
> > > > > > > > > engine.
> > > > > > > > > If something goes wrong, ovirt-ha-agent can fall back to the
> > > > > > > > > initial (bootstrap-time) vm.conf. This will normally happen
> > > > > > > > > until you add your first regular storage domain and the engine
> > > > > > > > > imports the engine VM.
> > > > > > > >
> > > > > > > > But I already have my first storage domain and the engine
> > > > > > > > storage domain, and I have already imported the engine VM.
> > > > > > > >
> > > > > > > > I'm using version 4.0.
> > > > > > >
> > > > > > > This seems to be an issue; can you please share your
> > > > > > > /var/log/ovirt-hosted-engine-ha/agent.log?
> > > > > > >
> > > > > >
> > > > > > I have sent it as an attachment.
> > > > >
> > > > > Nothing strange here;
> > > > > do you see a couple of disks with the alias OVF_STORE on the
> > > > > hosted-engine storage domain if you check it from the engine?
> > > > >
> > > >
> > > > Do you mean a disk label?
> > > > I don't have any:
> > > >
> > > > [root@ied-blade11 ~]# ls /dev/disk/by-label/
> > > > ls: cannot access /dev/disk/by-label/: No such file or directory
> > >
> > > No, I mean: go to the engine web UI, select the hosted-engine
> > > storage domain, and check the disks there.
> >
> > No, the alias is virtio-disk0.
> >
>
> And this is the engine VM disk, so the issue is why the engine has
> still not created the OVF_STORE.
> Can you please share your engine.log from the engine VM?
>
It is attached.
The creation of the OVF_STORE disk failed but it's not that clear why:
2016-08-17 08:43:33,538 ERROR
[org.ovirt.engine.core.bll.storage.ovfstore.CreateOvfVolumeForStorageDomainCommand]
(DefaultQuartzScheduler6) [6f1f1fd4] Ending command
'org.ovirt.engine.core.bll.storage.ovfstore.CreateOvfVolumeForStorageDomainCommand'
with failure.
2016-08-17 08:43:33,540 ERROR
[org.ovirt.engine.core.bll.storage.disk.AddDiskCommand]
(DefaultQuartzScheduler6) [6f1f1fd4] Ending command
'org.ovirt.engine.core.bll.storage.disk.AddDiskCommand' with failure.
2016-08-17 08:43:33,541 WARN
[org.ovirt.engine.core.bll.storage.disk.AddDiskCommand]
(DefaultQuartzScheduler6) [6f1f1fd4] VmCommand::EndVmCommand: Vm is
null - not performing endAction on Vm
2016-08-17 08:43:33,553 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler6) [6f1f1fd4] Correlation ID: 6f1f1fd4, Call
Stack: null, Custom Event ID: -1, Message: Add-Disk operation failed
to complete.
2016-08-17 08:43:33,557 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler6) [] Correlation ID: 19ac5bda, Call Stack:
null, Custom Event ID: -1, Message: Failed to create OVF store disk
for Storage Domain hosted_storage.
OVF data won't be updated meanwhile for that domain.
2016-08-17 08:43:33,585 INFO
[org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback]
(DefaultQuartzScheduler6) [5f5a8daf] Command
'ProcessOvfUpdateForStorageDomain' (id:
'71aaaafe-7b9e-45e8-a40c-6d33bdf646a0') waiting on child command id:
'eb2e6f1a-c756-4ccd-85a1-60d97d6880de'
type:'CreateOvfVolumeForStorageDomain' to complete
2016-08-17 08:43:33,595 ERROR
[org.ovirt.engine.core.bll.storage.ovfstore.CreateOvfVolumeForStorageDomainCommand]
(DefaultQuartzScheduler6) [5d314e49] Ending command
'org.ovirt.engine.core.bll.storage.ovfstore.CreateOvfVolumeForStorageDomainCommand'
with failure.
2016-08-17 08:43:33,596 ERROR
[org.ovirt.engine.core.bll.storage.disk.AddDiskCommand]
(DefaultQuartzScheduler6) [5d314e49] Ending command
'org.ovirt.engine.core.bll.storage.disk.AddDiskCommand' with failure.
2016-08-17 08:43:33,596 WARN
[org.ovirt.engine.core.bll.storage.disk.AddDiskCommand]
(DefaultQuartzScheduler6) [5d314e49] VmCommand::EndVmCommand: Vm is
null - not performing endAction on Vm
2016-08-17 08:43:33,602 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler6) [5d314e49] Correlation ID: 5d314e49, Call
Stack: null, Custom Event ID: -1, Message: Add-Disk operation failed
to complete.
2016-08-17 08:43:33,605 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler6) [] Correlation ID: 5f5a8daf, Call Stack:
null, Custom Event ID: -1, Message: Failed to create OVF store disk
for Storage Domain hosted_storage.
OVF data won't be updated meanwhile for that domain.
2016-08-17 08:43:36,460 INFO
[org.ovirt.engine.core.bll.scheduling.HaReservationHandling]
(DefaultQuartzScheduler7) [5d314e49] HA reservation status for cluster
'Default' is 'OK'
2016-08-17 08:43:36,662 INFO
[org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback]
(DefaultQuartzScheduler4) [5f5a8daf] Command
'ProcessOvfUpdateForStorageDomain' id:
'71aaaafe-7b9e-45e8-a40c-6d33bdf646a0' child commands
'[84959a4b-6a10-4d22-b37e-6c154e17a0da,
eb2e6f1a-c756-4ccd-85a1-60d97d6880de]' executions were completed,
status 'FAILED'
2016-08-17 08:43:37,691 ERROR
[org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand]
(DefaultQuartzScheduler6) [5f5a8daf] Ending command
'org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand'
with failure.
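To see which of these failures belong together, the entries can be grouped by the correlation ID in square brackets. This is only a minimal sketch in Python: it assumes each entry sits on a single line in engine.log (the excerpt above is wrapped by the mail client) and that the field order matches the excerpt. The names ENTRY and errors_by_correlation are made up for this example.

```python
import re
from collections import defaultdict

# Assumed layout of one engine.log entry, inferred from the excerpt above:
#   <timestamp> <LEVEL> [<logger>] (<thread>) [<correlation id>] <message>
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<corr>[^\]]*)\]\s+"
    r"(?P<msg>.*)$"
)


def errors_by_correlation(lines):
    """Map correlation ID -> list of (timestamp, short logger name) for ERROR entries."""
    grouped = defaultdict(list)
    for line in lines:
        match = ENTRY.match(line.rstrip("\n"))
        if match and match.group("level") == "ERROR":
            short_logger = match.group("logger").rsplit(".", 1)[-1]
            grouped[match.group("corr")].append((match.group("ts"), short_logger))
    return dict(grouped)


# Three entries copied from the excerpt above, unwrapped to one line each.
sample = [
    "2016-08-17 08:43:33,538 ERROR "
    "[org.ovirt.engine.core.bll.storage.ovfstore.CreateOvfVolumeForStorageDomainCommand] "
    "(DefaultQuartzScheduler6) [6f1f1fd4] Ending command "
    "'org.ovirt.engine.core.bll.storage.ovfstore.CreateOvfVolumeForStorageDomainCommand' "
    "with failure.",
    "2016-08-17 08:43:33,540 ERROR "
    "[org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] "
    "(DefaultQuartzScheduler6) [6f1f1fd4] Ending command "
    "'org.ovirt.engine.core.bll.storage.disk.AddDiskCommand' with failure.",
    "2016-08-17 08:43:33,541 WARN "
    "[org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] "
    "(DefaultQuartzScheduler6) [6f1f1fd4] VmCommand::EndVmCommand: Vm is "
    "null - not performing endAction on Vm",
]

for corr_id, hits in errors_by_correlation(sample).items():
    print(corr_id, hits)
```

Entries logged with an empty correlation ID ("[]") end up under the empty string, so they stay visible instead of being dropped.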
Can you please check vdsm logs for that time frame on the SPM host?
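If the vdsm log is large, something like the following could cut it down to the relevant window. It is a rough sketch, not a vdsm tool: it assumes vdsm.log carries timestamps in the same "YYYY-MM-DD HH:MM:SS,mmm" shape as engine.log, and lines_near is a name invented here.

```python
import re
from datetime import datetime, timedelta

# Timestamp shape used in the engine.log excerpt; assumed to appear in vdsm.log too.
STAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def lines_near(lines, when, margin_seconds=60):
    """Yield lines stamped within margin_seconds of `when`.

    Lines without a timestamp (tracebacks, wrapped messages) follow the
    decision made for the last stamped line before them.
    """
    center = datetime.strptime(when, STAMP_FORMAT)
    margin = timedelta(seconds=margin_seconds)
    keep = False
    for line in lines:
        found = STAMP.search(line)
        if found:
            stamp = datetime.strptime(found.group(0), STAMP_FORMAT)
            keep = abs(stamp - center) <= margin
        if keep:
            yield line


# Possible use on the SPM host (the path is the usual default, adjust if needed):
#   with open("/var/log/vdsm/vdsm.log") as log:
#       for line in lines_near(log, "2016-08-17 08:43:33,538"):
#           print(line, end="")
```

The 60 second margin is arbitrary; widen it if the failing storage task started well before the engine reported the error.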
It seems that you also have an issue in the SPM election procedure:
2016-08-17 18:04:31,053 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData]
(DefaultQuartzScheduler1) [] SPM Init: could not find reported vds or
not up - pool: 'Default' vds_spm_id: '2'
2016-08-17 18:04:31,076 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData]
(DefaultQuartzScheduler1) [] SPM selection - vds seems as spm
'hosted_engine_2'
2016-08-17 18:04:31,076 WARN
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData]
(DefaultQuartzScheduler1) [] spm vds is non responsive, stopping spm
selection.
2016-08-17 18:04:31,539 INFO
[org.ovirt.engine.core.vdsbroker.monitoring.VmsStatisticsFetcher]
(DefaultQuartzScheduler7) [] Fetched 1 VMs from VDS
'06372186-572c-41ad-916f-7cbb0aba5302'
probably due to:
2016-08-17 18:02:33,569 ERROR
[org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
(DefaultQuartzScheduler6) [] Failure to refresh Vds runtime info:
VDSGenericException: VDSNetworkException: Message timeout which can be
caused by communication issues
2016-08-17 18:02:33,569 ERROR
[org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
(DefaultQuartzScheduler6) [] Exception:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
VDSGenericException: VDSNetworkException: Message timeout which can be
caused by communication issues
Can you please check whether the engine VM can correctly resolve and
reach each host?
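A quick way to test both things from the engine VM is a small Python script like the one below. Treat it as a sketch under assumptions: it takes 54321 as the VDSM port (the usual default, please verify it for your setup), check_host is a helper name made up for this mail, and a successful TCP connect only shows that the port is reachable, not that VDSM answers correctly.

```python
import socket

VDSM_PORT = 54321  # assumption: default VDSM management port


def check_host(name, port=VDSM_PORT, timeout=5.0):
    """Resolve `name`, then try a plain TCP connection to `port`."""
    result = {"host": name, "addresses": [], "reachable": False, "error": None}
    try:
        infos = socket.getaddrinfo(name, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["error"] = f"resolution failed: {exc}"
        return result
    # Each entry ends with a sockaddr tuple whose first item is the IP address.
    result["addresses"] = sorted({info[4][0] for info in infos})
    try:
        with socket.create_connection((name, port), timeout=timeout):
            result["reachable"] = True
    except OSError as exc:
        result["error"] = f"connection failed: {exc}"
    return result
```

Running check_host("ied-blade11.install.eurotux.local") and the same call for the second host, both from the engine VM, should show whether the failure is in name resolution or in the connection itself.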
> > > > > > > > > > Regards,
> > > > > > > > > > Carlos Rodrigues
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Regards,
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Carlos Rodrigues
> > > > > > > > > > > >
> > > > > > > > > > > > Engenheiro de Software Sénior
> > > > > > > > > > > >
> > > > > > > > > > > > Eurotux Informática, S.A. | www.eurotux.com
> > > > > > > > > > > > (t) +351 253 680 300 (m) +351 911 926 110
> > > > > > > > > > > >
> > > > > > > > > > > > _______________________________________________
> > > > > > > > > > > > Users mailing list
> > > > > > > > > > > > Users(a)ovirt.org
> > > > > > > > > > > > http://lists.ovirt.org/mailman/listinfo/users
> > > > > > > > > > > >
> > > > > > > > > > >
>
--
Carlos Rodrigues
Engenheiro de Software Sénior
Eurotux Informática, S.A. | www.eurotux.com
(t) +351 253 680 300 (m) +351 911 926 110