Re: [ovirt-users] Migrate machines in unknown state?

30 Sep 2016

Hi Yaniv,

Just a reminder, can you give us a pointer? Red Hat Support just asked us
to disable PM before restarting vdsm again.

Thanks & Best regards,

On Mon, Aug 22, 2016 at 10:57 PM, Ekin Meroğlu <ekin.meroglu@linuxera.com>
wrote:
...
Hi Yaniv,
On Sun, Aug 7, 2016 at 9:37 PM, Ekin Meroğlu <ekin.meroglu@linuxera.com> wrote:
...
Hi,
Just a reminder, if you have power management configured, first turn
that off for the host - when you restart vdsmd with the power management
configured, engine finds it not responding and tries to fence (e.g. reboot)
the host.
That's not true - if it's a graceful restart, it should not happen.
Can you explain this a little more? Is there a mechanism to prevent
fencing on this scenario?
In two of our customers' production systems we've experienced this exact
behavior (i.e. engine fencing the host while restarting the vdsm service
manually) a number of times, and we were specifically advised by Red
Hat Support to turn off PM before restarting the service. I'd like to know
if we have a better / easier way to restart vdsm.
btw, both of the environments were RHEV-H based RHEV 3.5 clusters, and both
were busy systems, so restarting the vdsm service took quite a long time. I'm
guessing this might be a factor.
Regards,
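
The precaution discussed above (turn power management off, restart vdsmd, turn it back on) could be sketched in Python roughly as follows. This is only an illustration under assumptions: set_pm_enabled and restart_vdsmd are hypothetical callbacks supplied by the caller, not oVirt or vdsm APIs, and the sketch assumes power management was enabled before the restart. Whether the step is needed at all for a graceful restart is disputed in this thread.

```python
from contextlib import contextmanager


@contextmanager
def power_management_disabled(host, set_pm_enabled):
    """Temporarily disable power management (fencing) for `host`.

    `set_pm_enabled(host, enabled)` is a hypothetical callback that would
    toggle PM in the engine; it is not a real oVirt function.
    """
    set_pm_enabled(host, False)
    try:
        yield
    finally:
        # Turn fencing back on even if the restart step raised.
        set_pm_enabled(host, True)


def restart_vdsm_safely(host, set_pm_enabled, restart_vdsmd):
    """Run the hypothetical `restart_vdsmd(host)` with PM switched off."""
    with power_management_disabled(host, set_pm_enabled):
        restart_vdsmd(host)
```

The try/finally is the point of the sketch: if the restart fails or takes long, the host is not left without fencing afterwards.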

...
...
...
Other than that, restarting vdsmd has been safe in my experience...
Regards,
On Thu, Aug 4, 2016 at 6:10 PM, Nicolás <nicolas@devels.es> wrote:
...
On 04/08/16 at 15:25, Arik Hadas wrote:
...
----- Original Message -----
...
On 2016-08-04 08:24, Arik Hadas wrote:
> ----- Original Message -----
>
>>
>> On 04/08/16 at 07:18, Arik Hadas wrote:
>>
>>> ----- Original Message -----
>>>
>>>> Hi,
>>>>
>>>> We're running oVirt 4.0.1 and today I found out that one of our hosts
>>>> has all its VMs in an unknown state. I actually don't know how (or
>>>> when) this happened, but I'd like to restore service, if possible
>>>> without turning off these machines. The host is up, the VMs are up, the
>>>> 'qemu' processes exist, no errors, it's just the VMs running on it that
>>>> have a '?' where the status is shown.
>>>>
>>>> Is it safe in this case to simply modify the database and set those VMs'
>>>> status to 'up'? I remember having to do this some time ago when we faced
>>>> storage issues, and it didn't break anything back then. If not, is there a
>>>> "safe" way to migrate those VMs to a different host and restart the host
>>>> that marked them as unknown?
>>>>
>>> Hi Nicolás,
>>>
>>> I assume that the host these VMs are running on is empty in the webadmin,
>>> right? If that is the case then you've probably hit [1]. Changing their
>>> status to up is not the way to go since these VMs will not be monitored.
>>>
>> Hi Arik,
>>
>> By "empty" you mean the webadmin reports the host as running 0 VMs?
>> If so, that's not the case; the VM count actually seems to be correct in
>> relation to the "qemu-*" processes (about 32 VMs). I can even see the
>> machines in the "Virtual machines" tab of the host, it's just that they are
>> all marked with the '?' mark.
>>
> No, I meant the 'Host' column in the Virtual Machines tab, but if you see
> the VMs in the "Virtual machines" sub-tab of the host then run_on_vds
> points to the right host.
>
> The host is up in the webadmin as well?
> Can you share the engine log?
>
Yes, the host is up in the webadmin, there are no issues with it, just
the VMs running on it have the '?' mark. I've made 3 tests:
1) Restart engine: did not help
2) Check firewall, seems to be ok.
3) PostgreSQL: UPDATE vm_dynamic SET status = 1 WHERE status = 8;
After a while, I see lots of entries like this:
2016-08-04 09:23:10,910 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler4) [6ad135b8] Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VM xxx is not responding.
I'm attaching the engine log, but I don't know when this happened for
the first time. If there's a manual way/command to migrate VMs
to a different host I'd appreciate a hint about it.
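
Since it is unclear when the VMs first went unresponsive, one way to find out would be to scan engine.log for the first "is not responding" warning per VM. The following is a minimal Python sketch, assuming each log entry sits on a single line in the format quoted above; the function name is made up for illustration.

```python
import re

# Matches WARN entries such as:
# 2016-08-04 09:23:10,910 WARN [...] ... Message: VM xxx is not responding.
NOT_RESPONDING = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+WARN\s+.*?"
    r"Message: VM (?P<vm>\S+) is not responding\."
)


def first_not_responding(lines):
    """Return {vm_name: timestamp of the first matching warning}."""
    first_seen = {}
    for line in lines:
        match = NOT_RESPONDING.search(line)
        if match:
            # setdefault keeps the earliest timestamp seen for each VM.
            first_seen.setdefault(match.group("vm"), match.group("ts"))
    return first_seen
```

It could be fed an open file object, for example first_not_responding(open(path)) with path pointing at the engine log; the log location is not given in this thread, so it is left to the reader.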
Is it safe to restart vdsmd on this host?
The engine log looks fine - the VMs are reported as not-responding for
some reason. I would restart libvirtd and vdsmd then
Is restarting those two daemons safe? I mean, will that stop all qemu-*
processes, so the VMs marked as unknown will stop?
Thanks.
...
...
Thanks.
>>
>>> Yes, there is no other way to resolve it other than changing the DB, but
>>> the change should be to update the run_on_vds field of these VMs to the
>>> host you know they are running on. Their status will then be updated in
>>> 15 sec.
>>>
>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1354494
>>>
>>> Arik.
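
For illustration, the run_on_vds change Arik describes could be prepared as a parameterized statement rather than typed by hand. This Python sketch only builds the SQL text and parameter list and does not touch the database. The vm_dynamic table and run_on_vds field come from this thread; the vm_guid key column and the helper name are assumptions, and the %s placeholders follow the psycopg2 convention. Back up the engine database before running anything like this.

```python
def build_run_on_vds_update(vm_ids, host_id):
    """Build (sql, params) pointing the given VMs at `host_id`.

    Assumes vm_dynamic is keyed by a vm_guid column (not confirmed in
    the thread). Nothing is executed here.
    """
    if not vm_ids:
        raise ValueError("no VM ids given")
    # One placeholder per VM id, so values are never pasted into the SQL.
    placeholders = ", ".join(["%s"] * len(vm_ids))
    sql = (
        "UPDATE vm_dynamic SET run_on_vds = %s "
        f"WHERE vm_guid IN ({placeholders})"
    )
    return sql, [host_id, *vm_ids]
```

The returned pair could then be passed to a DB-API cursor's execute(sql, params) inside a transaction, so the change can be reviewed or rolled back.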
>>>
>>> Thanks.
>>>>
>>>> Nicolás
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users@ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>>
>>

--
*Ekin Meroğlu** Red Hat Certified Architect*
linuxera Özgür Yazılım Çözüm ve Hizmetleri
*T* +90 (850) 22 LINUX | *GSM* +90 (532) 137 77 04
www.linuxera.com | bilgi@linuxera.com