On 04/08/16 at 15:25, Arik Hadas wrote:
----- Original Message -----
> On 2016-08-04 08:24, Arik Hadas wrote:
>> ----- Original Message -----
>>>
>>> On 04/08/16 at 07:18, Arik Hadas wrote:
>>>> ----- Original Message -----
>>>>> Hi,
>>>>>
>>>>> We're running oVirt 4.0.1 and today I found out that one of our hosts
>>>>> has all its VMs in an unknown state. I actually don't know how (or
>>>>> when) this happened, but I'd like to restore service, if possible
>>>>> without turning off these machines. The host is up, the VMs are up, the
>>>>> 'qemu' processes exist, and there are no errors; it's just that the VMs
>>>>> running on it have a '?' where the status is shown.
>>>>>
>>>>> Is it safe in this case to simply modify the database and set those VMs'
>>>>> status to 'up'? I remember having to do this some time ago when we faced
>>>>> storage issues, and it didn't break anything back then. If not, is there
>>>>> a "safe" way to migrate those VMs to a different host and restart the
>>>>> host that marked them as unknown?
>>>> Hi Nicolás,
>>>>
>>>> I assume that the host these VMs are running on is empty in the
>>>> webadmin, right? If that is the case then you've probably hit [1].
>>>> Changing their status to up is not the way to go since these VMs will
>>>> not be monitored.
>>> Hi Arik,
>>>
>>> By "empty", do you mean the webadmin reports the host as running 0 VMs?
>>> If so, that's not the case. The VM count actually seems to be correct in
>>> relation to the "qemu-*" processes (about 32 VMs), and I can even see the
>>> machines in the "Virtual machines" tab of the host. It's just that they
>>> are all marked with the '?' mark.
>> No, I meant the 'Host' column in the Virtual Machines tab, but if you see
>> the VMs in the "Virtual machines" sub-tab of the host then run_on_vds
>> points to the right host.
>>
>> The host is up in the webadmin as well?
>> Can you share the engine log?
>>
> Yes, the host is up in the webadmin, there are no issues with it, just
> the VMs running on it have the '?' mark. I've made 3 tests:
>
> 1) Restart engine: did not help.
> 2) Check firewall: seems to be OK.
> 3) PostgreSQL: UPDATE vm_dynamic SET status = 1 WHERE status = 8;
>    After a while, I see lots of entries like this:
>
> 2016-08-04 09:23:10,910 WARN
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (DefaultQuartzScheduler4) [6ad135b8] Correlation ID: null, Call Stack:
> null, Custom Event ID: -1, Message: VM xxx is not responding.
>
> I'm attaching the engine log, although I don't know when this first
> happened. If there's a manual way/command to migrate VMs to a different
> host, I'd appreciate a hint about it.
>
> Is it safe to restart vdsmd on this host?
The engine log looks fine - the VMs are reported as not-responding for
some reason. I would restart libvirtd and vdsmd then.

Is restarting those two daemons safe? I mean, will that stop all qemu-*
processes, so that the VMs marked as unknown will stop?
> Thanks.
>
>>> Thanks.
>>>
>>>> Yes, there is no other way to resolve it other than changing the DB, but
>>>> the change should be to update the run_on_vds field of these VMs to the
>>>> host you know they are running on. Their status will then be updated in
>>>> 15 sec.
>>>>
>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1354494
>>>>
>>>> Arik.
>>>>
>>>>> Thanks.
>>>>>
>>>>> Nicolás
>>>>>
>>>
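
For the case Arik describes (the 'Host' column is empty, as in [1]), the suggested fix is to point run_on_vds at the right host rather than to overwrite status. The sketch below rehearses such a statement against a throwaway in-memory SQLite table, not the real engine database. The table layout is a simplified assumption: only the status and run_on_vds columns are named in this thread, while vm_guid, the sample IDs, and both function names are hypothetical and exist purely for illustration.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway stand-in for vm_dynamic (assumed, simplified layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vm_dynamic ("
        "vm_guid TEXT PRIMARY KEY, "  # hypothetical key column for this demo
        "status INTEGER, "
        "run_on_vds TEXT)"
    )
    conn.executemany(
        "INSERT INTO vm_dynamic VALUES (?, ?, ?)",
        [("vm-a", 8, None), ("vm-b", 8, None), ("vm-c", 1, "host-2")],
    )
    conn.commit()
    return conn


def point_vms_at_host(conn, host_id, vm_guids):
    """Set run_on_vds for the listed VMs only; return the number of rows changed."""
    if not vm_guids:
        return 0  # nothing to do, and an empty IN () list is invalid SQL
    placeholders = ", ".join("?" for _ in vm_guids)
    cur = conn.execute(
        f"UPDATE vm_dynamic SET run_on_vds = ? WHERE vm_guid IN ({placeholders})",
        [host_id, *vm_guids],
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = make_demo_db()
    changed = point_vms_at_host(conn, "host-1", ["vm-a", "vm-b"])
    print("rows changed:", changed)
```

Restricting the UPDATE to an explicit list of VM IDs, instead of a blanket WHERE on status, keeps the change limited to the machines known to be running on that host. The status column is left untouched, in line with the advice above that the engine refreshes it on its own.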