On Thu, Aug 4, 2016 at 6:10 PM, Nicolás <nicolas(a)devels.es> wrote:
On 04/08/16 at 15:25, Arik Hadas wrote:
>
> ----- Original Message -----
>
>> On 2016-08-04 08:24, Arik Hadas wrote:
>>
>>> ----- Original Message -----
>>>
>>>>
>>>> On 04/08/16 at 07:18, Arik Hadas wrote:
>>>>
>>>>> ----- Original Message -----
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We're running oVirt 4.0.1 and today I found out that one of our hosts
>>>>>> has all its VMs in an unknown state. I actually don't know how (and
>>>>>> when) this happened, but I'd like to restore service, if possible
>>>>>> without turning off these machines. The host is up, the VMs are up,
>>>>>> the 'qemu' processes exist, no errors; it's just that the VMs running
>>>>>> on it have a '?' where the status is shown.
>>>>>>
>>>>>> Is it safe in this case to simply modify the database and set those
>>>>>> VMs' status to 'up'? I remember having to do this some time ago when
>>>>>> we faced storage issues, and it didn't break anything back then. If
>>>>>> not, is there a "safe" way to migrate those VMs to a different host
>>>>>> and restart the host that marked them as unknown?
>>>>>>
>>>>> Hi Nicolás,
>>>>>
>>>>> I assume that the host these VMs are running on is empty in the
>>>>> webadmin, right? If that is the case then you've probably hit [1].
>>>>> Changing their status to up is not the way to go, since these VMs
>>>>> will not be monitored.
>>>>>
>>>> Hi Arik,
>>>>
>>>> By "empty" you mean the webadmin reports the host as running 0 VMs?
>>>> If so, that's not the case. The VM count actually seems to be correct
>>>> in relation to the "qemu-*" processes (about 32 VMs), and I can even
>>>> see the machines in the "Virtual machines" tab of the host; it's just
>>>> that they are all marked with the '?' mark.
>>>>
>>> No, I meant the 'Host' column in the Virtual Machines tab, but if you
>>> see the VMs in the "Virtual machines" sub-tab of the host then
>>> run_on_vds points to the right host.
>>>
>>> The host is up in the webadmin as well?
>>> Can you share the engine log?
>>>
>> Yes, the host is up in the webadmin, there are no issues with it, just
>> the VMs running on it have the '?' mark. I've made 3 tests:
>>
>> 1) Restart engine: did not help.
>> 2) Check firewall: seems to be OK.
>> 3) PostgreSQL: UPDATE vm_dynamic SET status = 1 WHERE status = 8;
>> After a while, I see lots of entries like this:
>>
>> 2016-08-04 09:23:10,910 WARN
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (DefaultQuartzScheduler4) [6ad135b8] Correlation ID: null, Call Stack:
>> null, Custom Event ID: -1, Message: VM xxx is not responding.
>>
>> I'm attaching the engine log, but I don't know when this happened for
>> the first time. If there's a manual way/command to migrate VMs
>> to a different host, I'd appreciate a hint about it.
>>
>> Is it safe to restart vdsmd on this host?
>>
> The engine log looks fine - the VMs are reported as not-responding for
> some reason. I would restart libvirtd and vdsmd then.
>
Is restarting those two daemons safe? I mean, will that stop all qemu-*
processes, so that the VMs marked as unknown will stop?

Neither should touch the qemu processes; both reconnect to them as they
restart.
Y.
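
A minimal sketch of how one could check this on the host: record the qemu
PIDs before restarting the daemons and compare them afterwards. The helper
names and the /tmp file paths are hypothetical, and systemctl assumes a
systemd-based host; none of this comes from the thread itself.

```shell
# Print the PIDs of running qemu processes, one per line, sorted for comm(1).
snapshot_qemu_pids() {
    pgrep qemu | sort
}

# Print every PID that is listed in file $1 but no longer in file $2.
# Both files must be sorted the same way (snapshot_qemu_pids does that).
missing_pids() {
    comm -23 "$1" "$2"
}

# Hypothetical usage, run as root on the affected host:
#   snapshot_qemu_pids > /tmp/qemu.before
#   systemctl restart libvirtd vdsmd
#   snapshot_qemu_pids > /tmp/qemu.after
#   missing_pids /tmp/qemu.before /tmp/qemu.after
```

Empty output from the last command would mean that every qemu process that
existed before the restart is still running.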
Thanks.
>>
>> Thanks.
>>>>
>>>>> Yes, there is no other way to resolve it other than changing the DB,
>>>>> but the change should be to update the run_on_vds field of these VMs
>>>>> to the host you know they are running on. Their status will then be
>>>>> updated in 15 sec.
>>>>>
>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1354494
>>>>>
>>>>> Arik.
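
For the case covered by [1], the change described above amounts to one UPDATE
per affected VM. A sketch that only prints the statement, so it can be
reviewed before anyone runs it against the engine database. The vm_dynamic
table and the run_on_vds field are named in this thread; that run_on_vds
lives in vm_dynamic, the vm_guid key column, the helper name and both IDs are
assumptions.

```shell
# Print (do not execute) an UPDATE that points one VM at the host it runs on.
#   $1 = VM ID, $2 = host ID (both placeholders to be filled in by the admin)
# ASSUMPTION: vm_dynamic is keyed by a column named vm_guid.
print_run_on_vds_update() {
    vm_id=$1
    host_id=$2
    printf "UPDATE vm_dynamic SET run_on_vds = '%s' WHERE vm_guid = '%s';\n" \
        "$host_id" "$vm_id"
}

# Hypothetical usage:
#   print_run_on_vds_update "<vm id>" "<host id>"
```

Printing instead of executing keeps the sketch harmless; taking a database
backup before applying any such statement would be prudent.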
>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> Nicolás
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list
>>>>>> Users(a)ovirt.org
>>>>>> http://lists.ovirt.org/mailman/listinfo/users