[Users] Issues starting hosted engine VM
Leonid Natapov
lnatapov at redhat.com
Mon Jan 20 11:33:40 UTC 2014
Auto resume depends on domain monitoring: when a failed domain comes back up, the VMs paused on it are unpaused.
A VM won't be resumed if monitoring for its domain has stopped for some reason.
I don't think we have any error or event telling the user something like "VM has failed to resume automatically, please resume it manually".
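
To make the dependency concrete, here is a minimal sketch of the behaviour described above. It is a toy model only, not vdsm code: the Vm and DomainMonitor classes, their fields, and the domain names are hypothetical and exist just for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Vm:
    name: str
    domain: str                  # storage domain holding the VM's disks
    paused_on_eio: bool = False  # True once the VM paused on an I/O error


@dataclass
class DomainMonitor:
    """Hypothetical stand-in for per-domain monitoring (not vdsm's API)."""
    domain: str
    running: bool = True
    last_valid: bool = True

    def report(self, valid: bool, vms: List[Vm]) -> List[str]:
        """Record a new domain status; return names of VMs resumed by it."""
        if not self.running:
            # Monitoring stopped: no state change is seen, so nothing resumes.
            return []
        came_back = valid and not self.last_valid
        self.last_valid = valid
        if not came_back:
            return []
        resumed = []
        for vm in vms:
            # Only VMs paused on this particular domain are unpaused.
            if vm.domain == self.domain and vm.paused_on_eio:
                vm.paused_on_eio = False
                resumed.append(vm.name)
        return resumed
```

In this model a VM is unpaused only on an invalid-to-valid transition reported by a running monitor; with the monitor stopped, the VM stays paused and nothing is reported, which is the gap mentioned above.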
----- Original Message -----
From: "Dafna Ron" <dron at redhat.com>
To: "Leonid Natapov" <lnatapov at redhat.com>
Cc: "Andrew Lau" <andrew at andrewklau.com>, "Yedidyah Bar David" <didi at redhat.com>, "users" <users at ovirt.org>
Sent: Monday, January 20, 2014 1:13:46 PM
Subject: Re: [Users] Issues starting hosted engine VM
interesting... :) so this is now configurable...
What happens if qemu fails to start the VM (this happens sometimes,
mostly on file-type storage)? Do we have a retry, or a specific error
telling the user that the activation failed and manual intervention is
required?
On 01/20/2014 11:02 AM, Leonid Natapov wrote:
> All vms. Check this PRD: https://bugzilla.redhat.com/show_bug.cgi?id=723055
>
>
> ----- Original Message -----
> From: "Dafna Ron" <dron at redhat.com>
> To: "Leonid Natapov" <lnatapov at redhat.com>
> Cc: "Andrew Lau" <andrew at andrewklau.com>, "Yedidyah Bar David" <didi at redhat.com>, "users" <users at ovirt.org>
> Sent: Monday, January 20, 2014 12:44:46 PM
> Subject: Re: [Users] Issues starting hosted engine VM
>
> On 01/20/2014 10:38 AM, Leonid Natapov wrote:
>> 1. hosted-engine --vm-start should start the engine VM. There was no problem with it when I tested it.
>> 2. hosted-engine --vm-start-paused was added for the case where something is wrong with the engine VM, it can't start, and user intervention is required, for example after a kernel panic.
>> The user can start the engine VM in paused mode, connect to it, and try to fix the problem by booting in single-user mode, etc.
>> 3. When connectivity to shared storage is lost, the engine VM becomes paused. The VM should be automatically unpaused after connectivity resumes (we introduced this feature in 3.3), but with NFS it could take quite some time, so maybe we should add something like --vm-resume in order to resume
> Are we talking only about the hosted engine VM, or about all other VMs too? If I have
> other VMs they will also stop; will they be auto-started as well?
>> the engine vm manually.
>>
>> Thanks,
>> Leonid.
>>
>>
>> ----- Original Message -----
>> From: "Andrew Lau" <andrew at andrewklau.com>
>> To: dron at redhat.com
>> Cc: "Leonid Natapov" <lnatapov at redhat.com>, "Yedidyah Bar David" <didi at redhat.com>, "users" <users at ovirt.org>
>> Sent: Monday, January 20, 2014 12:28:15 PM
>> Subject: Re: [Users] Issues starting hosted engine VM
>>
>> It was paused due to the connection loss to the NFS server. I would assume
>> that once the connection is restored it could attempt to resume the VM? But I can
>> try to dig up the vdsm logs if you want; they would only be a few hours old.
>>
>> I think having an option like --vm-resume would at the very least remove the
>> need to dig into virsh and mess with authentication.
>>
>> On Mon, Jan 20, 2014 at 9:23 PM, Dafna Ron <dron at redhat.com> wrote:
>>
>>> the question is what the vm was paused on... this can be found in the qemu
>>> vm log.
>>> if the vm is paused it will not be auto started - so I am not sure what
>>> you expect to change? virsh requires authentication regardless of hosted
>>> engine :)
>>> Leonid, did you do any testing there?
>>>
>>>
>>> On 01/20/2014 10:13 AM, Andrew Lau wrote:
>>>
>>>> I have opened this BZ 1055461 anyway just in case
>>>>
>>>>
>>>> On Mon, Jan 20, 2014 at 8:33 PM, Andrew Lau <andrew at andrewklau.com> wrote:
>>>>
>>>> I was more interested in how the score process would be
>>>> calculated; the vm-status option considered the VM in a bad state.
>>>>
>>>> I left it for a few minutes and nothing seemed to have changed. I
>>>> think it relates to hosted engine, as virsh requires
>>>> authentication. Should I still open a bz?
>>>>
>>>> Cheers,
>>>> Andrew.
>>>>
>>>> On Jan 20, 2014 7:48 PM, "Dafna Ron" <dron at redhat.com> wrote:
>>>>
>>>> I am not sure this is a hosted engine question as much as a
>>>> qemu question.
>>>> qemu-kvm will not support auto start of VMs after EIO because
>>>> of the remote possibility of corruption.
>>>>
>>>> On 01/20/2014 05:46 AM, Andrew Lau wrote:
>>>>
>>>> Hi,
>>>>
>>>> Quick question: in a scenario where, e.g., the NFS server becomes
>>>> unreachable and the hosted engine goes into a paused
>>>> state, will other hosts attempt to bring it back up?
>>>> Should there be a command, e.g. hosted-engine --vm-resume?
>>>>
>>>> When this happened, I manually forced it to resume using virsh
>>>>
>>>>
>>>> On Sun, Jan 19, 2014 at 7:21 PM, Yedidyah Bar David
>>>> <didi at redhat.com> wrote:
>>>>
>>>> Thanks a lot for your efforts and the report!
>>>> -- Didi
>>>>
>>>> ------------------------------
>>>> ------------------------------------------
>>>>
>>>> *From: *"Andrew Lau" <andrew at andrewklau.com>
>>>> *To: *"users" <users at ovirt.org>
>>>> *Sent: *Saturday, January 18, 2014 3:20:22 PM
>>>> *Subject: *Re: [Users] Issues starting hosted engine VM
>>>>
>>>>
>>>> I believe I found the issue and have reported it here
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1055059
>>>>
>>>> On Sat, Jan 18, 2014 at 11:33 PM, Andrew Lau
>>>> <andrew at andrewklau.com> wrote:
>>>>
>>>> The interesting thing: when trying it with the paused option,
>>>> vdsm seems to create the VM
>>>>
>>>> hosted-engine --vm-start-paused
>>>>
>>>> vdsm.log http://www.fpaste.org/69604/13900482/
>>>>
>>>> But I'm not sure how to then proceed to
>>>> "resume" it.
>>>>
>>>> On Sat, Jan 18, 2014 at 10:23 PM, Andrew Lau
>>>> <andrew at andrewklau.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> With the great help from sbonazzo, I managed to step past the
>>>> initial bug with hosted-engine-setup, but appear to have run
>>>> into another showstopper.
>>>>
>>>> I ran through the install process successfully up to the stage
>>>> where it completed and the engine VM was to be shut down. (The
>>>> engine has already been installed on the VM and the host has
>>>> been connected to the engine.)
>>>>
>>>> The issue starts here: the host finds itself unable to start
>>>> the VM up again.
>>>>
>>>> VDSM Logs:
>>>> http://www.fpaste.org/69592/00427141/
>>>> ovirt-hosted-engine-ha agent.log
>>>> http://www.fpaste.org/69595/43609139/
>>>>
>>>> It seems to keep failing to start the VM. When I restart the
>>>> agent I can see the score drop to 0 after 3 boot attempts. The
>>>> interesting thing seems to be in the VDSM logs:
>>>> "'Virtual machine does not exist', 'code': 1}}"
>>>>
>>>> I'm not sure where else to look. Suggestions?
>>>>
>>>> Cheers,
>>>>
>>>> Andrew
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users at ovirt.org
>>>>
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>>
>>>>
>>>>
>>>> -- Didi
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> -- Dafna Ron
>>>>
>>>>
>>>>
>>> --
>>> Dafna Ron
>>>
>
--
Dafna Ron