[ovirt-users] Ovirt backups lead to unresponsive VM

Alex K rightkicktech at gmail.com
Fri Feb 9 17:36:47 UTC 2018


Hi all,

If you need any further logs, let me know.
Thanx for your time.

Alex

On Thu, Feb 8, 2018 at 9:41 AM, Alex K <rightkicktech at gmail.com> wrote:

> Hi Shani,
>
> I didn't notice that.
> I am attaching vdsm logs that cover the later time range.
>
> Thanx,
> Alex
>
> On Wed, Feb 7, 2018 at 5:31 PM, Shani Leviim <sleviim at redhat.com> wrote:
>
>> Hi Alex,
>> Sorry for the delayed reply.
>>
>> From a brief look at your logs, I noticed that the error in the engine
>> log was logged at 2018-02-03 00:22:56,
>> while your vdsm log ends at 2018-02-03 00:01:01.
>> Could you provide a fuller vdsm log, one that covers the time of the error?
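
The gap pointed out above can be checked with a few lines of Python. This is a minimal sketch: the two timestamps are the ones quoted in the message, while the function name log_gap and the timestamp format string are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, matching the two values quoted in the message.
FMT = "%Y-%m-%d %H:%M:%S"


def log_gap(engine_error_ts: str, vdsm_last_ts: str) -> timedelta:
    """Return how far the engine error falls past the last vdsm log entry."""
    return datetime.strptime(engine_error_ts, FMT) - datetime.strptime(vdsm_last_ts, FMT)


gap = log_gap("2018-02-03 00:22:56", "2018-02-03 00:01:01")
print(gap)  # 0:21:55
```

A positive gap means the vdsm log stops before the engine error was logged, so it cannot show what the host was doing at that moment.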
>>
>>
>> *Regards,*
>>
>> *Shani Leviim*
>>
>> On Sat, Feb 3, 2018 at 5:41 PM, Alex K <rightkicktech at gmail.com> wrote:
>>
>>> Attaching the vdsm log from the host that triggered the error, where the
>>> VM being cloned was running at the time.
>>>
>>> thanx,
>>> Alex
>>>
>>> On Sat, Feb 3, 2018 at 5:20 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>>
>>>>
>>>>
>>>> On Feb 3, 2018 3:24 PM, "Alex K" <rightkicktech at gmail.com> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> I have reproduced the backup failure. The VM that failed is named
>>>> Win-FileServer and is a Windows Server 2016 64-bit VM with 300GB of disk.
>>>> During the cloning step the VM became unresponsive and I had to
>>>> stop/start it.
>>>> I am attaching the logs. I have another VM with the same OS (named
>>>> DC-Server in the logs) but with a smaller disk (60GB), which does not
>>>> give any error when it is cloned.
>>>> I see this line:
>>>>
>>>> EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null,
>>>> Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: VDSM
>>>> v2.sitedomain command SnapshotVDS failed: Message timeout which can be
>>>> caused by communication issues
>>>>
>>>>
>>>> I suggest adding the relevant vdsm.log as well.
>>>> Y.
>>>>
>>>>
>>>> I would appreciate any advice on why I am facing this issue with the
>>>> backups.
>>>>
>>>> thanx,
>>>> Alex
>>>>
>>>> On Tue, Jan 30, 2018 at 12:49 AM, Alex K <rightkicktech at gmail.com>
>>>> wrote:
>>>>
>>>>> Ok. I will reproduce and collect logs.
>>>>>
>>>>> Thanx,
>>>>> Alex
>>>>>
>>>>> On Jan 29, 2018 20:21, "Mahdi Adnan" <mahdi.adnan at outlook.com> wrote:
>>>>>
>>>>> I have Windows VMs, both client and server.
>>>>> If you provide the engine.log file, we can have a look at it.
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Respectfully
>>>>> *Mahdi A. Mahdi*
>>>>>
>>>>> ------------------------------
>>>>> *From:* Alex K <rightkicktech at gmail.com>
>>>>> *Sent:* Monday, January 29, 2018 5:40 PM
>>>>> *To:* Mahdi Adnan
>>>>> *Cc:* users
>>>>> *Subject:* Re: [ovirt-users] Ovirt backups lead to unresponsive VM
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have observed this logged on the host when the issue occurs:
>>>>>
>>>>> VDSM command GetStoragePoolInfoVDS failed: Connection reset by peer
>>>>>
>>>>> or
>>>>>
>>>>> VDSM host.domain command GetStatsVDS failed: Connection reset by peer
>>>>>
>>>>> I have not been able to correlate these with anything in the engine
>>>>> logs.
>>>>>
>>>>> Are you hosting Windows Server 2016 and Windows 10 VMs?
>>>>> The weird thing is that I have the same setup on other clusters with
>>>>> no issues.
>>>>>
>>>>> Thanx,
>>>>> Alex
>>>>>
>>>>> On Sun, Jan 28, 2018 at 9:21 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
>>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> We have a cluster of 17 nodes, backed by GlusterFS storage, and we use
>>>>> this same script for backup.
>>>>> We have had no issues with it so far.
>>>>> Have you checked the engine log file?
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Respectfully
>>>>> *Mahdi A. Mahdi*
>>>>>
>>>>> ------------------------------
>>>>> *From:* users-bounces at ovirt.org <users-bounces at ovirt.org> on behalf
>>>>> of Alex K <rightkicktech at gmail.com>
>>>>> *Sent:* Wednesday, January 24, 2018 4:18 PM
>>>>> *To:* users
>>>>> *Subject:* [ovirt-users] Ovirt backups lead to unresponsive VM
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I have a cluster with 3 nodes, using ovirt 4.1 in a self-hosted setup
>>>>> on top of glusterfs. Guest agents are installed on the VMs.
>>>>> On some VMs (especially one Windows Server 2016 64-bit VM with 500 GB
>>>>> of disk), I almost always observe that during the backup the VM is
>>>>> rendered unresponsive (the dashboard shows a question mark as the VM
>>>>> status and the VM does not respond to ping or to anything else).
>>>>>
>>>>> For scheduled backups I use:
>>>>>
>>>>> https://github.com/wefixit-AT/oVirtBackup
>>>>>
>>>>> The script does the following:
>>>>>
>>>>> 1. Snapshot VM (this completes without any failure)
>>>>>
>>>>> 2. Clone snapshot (this step renders the VM unresponsive)
>>>>>
>>>>> 3. Export Clone
>>>>>
>>>>> 4. Delete clone
>>>>>
>>>>> 5. Delete snapshot
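
The five steps above can be sketched as a small driver that reports which step failed. This is only an illustration, not the oVirtBackup code: the step names, the run_backup function and the no-op actions are hypothetical placeholders, and real actions would have to call the oVirt API.

```python
from typing import Callable, Dict, List

# Illustrative labels for the five steps listed above (not oVirtBackup identifiers).
STEPS = [
    "snapshot_vm",
    "clone_snapshot",
    "export_clone",
    "delete_clone",
    "delete_snapshot",
]


def run_backup(vm_name: str, actions: Dict[str, Callable[[str], None]]) -> List[str]:
    """Run the steps in order and name the failing step if one raises."""
    completed: List[str] = []
    for step in STEPS:
        try:
            actions[step](vm_name)
        except Exception as exc:
            raise RuntimeError(
                f"backup of {vm_name!r} failed at step {step!r}; completed: {completed}"
            ) from exc
        completed.append(step)
    return completed


# Hypothetical stand-in actions that do nothing.
noop_actions = {step: (lambda vm: None) for step in STEPS}
done = run_backup("Win-FileServer", noop_actions)
print(done)
```

Recording the failing step together with a timestamp makes it easier to line up a backup failure with the engine and vdsm logs.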
>>>>>
>>>>>
>>>>> Do you have any similar experience? Any suggestions to address this?
>>>>>
>>>>> I have never seen such an issue with hosted Linux VMs.
>>>>>
>>>>> The cluster has enough storage to accommodate the clone.
>>>>>
>>>>>
>>>>> Thanx,
>>>>>
>>>>> Alex
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users at ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>