[ovirt-users] Ovirt backups lead to unresponsive VM

Alex K rightkicktech at gmail.com
Thu Feb 8 07:41:01 UTC 2018


Hi Shani,

I hadn't noticed that.
I am attaching later vdsm logs.
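
For anyone checking the same thing before posting logs, here is a minimal Python sketch of the comparison Shani describes in the quoted mail below. The two timestamps are the ones from her message; the helper name `log_covers` and the timestamp format constant are my own assumptions for illustration, not part of vdsm or the engine.

```python
from datetime import datetime

# Assumed timestamp layout, matching how the times are written in this thread.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def log_covers(error_time: str, log_end: str) -> bool:
    """Return True if the log's last entry is at or after the error time."""
    return datetime.strptime(log_end, TS_FORMAT) >= datetime.strptime(error_time, TS_FORMAT)


# Values quoted from the thread: the engine error time and the end of the first vdsm log.
engine_error = "2018-02-03 00:22:56"
vdsm_log_end = "2018-02-03 00:01:01"

covered = log_covers(engine_error, vdsm_log_end)
gap = datetime.strptime(engine_error, TS_FORMAT) - datetime.strptime(vdsm_log_end, TS_FORMAT)

print(covered, gap)
```

If `covered` is False, the log was rotated before the error happened and a later log file is needed.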

Thanx,
Alex

On Wed, Feb 7, 2018 at 5:31 PM, Shani Leviim <sleviim at redhat.com> wrote:

> Hi Alex,
> Sorry for the delayed reply.
>
> From a brief look at your logs, I noticed that the error in the engine's
> log was logged at 2018-02-03 00:22:56,
> while your vdsm log ends at 2018-02-03 00:01:01.
> Could you provide a more complete vdsm log that covers the time of the error?
>
>
> *Regards,*
>
> *Shani Leviim*
>
> On Sat, Feb 3, 2018 at 5:41 PM, Alex K <rightkicktech at gmail.com> wrote:
>
>> Attaching the vdsm log from the host that triggered the error, where the VM
>> that was being cloned was running at the time.
>>
>> thanx,
>> Alex
>>
>> On Sat, Feb 3, 2018 at 5:20 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>
>>>
>>>
>>> On Feb 3, 2018 3:24 PM, "Alex K" <rightkicktech at gmail.com> wrote:
>>>
>>> Hi All,
>>>
>>> I have reproduced the backup failure. The VM that failed is named
>>> Win-FileServer and is a Windows Server 2016 64-bit VM with 300 GB of disk.
>>> During the cloning step the VM went unresponsive and I had to stop/start
>>> it.
>>> I am attaching the logs. I have another VM with the same OS (named
>>> DC-Server in the logs) but with a smaller disk (60 GB), which does not give
>>> any error when it is cloned.
>>> I see this line:
>>>
>>> EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null,
>>> Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: VDSM
>>> v2.sitedomain command SnapshotVDS failed: Message timeout which can be
>>> caused by communication issues
>>>
>>>
>>> I suggest adding relevant vdsm.log as well.
>>> Y.
>>>
>>>
>>> I would appreciate any advice on why I am facing this issue with the backups.
>>>
>>> thanx,
>>> Alex
>>>
>>> On Tue, Jan 30, 2018 at 12:49 AM, Alex K <rightkicktech at gmail.com>
>>> wrote:
>>>
>>>> Ok. I will reproduce and collect logs.
>>>>
>>>> Thanx,
>>>> Alex
>>>>
>>>> On Jan 29, 2018 20:21, "Mahdi Adnan" <mahdi.adnan at outlook.com> wrote:
>>>>
>>>> I have Windows VMs, both client and server.
>>>> If you provide the engine.log file, we can have a look at it.
>>>>
>>>>
>>>> --
>>>>
>>>> Respectfully
>>>> *Mahdi A. Mahdi*
>>>>
>>>> ------------------------------
>>>> *From:* Alex K <rightkicktech at gmail.com>
>>>> *Sent:* Monday, January 29, 2018 5:40 PM
>>>> *To:* Mahdi Adnan
>>>> *Cc:* users
>>>> *Subject:* Re: [ovirt-users] Ovirt backups lead to unresponsive VM
>>>>
>>>> Hi,
>>>>
>>>> I have observed this logged on the host when the issue occurs:
>>>>
>>>> VDSM command GetStoragePoolInfoVDS failed: Connection reset by peer
>>>>
>>>> or
>>>>
>>>> VDSM host.domain command GetStatsVDS failed: Connection reset by peer
>>>>
>>>> I have not been able to correlate this with anything in the engine logs.
>>>>
>>>> Are you hosting Windows Server 2016 and Windows 10 VMs?
>>>> The weird thing is that I have the same setup on other clusters with no issues.
>>>>
>>>> Thanx,
>>>> Alex
>>>>
>>>> On Sun, Jan 28, 2018 at 9:21 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
>>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> We have a cluster of 17 nodes, backed by GlusterFS storage, and we use
>>>> this same script for backup.
>>>> We have had no issues with it so far.
>>>> Have you checked the engine log file?
>>>>
>>>>
>>>> --
>>>>
>>>> Respectfully
>>>> *Mahdi A. Mahdi*
>>>>
>>>> ------------------------------
>>>> *From:* users-bounces at ovirt.org <users-bounces at ovirt.org> on behalf of
>>>> Alex K <rightkicktech at gmail.com>
>>>> *Sent:* Wednesday, January 24, 2018 4:18 PM
>>>> *To:* users
>>>> *Subject:* [ovirt-users] Ovirt backups lead to unresponsive VM
>>>>
>>>> Hi all,
>>>>
>>>> I have a cluster with 3 nodes, using oVirt 4.1 in a self-hosted setup
>>>> on top of GlusterFS.
>>>> On some VMs (especially one Windows Server 2016 64-bit VM with 500 GB of
>>>> disk), I almost always observe that during the backup the VM is rendered
>>>> unresponsive (the dashboard shows a question mark at the VM status and
>>>> the VM does not respond to ping or to anything else). Guest agents are
>>>> installed on the VMs.
>>>>
>>>> For scheduled backups I use:
>>>>
>>>> https://github.com/wefixit-AT/oVirtBackup
>>>>
>>>> The script does the following:
>>>>
>>>> 1. Snapshot VM (this completes without any failure)
>>>>
>>>> 2. Clone snapshot (this step renders the VM unresponsive)
>>>>
>>>> 3. Export Clone
>>>>
>>>> 4. Delete clone
>>>>
>>>> 5. Delete snapshot
>>>>
>>>>
>>>> Do you have any similar experience? Any suggestions to address this?
>>>>
>>>> I have never seen such an issue with hosted Linux VMs.
>>>>
>>>> The cluster has enough storage to accommodate the clone.
>>>>
>>>>
>>>> Thanx,
>>>>
>>>> Alex
>>>>
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vdsm.log.5
Type: application/octet-stream
Size: 5920945 bytes
Desc: not available
URL: <http://lists.ovirt.org/pipermail/users/attachments/20180208/7ee34a72/attachment-0001.obj>