OK.
You have several issues in the setup... so it's a bit tricky...
You had a problem with your storage on the 14th of Jan and one of the
hosts rebooted (if you have the vdsm log from that day then I can see
what happened on the vdsm side).
In engine, I could see a problem with the export domain, and this should
not have caused a reboot. Can you tell me if you had a problem with the
data domain as well or was it just the export domain? Were you having
any vm's exported/imported at that time?
In any case - this is a bug.
As for the vm's - if the vm's are no longer in migrating state then
please restart the ovirt-engine service (looks like a cache issue):
2014-01-14 09:38:08,590 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-34) RefreshVmList vm id 2736197b-6dc3-4155-9a29-9306ca64881d status = Down on vds ignoring it in the refresh until migration is done
If they are in migrating state - there should have been a timeout a long
time ago.
Can you please run 'vdsClient -s 0 list table' and 'virsh -r list' on
all hosts?
Last thing is that your ISO domain seems to be having issues as well.
This should not affect the host status, but if any of the vm's were
booted from an iso or have an iso attached in the boot sequence this
would explain the migration issue.
Thanks,
Dafna
On 01/28/2014 09:28 AM, Neil wrote:
Hi guys,
Sorry for the very late reply, I've been out of the office doing installations.
Unfortunately due to the time delay, my oldest logs are only as far
back as the attached.
I've only grep'd for Thread-286029 in the vdsm log. For the engine.log I'm
not sure what info is required, so the full log is attached.
Please shout if you need any info or further details.
Thank you very much.
Regards.
Neil Wilson.
On Fri, Jan 24, 2014 at 10:55 AM, Meital Bourvine <mbourvin(a)redhat.com> wrote:
> Could you please attach the engine.log from the same time?
>
> thanks!
>
> ----- Original Message -----
>> From: "Neil" <nwilson123(a)gmail.com>
>> To: dron(a)redhat.com
>> Cc: "users" <users(a)ovirt.org>
>> Sent: Wednesday, January 22, 2014 1:14:25 PM
>> Subject: Re: [Users] Vm's being paused
>>
>> Hi Dafna,
>>
>> Thanks.
>>
>> The vdsm logs are quite large, so I've only attached the logs for the
>> pause of the VM called Babbage on the 19th of Jan.
>>
>> As for snapshots, Babbage has one from June 2013 and Reports has two
>> from June and Oct 2013.
>>
>> I'm using FC storage, with 11 VM's and 3 nodes/hosts, 9 of the 11 VM's
>> have thin provisioned disks.
>>
>> Please shout if you'd like any further info or logs.
>>
>> Thank you.
>>
>> Regards.
>>
>> Neil Wilson.
>>
>> On Wed, Jan 22, 2014 at 10:58 AM, Dafna Ron <dron(a)redhat.com> wrote:
>>> Hi Neil,
>>>
>>> Can you please attach the vdsm logs?
>>> also, as for the vm's, do they have any snapshots?
>>> from your suggestion to allocate more luns, are you using iscsi or FC?
>>>
>>> Thanks,
>>>
>>> Dafna
>>>
>>>
>>> On 01/22/2014 08:45 AM, Neil wrote:
>>>> Thanks for the replies guys,
>>>>
>>>> Looking at my two VM's that have paused so far through the oVirt GUI
>>>> the following sizes show under Disks.
>>>>
>>>> VM Reports:
>>>> Virtual Size 35GB, Actual Size 41GB
>>>> Looking on the CentOS side, Disk size is 33G and used is 12G with
>>>> 19G available (40% usage).
>>>>
>>>> VM Babbage:
>>>> Virtual Size is 40GB, Actual Size 53GB
>>>> On the Server 2003 OS side, Disk size is 39.9GB and used is 16.3GB, so
>>>> under 50% usage.
>>>>
>>>>
>>>> Do you see any issues with the above stats?
>>>>
>>>> Then my main Datacenter storage is as follows...
>>>>
>>>> Size: 6887 GB
>>>> Available: 1948 GB
>>>> Used: 4939 GB
>>>> Allocated: 1196 GB
>>>> Over Allocation: 61%
>>>>
>>>> Could there be a problem here? I can allocate additional LUNS if you
>>>> feel the space isn't correctly allocated.
>>>>
>>>> Apologies for going on about this, but I'm really concerned that
>>>> something isn't right and I might have a serious problem if an
>>>> important machine locks up.
>>>>
>>>> Thank you and much appreciated.
>>>>
>>>> Regards.
>>>>
>>>> Neil Wilson.
>>>>
>>>> On Tue, Jan 21, 2014 at 7:02 PM, Dafna Ron <dron(a)redhat.com> wrote:
>>>>> The storage space is configured in percentages and not physical size.
>>>>> So if 20G is less than 10% (default config) of your storage it will
>>>>> pause the vms regardless of how much GB you still have.
>>>>> This is configurable though, so you can change it to less than 10% if
>>>>> you like.
>>>>>
>>>>> To answer the second question, vm's will not pause on an ENOSPC error
>>>>> if they run out of space internally, but only if the external storage
>>>>> cannot be consumed. So only if you run out of space in the storage and
>>>>> not if the vm runs out of space in its own fs.
>>>>>
>>>>>
>>>>>
>>>>> On 01/21/2014 09:51 AM, Neil wrote:
>>>>>> Hi Dan,
>>>>>>
>>>>>> Sorry, attached is engine.log. I've taken out the two sections where
>>>>>> each of the VM's was paused.
>>>>>>
>>>>>> Does the error "VM babbage has paused due to no Storage space error"
>>>>>> mean the main storage domain has run out of storage, or that the VM
>>>>>> has run out?
>>>>>>
>>>>>> Both VM's appear to have been running on node01 when they were paused.
>>>>>> My vdsm versions are all...
>>>>>>
>>>>>> vdsm-cli-4.13.0-11.el6.noarch
>>>>>> vdsm-python-cpopen-4.13.0-11.el6.x86_64
>>>>>> vdsm-xmlrpc-4.13.0-11.el6.noarch
>>>>>> vdsm-4.13.0-11.el6.x86_64
>>>>>> vdsm-python-4.13.0-11.el6.x86_64
>>>>>>
>>>>>> I currently have a 61% over allocation ratio on my primary storage
>>>>>> domain, with 1948GB available.
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>> Regards.
>>>>>>
>>>>>> Neil Wilson.
>>>>>>
>>>>>>
>>>>>> On Tue, Jan 21, 2014 at 11:24 AM, Neil <nwilson123(a)gmail.com> wrote:
>>>>>>> Hi Dan,
>>>>>>>
>>>>>>> Sorry for only coming back to you now.
>>>>>>> The VM's are thin provisioned. The Server 2003 VM hasn't run out of
>>>>>>> disk space, there is about 20Gigs free, and the usage barely grows as
>>>>>>> the VM only shares printers. The other VM that paused is also on thin
>>>>>>> provisioned disks and also has plenty of space; this guest is running
>>>>>>> CentOS 6.3 64bit and only runs basic reporting.
>>>>>>>
>>>>>>> After the 2003 guest was rebooted, the network card showed up as
>>>>>>> unplugged in ovirt, and we had to remove it, and re-add it again in
>>>>>>> order to correct the issue. The Centos VM did not have the same issue.
>>>>>>>
>>>>>>> I'm concerned that this might happen to a VM that's quite critical,
>>>>>>> any thoughts or ideas?
>>>>>>>
>>>>>>> The only recent changes have been updating from Dreyou 3.2 to the
>>>>>>> official Centos repo and updating to 3.3.1-2. Prior to updating I
>>>>>>> hadn't had this issue.
>>>>>>>
>>>>>>> Any assistance is greatly appreciated.
>>>>>>>
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Regards.
>>>>>>>
>>>>>>> Neil Wilson.
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Jan 19, 2014 at 8:20 PM, Dan Yasny <dyasny(a)gmail.com> wrote:
>>>>>>>> Do you have the VMs on thin provisioned storage or sparse disks?
>>>>>>>>
>>>>>>>> Pausing happens when the VM has an IO error or runs out of space on
>>>>>>>> the storage domain, and it is done intentionally, so that the VM will
>>>>>>>> not experience a disk corruption. If you have thin provisioned disks,
>>>>>>>> and the VM writes to its disks faster than the disks can grow, this is
>>>>>>>> exactly what you will see.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Jan 19, 2014 at 10:04 AM, Neil <nwilson123(a)gmail.com> wrote:
>>>>>>>>> Hi guys,
>>>>>>>>>
>>>>>>>>> I've had two different VM's randomly pause this past week and inside
>>>>>>>>> ovirt the error received is something like 'vm ran out of storage and
>>>>>>>>> was paused'.
>>>>>>>>> Resuming the vm's didn't work and I had to force them off and then on
>>>>>>>>> which resolved the issue.
>>>>>>>>>
>>>>>>>>> Has anyone had this issue before?
>>>>>>>>>
>>>>>>>>> I realise this is very vague so if you could please let me know which
>>>>>>>>> logs to send in.
>>>>>>>>>
>>>>>>>>> Thank you
>>>>>>>>>
>>>>>>>>> Regards.
>>>>>>>>>
>>>>>>>>> Neil Wilson
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Users mailing list
>>>>>>>>> Users(a)ovirt.org
>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Dafna Ron
>>>
>>>
>>> --
>>> Dafna Ron