Lack of resources on the ML server
by Marc Dequènes (Duck)
Quack,
Sandro noticed that some posts were not appearing quickly on the interface.
There's quite a lot of traffic and the server was not keeping up, so I added a
vCPU (2->3); this required restarting the VM, but it was quick.
I also noticed that some MM crontab maintenance jobs triggered the OOM
killer, so I added a bit more RAM (4GB->5GB).
Another thing to consider: we also filter outgoing mail, which is nice to
avoid being blacklisted if some bad guy subscribes and posts spammy
content. But it seems Postfix applies a recipient concurrency of one when
sending to the filter, which means that if the list has x subscribers the
message is checked x times, which is silly. There is no such setting in the
configuration, so I'm looking into it. This probably affects other
installations too, but this one being heavily loaded…
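In case it helps, here is a minimal sketch of the knob I suspect is
involved (the transport name "mailfilter" below is only an assumption for
illustration, not necessarily what we use): Postfix lets you set a
per-transport recipient limit in main.cf, so one filtered copy can carry
many recipients instead of one delivery per recipient:

    # main.cf -- "mailfilter" is a hypothetical transport name; use the name
    # of the content-filter transport defined in master.cf. The stock
    # default_destination_recipient_limit is 50, so something like this
    # avoids one filter pass per recipient.
    mailfilter_destination_recipient_limit = 50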
\_o<
Re: [ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 28-06-2018 ] [ 098_ovirt_provider_ovn.use_ovn_provider.]
by Dafna Ron
Thanks Alona,
Can you please update me once you have a fix?
Thanks,
Dafna
On Thu, Jun 28, 2018 at 10:28 AM, Alona Kaplan <alkaplan(a)redhat.com> wrote:
> Hi,
> I'm aware of the error. Francesco and I are working on it.
>
> Thanks,
> Alona.
>
> On Thu, Jun 28, 2018, 12:23 Dafna Ron <dron(a)redhat.com> wrote:
>
>> ovirt-hosted-engine-ha failed on the same issue as well.
>>
>> On Thu, Jun 28, 2018 at 10:07 AM, Dafna Ron <dron(a)redhat.com> wrote:
>>
>>> Hi,
>>>
>>> We had a failure in test 098_ovirt_provider_ovn.use_ovn_provider.
>>>
>>> Although CQ is pointing to this change: https://gerrit.ovirt.org/#/c/92567/ -
>>> packaging: Add python-netaddr requirement, I actually think from the error
>>> that it is because of the changes made to multiqueues:
>>>
>>> https://gerrit.ovirt.org/#/c/92009/ - engine: Update libvirtVmXml to consider vmBase.multiQueuesEnabled attribute
>>> https://gerrit.ovirt.org/#/c/92008/ - engine: Introduce algorithm for calculating how many queues asign per vnic
>>> https://gerrit.ovirt.org/#/c/92007/ - engine: Add multiQueuesEnabled to VmBase
>>> https://gerrit.ovirt.org/#/c/92318/ - restapi: Add 'Multi Queues Enabled' to the relevant mappers
>>> https://gerrit.ovirt.org/#/c/92149/ - webadmin: Add 'Multi Queues Enabled' to vm dialog
>>>
>>> Alona, can you please take a look?
>>>
>>>
>>> Link to Job:
>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/8375/
>>>
>>> Link to all logs:
>>> https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/8375/...
>>>
>>> (Relevant) error snippet from the log:
>>>
>>> <error>
>>>
>>> engine:
>>>
>>> 2018-06-27 13:59:25,976-04 ERROR [org.ovirt.engine.core.
>>> vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-80)
>>> [] Command 'GetAllVmStatsVDSCommand(HostName =
>>> lago-basic-suite-master-host-1, VdsIdVDSCommandParametersBase:
>>> {hostId='d9094c95-3275-4616-b4c2-815e753bcfed'})' execution failed:
>>> VDSGenericException: VDSNetworkException: Broken pipe
>>> 2018-06-27 13:59:25,977-04 DEBUG [org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil]
>>> (EE-ManagedThreadFactory-engine-Thread-442) [] Executing task:
>>> EE-ManagedThreadFactory-engine-Thread-442
>>> 2018-06-27 13:59:25,977-04 DEBUG [org.ovirt.engine.core.common.
>>> di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engine-Thread-442)
>>> [] method: getVdsManager, params: [d9094c95-3275-4616-b4c2-815e753bcfed],
>>> timeElapsed: 0ms
>>> 2018-06-27 13:59:25,977-04 WARN [org.ovirt.engine.core.vdsbroker.VdsManager]
>>> (EE-ManagedThreadFactory-engine-Thread-442) [] Host
>>> 'lago-basic-suite-master-host-1' is not responding.
>>> 2018-06-27 13:59:25,979-04 ERROR [org.ovirt.engine.core.dal.
>>> dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-63)
>>> [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM
>>> lago-basic-suite-master-host-1 command GetStatsAsyncVDS failed: Broken pipe
>>> 2018-06-27 13:59:25,976-04 DEBUG [org.ovirt.engine.core.
>>> vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-80)
>>> [] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
>>> VDSGenericException: VDSNetworkException: Broken pipe
>>> at org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.
>>> proceedProxyReturnValue(BrokerCommandBase.java:189) [vdsbroker.jar:]
>>> at org.ovirt.engine.core.vdsbroker.vdsbroker.
>>> GetAllVmStatsVDSCommand.executeVdsBrokerCommand(
>>> GetAllVmStatsVDSCommand.java:23) [vdsbroker.jar:]
>>> at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.
>>> executeVdsCommandWithNetworkEvent(VdsBrokerCommand.java:123)
>>> [vdsbroker.jar:]
>>> at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.
>>> executeVDSCommand(VdsBrokerCommand.java:111) [vdsbroker.jar:]
>>> at org.ovirt.engine.core.vdsbroker.VDSCommandBase.
>>> executeCommand(VDSCommandBase.java:65) [vdsbroker.jar:]
>>> at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:31)
>>> [dal.jar:]
>>> at org.ovirt.engine.core.vdsbroker.vdsbroker.
>>> DefaultVdsCommandExecutor.execute(DefaultVdsCommandExecutor.java:14)
>>> [vdsbroker.jar:]
>>> at org.ovirt.engine.core.vdsbroker.ResourceManager.
>>> runVdsCommand(ResourceManager.java:399) [vdsbroker.jar:]
>>> at org.ovirt.engine.core.vdsbroker.ResourceManager$
>>> Proxy$_$$_WeldSubclass.runVdsCommand$$super(Unknown Source)
>>> [vdsbroker.jar:]
>>> at sun.reflect.GeneratedMethodAccessor270.invoke(Unknown
>>> Source) [:1.8.0_171]
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
>>> DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_171]
>>> at java.lang.reflect.Method.invoke(Method.java:498)
>>> [rt.jar:1.8.0_171]
>>> at org.jboss.weld.interceptor.proxy.
>>> TerminalAroundInvokeInvocationContext.proceedInternal(
>>> TerminalAroundInvokeInvocationContext.java:49)
>>> [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
>>> at org.jboss.weld.interceptor.proxy.
>>> AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:77)
>>> [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
>>> at org.ovirt.engine.core.common.di.interceptor.
>>> LoggingInterceptor.apply(LoggingInterceptor.java:12) [common.jar:]
>>> at sun.reflect.GeneratedMethodAccessor68.invoke(Unknown Source)
>>> [:1.8.0_171]
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
>>> DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_171]
>>> at java.lang.reflect.Method.invoke(Method.java:498)
>>> [rt.jar:1.8.0_171]
>>> at org.jboss.weld.interceptor.reader.
>>> SimpleInterceptorInvocation$SimpleMethodInvocation.invoke(
>>> SimpleInterceptorInvocation.java:73) [weld-core-impl-2.4.3.Final.
>>> jar:2.4.3.Final]
>>> at org.jboss.weld.interceptor.proxy.InterceptorMethodHandler.
>>> executeAroundInvoke(InterceptorMethodHandler.java:84)
>>> [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
>>> at org.jboss.weld.interceptor.proxy.InterceptorMethodHandler.
>>> executeInterception(InterceptorMethodHandler.java:72)
>>> [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
>>> at org.jboss.weld.interceptor.proxy.InterceptorMethodHandler.
>>> invoke(InterceptorMethodHandler.java:56) [weld-core-impl-2.4.3.Final.
>>> jar:2.4.3.Final]
>>> at org.jboss.weld.bean.proxy.CombinedInterceptorAndDecorato
>>> rStackMethodHandler.invoke(CombinedInterceptorAndDecoratorStackMethodHandler.java:79)
>>> [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
>>> at org.jboss.weld.bean.proxy.CombinedInterceptorAndDecorato
>>> rStackMethodHandler.invoke(CombinedInterceptorAndDecoratorStackMethodHandler.java:68)
>>> [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
>>> at org.ovirt.engine.core.vdsbroker.ResourceManager$
>>> Proxy$_$$_WeldSubclass.runVdsCommand(Unknown Source) [vdsbroker.jar:]
>>> at org.ovirt.engine.core.vdsbroker.monitoring.
>>> VmsStatisticsFetcher.poll(VmsStatisticsFetcher.java:29) [vdsbroker.jar:]
>>> at org.ovirt.engine.core.vdsbroker.monitoring.
>>> VmsListFetcher.fetch(VmsListFetcher.java:49) [vdsbroker.jar:]
>>> at org.ovirt.engine.core.vdsbroker.monitoring.
>>> PollVmStatsRefresher.poll(PollVmStatsRefresher.java:44) [vdsbroker.jar:]
>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>> [rt.jar:1.8.0_171]
>>> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>>> [rt.jar:1.8.0_171]
>>> at org.glassfish.enterprise.concurrent.internal.
>>> ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.
>>> access$201(ManagedScheduledThreadPoolExecutor.java:383)
>>> [javax.enterprise.concurrent-1.0.jar:]
>>> at org.glassfish.enterprise.concurrent.internal.
>>> ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(
>>> ManagedScheduledThreadPoolExecutor.java:534)
>>> [javax.enterprise.concurrent-1.0.jar:]
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>> [rt.jar:1.8.0_171]
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>> [rt.jar:1.8.0_171]
>>> at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_171]
>>> at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$
>>> ManagedThread.run(ManagedThreadFactoryImpl.java:250)
>>> [javax.enterprise.concurrent-1.0.jar:]
>>> at org.jboss.as.ee.concurrent.service.
>>> ElytronManagedThreadFactory$ElytronManagedThread.run(
>>> ElytronManagedThreadFactory.java:78)
>>>
>>> 2018-06-27 13:59:25,984-04 DEBUG [org.ovirt.engine.core.
>>> vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-80)
>>> [] FINISH, GetAllVmStatsVDSCommand, return: , log id: 56d99e77
>>> 2018-06-27 13:59:25,984-04 DEBUG [org.ovirt.engine.core.common.
>>> di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-80)
>>> [] method: runVdsCommand, params: [GetAllVmStats,
>>> VdsIdVDSCommandParametersBase:{hostId='d9094c95-3275-4616-b4c2-815e753bcfed'}],
>>> timeElapsed: 1497ms
>>> 2018-06-27 13:59:25,984-04 INFO [org.ovirt.engine.core.
>>> vdsbroker.monitoring.PollVmStatsRefresher] (EE-ManagedThreadFactory-engineScheduled-Thread-80)
>>> [] Failed to fetch vms info for host 'lago-basic-suite-master-host-1' -
>>> skipping VMs monitoring.
>>>
>>> vdsm:
>>>
>>> 2018-06-27 14:10:17,314-0400 INFO (jsonrpc/7) [virt.vm] (vmId='b8a11304-07e3-4e64-af35-7421be780d5b') Hotunplug NIC xml: <?xml version='1.0' encoding='utf-8'?><interface type="bridge"> <address bus="0x00" domain="0x0000" function="0x0" slot="0x0b" type="pci" /> <mac address="00:1a:4a:16:01:0e" /> <model type="virtio" /> <source bridge="network_1" /> <link state="up" /> <driver name="vhost" queues="" /> <alias name="ua-3c77476f-f194-476a-8412-d76a9e58d1f9" /></interface> (vm:3321)
>>> 2018-06-27 14:10:17,328-0400 ERROR (jsonrpc/7) [virt.vm] (vmId='b8a11304-07e3-4e64-af35-7421be780d5b') Hotunplug failed (vm:3353)
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3343, in hotunplugNic
>>>     self._dom.detachDevice(nicXml)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 99, in f
>>>     ret = attr(*args, **kwargs)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
>>>     ret = f(*args, **kwargs)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 93, in wrapper
>>>     return func(inst, *args, **kwargs)
>>>   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1177, in detachDevice
>>>     if ret == -1: raise libvirtError ('virDomainDetachDevice() failed', dom=self)
>>> libvirtError: 'queues' attribute must be positive number:
>>>
>>> 2018-06-27 14:10:17,345-0400 DEBUG (jsonrpc/7) [api] FINISH hotunplugNic response={'status': {'message': "'queues' attribute must be positive number: ", 'code': 50}} (api:136)
>>> 2018-06-27 14:10:17,346-0400 INFO (jsonrpc/7) [api.virt] FINISH hotunplugNic return={'status': {'message': "'queues' attribute must be positive number: ", 'code': 50}} from=::ffff:192.168.201.4,32976, flow_id=ecb6652, vmId=b8a11304-07e3-4e64-af35-7421be780d5b (api:53)
>>> 2018-06-27 14:10:17,346-0400 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call VM.hotunplugNic failed (error 50) in 0.07 seconds (__init__:311)
>>> 2018-06-27 14:10:19,244-0400 DEBUG (qgapoller/2) [vds] Not sending QEMU-GA command 'guest-get-users' to vm_id='b8a11304-07e3-4e64-af35-7421be780d5b', command is not supported (qemuguestagent:192)
>>> 2018-06-27 14:10:20,038-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmStats' in bridge with {} (__init__:328)
>>> 2018-06-27 14:10:20,038-0400 INFO (jsonrpc/1) [api.host] START getAllVmStats() from=::1,48032 (api:47)
>>> 2018-06-27 14:10:20,041-0400 INFO (jsonrpc/1) [api.host] FINISH getAllVmStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': (suppressed)} from=::1,48032 (api:53)
>>> 2018-06-27 14:10:20,043-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmStats' in bridge with (suppressed) (__init__:355)
>>> 2018-06-27 14:10:20,043-0400 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:311)
>>> 2018-06-27 14:10:20,057-0400 DEBUG (jsonrpc/6) [jsonrpc.JsonRpcServer] Calling 'Host.getAllVmIoTunePolicies' in bridge with {} (__init__:328)
>>> 2018-06-27 14:10:20,058-0400 INFO (jsonrpc/6) [api.host] START getAllVmIoTunePolicies() from=::1,48032 (api:47)
>>> 2018-06-27 14:10:20,058-0400 INFO (jsonrpc/6) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'message': 'Done', 'code': 0}, 'io_tune_policies_dict': {'b8a11304-07e3-4e64-af35-7421be780d5b': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/cf23ceeb-81a3-4714-85a0-c6ddd1e024da/images/650fe4ae-47a1-4f2d-9cba-1617a8c868c3/03e75c3c-24e7-4e68-a6f1-21728aaaa73e', 'name': 'vda'}]}}} from=::1,48032 (api:53)
>>> 2018-06-27 14:10:20,059-0400 DEBUG (jsonrpc/6) [jsonrpc.JsonRpcServer] Return 'Host.getAllVmIoTunePolicies' in bridge with {'b8a11304-07e3-4e64-af35-7421be780d5b': {'policy': [], 'current_values': [{'ioTune': {'write_bytes_sec': 0L, 'total_iops_sec': 0L, 'read_iops_sec': 0L, 'read_bytes_sec': 0L, 'write_iops_sec': 0L, 'total_bytes_sec': 0L}, 'path': '/rhev/data-center/mnt/blockSD/cf23ceeb-81a3-4714-85a0-c6ddd1e024da/images/650fe4ae-47a1-4f2d-9cba-1617a8c868c3/03e75c3c-24e7-4e68-a6f1-21728aaaa73e', 'name': 'vda'}]}} (__init__:355)
>>>
>>> </error>
>>>
>>
>>