On Tue, Dec 15, 2020 at 11:59 AM Martin Perina <mperina(a)redhat.com> wrote:
> Hi,
> could you please provide engine.log? And also vdsm.log from a host which
> was acting as a fence proxy?

On the proxy host (kvm1), vdsm.log shows the following:
2020-12-15 10:13:03,933+0000 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC
call Host.fenceNode failed (error 1) in 0.01 seconds (__init__:312)
2020-12-15 10:13:04,376+0000 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC
call Host.fenceNode failed (error 1) in 0.01 seconds (__init__:312)
2020-12-15 10:13:06,722+0000 INFO (jsonrpc/4) [api.host] FINISH getStats
return={'status': {'message': 'Done', 'code': 0},
'info': {'cpuStatistics':
{'1': {'cpuUser': '2.33', 'nodeIndex': 0,
'cpuSys': '1.13', 'cpuIdle':
'96.54'}, '0': {'cpuUser': '1.66', 'nodeIndex': 0,
'cpuSys': '0.47',
'cpuIdle': '97.87'}, '3': {'cpuUser': '0.73',
'nodeIndex': 0, 'cpuSys':
'0.60', 'cpuIdle': '98.67'}, '2': {'cpuUser':
'1.20', 'nodeIndex': 0,
'cpuSys': '0.40', 'cpuIdle': '98.40'}},
'numaNodeMemFree': {'0':
{'memPercent': 14, 'memFree': '8531'}}, 'memShared': 0,
'haScore': 3400,
'thpState': 'always', 'ksmMergeAcrossNodes': True,
'vmCount': 0, 'memUsed':
'8', 'storageDomains': {u'b4d25e5e-7806-464f-b2e1-4d4ab5a54dee':
{'code':
0, 'actual': True, 'version': 5, 'acquired': True,
'delay': '0.0027973',
'lastCheck': '2.7', 'valid': True},
u'dc4d507b-954f-4da6-bcc3-b4f2633d0fa1': {'code': 0, 'actual':
True,
'version': 5, 'acquired': True, 'delay': '0.00285824',
'lastCheck': '5.7',
'valid': True}}, 'incomingVmMigrations': 0, 'network':
{'ovirtmgmt':
{'rxErrors': '0', 'txErrors': '0', 'speed':
'1000', 'rxDropped': '149',
'name': 'ovirtmgmt', 'tx': '2980375', 'txDropped':
'0', 'duplex':
'unknown', 'sampleTime': 1608027186.703727, 'rx':
'27524740', 'state':
'up'}, 'lo': {'rxErrors': '0', 'txErrors':
'0', 'speed': '1000',
'rxDropped': '0', 'name': 'lo', 'tx':
'1085188922', 'txDropped': '0',
'duplex': 'unknown', 'sampleTime': 1608027186.703727,
'rx': '1085188922',
'state': 'up'}, 'ovs-system': {'rxErrors': '0',
'txErrors': '0', 'speed':
'1000', 'rxDropped': '0', 'name': 'ovs-system',
'tx': '0', 'txDropped':
'0', 'duplex': 'unknown', 'sampleTime': 1608027186.703727,
'rx': '0',
'state': 'down'}, ';vdsmdummy;': {'rxErrors': '0',
'txErrors': '0',
'speed': '1000', 'rxDropped': '0', 'name':
';vdsmdummy;', 'tx': '0',
'txDropped': '0', 'duplex': 'unknown',
'sampleTime': 1608027186.703727,
'rx': '0', 'state': 'down'}, 'br-int':
{'rxErrors': '0', 'txErrors': '0',
'speed': '1000', 'rxDropped': '0', 'name':
'br-int', 'tx': '0',
'txDropped': '0', 'duplex': 'unknown',
'sampleTime': 1608027186.703727,
'rx': '0', 'state': 'down'}, 'eth1':
{'rxErrors': '0', 'txErrors': '0',
'speed': '1000', 'rxDropped': '0', 'name':
'eth1', 'tx': '83685154',
'txDropped': '0', 'duplex': 'unknown',
'sampleTime': 1608027186.703727,
'rx': '300648288', 'state': 'up'}, 'eth0':
{'rxErrors': '0', 'txErrors':
'0', 'speed': '1000', 'rxDropped': '0',
'name': 'eth0', 'tx': '2980933',
'txDropped': '0', 'duplex': 'unknown',
'sampleTime': 1608027186.703727,
'rx': '28271472', 'state': 'up'}}, 'txDropped':
'149', 'anonHugePages':
'182', 'ksmPages': 100, 'elapsedTime': '5717.99',
'cpuLoad': '0.42',
'cpuSys': '0.63', 'diskStats': {'/var/log':
{'free': '16444'},
'/var/run/vdsm/': {'free': '4909'}, '/tmp':
{'free': '16444'}},
'cpuUserVdsmd': '1.33', 'netConfigDirty': 'False',
'memCommitted': 0,
'ksmState': False, 'vmMigrating': 0, 'ksmCpu': 0,
'memAvailable': 9402,
'bootTime': '1608021428', 'haStats': {'active': True,
'configured': True,
'score': 3400, 'localMaintenance': False, 'globalMaintenance':
True},
'momStatus': 'active', 'multipathHealth': {}, 'rxDropped':
'0',
'outgoingVmMigrations': 0, 'swapTotal': 6015, 'swapFree': 6015,
'hugepages': defaultdict(<type 'dict'>, {1048576:
{'resv_hugepages': 0,
'free_hugepages': 0, 'nr_overcommit_hugepages': 0,
'surplus_hugepages': 0,
'vm.free_hugepages': 0, 'nr_hugepages': 0,
'nr_hugepages_mempolicy': 0},
2048: {'resv_hugepages': 0, 'free_hugepages': 0,
'nr_overcommit_hugepages':
0, 'surplus_hugepages': 0, 'vm.free_hugepages': 0, 'nr_hugepages':
0,
'nr_hugepages_mempolicy': 0}}), 'dateTime': '2020-12-15T10:13:06
GMT',
'cpuUser': '1.50', 'memFree': 9146, 'cpuIdle':
'97.87', 'vmActive': 0,
'v2vJobs': {}, 'cpuSysVdsmd': '0.60'}} from=::1,55238 (api:54)
2020-12-15 10:13:07,093+0000 INFO (jsonrpc/1) [api] FINISH getStats
error=Virtual machine does not exist: {'vmId':
u'0167fedb-7445-46bb-a39d-ea4471c86bf4'} (api:129)
2020-12-15 10:13:07,094+0000 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC
call VM.getStats failed (error 1) in 0.00 seconds (__init__:312)
2020-12-15 10:13:07,631+0000 INFO (jsonrpc/3) [api.host] FINISH getStats
return={'status': {'message': 'Done', 'code': 0},
'info': {'cpuStatistics':
{'1': {'cpuUser': '2.33', 'nodeIndex': 0,
'cpuSys': '1.13', 'cpuIdle':
'96.54'}, '0': {'cpuUser': '1.66', 'nodeIndex': 0,
'cpuSys': '0.47',
'cpuIdle': '97.87'}, '3': {'cpuUser': '0.73',
'nodeIndex': 0, 'cpuSys':
'0.60', 'cpuIdle': '98.67'}, '2': {'cpuUser':
'1.20', 'nodeIndex': 0,
'cpuSys': '0.40', 'cpuIdle': '98.40'}},
'numaNodeMemFree': {'0':
{'memPercent': 14, 'memFree': '8531'}}, 'memShared': 0,
'haScore': 3400,
'thpState': 'always', 'ksmMergeAcrossNodes': True,
'vmCount': 0, 'memUsed':
'8', 'storageDomains': {u'b4d25e5e-7806-464f-b2e1-4d4ab5a54dee':
{'code':
0, 'actual': True, 'version': 5, 'acquired': True,
'delay': '0.0027973',
'lastCheck': '3.6', 'valid': True},
u'dc4d507b-954f-4da6-bcc3-b4f2633d0fa1': {'code': 0, 'actual':
True,
'version': 5, 'acquired': True, 'delay': '0.00285824',
'lastCheck': '6.6',
'valid': True}}, 'incomingVmMigrations': 0, 'network':
{'ovirtmgmt':
{'rxErrors': '0', 'txErrors': '0', 'speed':
'1000', 'rxDropped': '149',
'name': 'ovirtmgmt', 'tx': '2985005', 'txDropped':
'0', 'duplex':
'unknown', 'sampleTime': 1608027187.616894, 'rx':
'27525820', 'state':
'up'}, 'lo': {'rxErrors': '0', 'txErrors':
'0', 'speed': '1000',
'rxDropped': '0', 'name': 'lo', 'tx':
'1085195824', 'txDropped': '0',
'duplex': 'unknown', 'sampleTime': 1608027187.616894,
'rx': '1085195824',
'state': 'up'}, 'ovs-system': {'rxErrors': '0',
'txErrors': '0', 'speed':
'1000', 'rxDropped': '0', 'name': 'ovs-system',
'tx': '0', 'txDropped':
'0', 'duplex': 'unknown', 'sampleTime': 1608027187.616894,
'rx': '0',
'state': 'down'}, ';vdsmdummy;': {'rxErrors': '0',
'txErrors': '0',
'speed': '1000', 'rxDropped': '0', 'name':
';vdsmdummy;', 'tx': '0',
'txDropped': '0', 'duplex': 'unknown',
'sampleTime': 1608027187.616894,
'rx': '0', 'state': 'down'}, 'br-int':
{'rxErrors': '0', 'txErrors': '0',
'speed': '1000', 'rxDropped': '0', 'name':
'br-int', 'tx': '0',
'txDropped': '0', 'duplex': 'unknown',
'sampleTime': 1608027187.616894,
'rx': '0', 'state': 'down'}, 'eth1':
{'rxErrors': '0', 'txErrors': '0',
'speed': '1000', 'rxDropped': '0', 'name':
'eth1', 'tx': '83689498',
'txDropped': '0', 'duplex': 'unknown',
'sampleTime': 1608027187.616894,
'rx': '300653876', 'state': 'up'}, 'eth0':
{'rxErrors': '0', 'txErrors':
'0', 'speed': '1000', 'rxDropped': '0',
'name': 'eth0', 'tx': '2985215',
'txDropped': '0', 'duplex': 'unknown',
'sampleTime': 1608027187.616894,
'rx': '28272664', 'state': 'up'}}, 'txDropped':
'149', 'anonHugePages':
'182', 'ksmPages': 100, 'elapsedTime': '5718.91',
'cpuLoad': '0.42',
'cpuSys': '0.63', 'diskStats': {'/var/log':
{'free': '16444'},
'/var/run/vdsm/': {'free': '4909'}, '/tmp':
{'free': '16444'}},
'cpuUserVdsmd': '1.33', 'netConfigDirty': 'False',
'memCommitted': 0,
'ksmState': False, 'vmMigrating': 0, 'ksmCpu': 0,
'memAvailable': 9402,
'bootTime': '1608021428', 'haStats': {'active': True,
'configured': True,
'score': 3400, 'localMaintenance': False, 'globalMaintenance':
True},
'momStatus': 'active', 'multipathHealth': {}, 'rxDropped':
'0',
'outgoingVmMigrations': 0, 'swapTotal': 6015, 'swapFree': 6015,
'hugepages': defaultdict(<type 'dict'>, {1048576:
{'resv_hugepages': 0,
'free_hugepages': 0, 'nr_overcommit_hugepages': 0,
'surplus_hugepages': 0,
'vm.free_hugepages': 0, 'nr_hugepages': 0,
'nr_hugepages_mempolicy': 0},
2048: {'resv_hugepages': 0, 'free_hugepages': 0,
'nr_overcommit_hugepages':
0, 'surplus_hugepages': 0, 'vm.free_hugepages': 0, 'nr_hugepages':
0,
'nr_hugepages_mempolicy': 0}}), 'dateTime': '2020-12-15T10:13:07
GMT',
'cpuUser': '1.50', 'memFree': 9146, 'cpuIdle':
'97.87', 'vmActive': 0,
'v2vJobs': {}, 'cpuSysVdsmd': '0.60'}} from=::1,55238 (api:54)
On the engine, engine.log shows:
2020-12-15 10:09:57,393Z ERROR
[org.ovirt.engine.core.utils.pm.VdsFenceOptions] (default task-13)
[fa61ae72-bc0c-4487-aeec-2b847877b6b5] Cannot find fence agent named
'fence_xvm' in fence option mapping
2020-12-15 10:09:57,519Z WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-13) [fa61ae72-bc0c-4487-aeec-2b847877b6b5] EVENT_ID:
VDS_ALERT_FENCE_TEST_FAILED(9,001), Power Management test failed for Host
kvm0.lab.local.Internal JSON-RPC error
2020-12-15 10:09:57,519Z INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (default
task-13) [fa61ae72-bc0c-4487-aeec-2b847877b6b5] FINISH, FenceVdsVDSCommand,
return: FenceOperationResult:{status='ERROR', powerStatus='UNKNOWN',
message='Internal JSON-RPC error'}, log id: dc98f7c
2020-12-15 10:09:57,596Z WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-13) [fa61ae72-bc0c-4487-aeec-2b847877b6b5] EVENT_ID:
FENCE_OPERATION_USING_AGENT_AND_PROXY_FAILED(9,021), Execution of power
management status on Host kvm0.lab.local using Proxy Host kvm1.lab.local
and Fence Agent fence_xvm:225.0.0.12 failed.
2020-12-15 10:09:57,596Z WARN
[org.ovirt.engine.core.bll.pm.FenceAgentExecutor] (default task-13)
[fa61ae72-bc0c-4487-aeec-2b847877b6b5] Fence action failed using proxy host
'kvm1.lab.local', trying another proxy
2020-12-15 10:09:57,694Z ERROR
[org.ovirt.engine.core.bll.pm.FenceProxyLocator] (default task-13)
[fa61ae72-bc0c-4487-aeec-2b847877b6b5] Can not run fence action on host
'kvm0.lab.local', no suitable proxy host was found.
2020-12-15 10:09:57,695Z WARN
[org.ovirt.engine.core.bll.pm.FenceAgentExecutor] (default task-13)
[fa61ae72-bc0c-4487-aeec-2b847877b6b5] Failed to find another proxy to
re-run failed fence action, retrying with the same proxy 'kvm1.lab.local'
2020-12-15 10:09:57,695Z ERROR
[org.ovirt.engine.core.utils.pm.VdsFenceOptions] (default task-13)
[fa61ae72-bc0c-4487-aeec-2b847877b6b5] Cannot find fence agent named
'fence_xvm' in fence option mapping
2020-12-15 10:09:57,815Z WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-13) [fa61ae72-bc0c-4487-aeec-2b847877b6b5] EVENT_ID:
VDS_ALERT_FENCE_TEST_FAILED(9,001), Power Management test failed for Host
kvm0.lab.local.Internal JSON-RPC error
2020-12-15 10:09:57,816Z INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (default
task-13) [fa61ae72-bc0c-4487-aeec-2b847877b6b5] FINISH, FenceVdsVDSCommand,
return: FenceOperationResult:{status='ERROR', powerStatus='UNKNOWN',
message='Internal JSON-RPC error'}, log id: 4b58ec5e
2020-12-15 10:09:57,895Z WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-13) [fa61ae72-bc0c-4487-aeec-2b847877b6b5] EVENT_ID:
FENCE_OPERATION_USING_AGENT_AND_PROXY_FAILED(9,021), Execution of power
management status on Host kvm0.lab.local using Proxy Host kvm1.lab.local
and Fence Agent fence_xvm:225.0.0.12 failed.
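
For anyone reading along, here is a minimal sketch of how the wrapped engine.log excerpt above could be reduced to just the WARN/ERROR records for one correlation ID. It assumes every record starts with a timestamp of the form shown above and that the following lines are continuations of that record; the helper names `iter_records` and `problems` are made up for illustration and are not part of oVirt.

```python
import re

# Assumption: a record starts with a timestamp like "2020-12-15 10:09:57,393Z ERROR".
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}Z\s+(\w+)")


def iter_records(lines):
    """Join wrapped continuation lines into (level, text) records."""
    level, parts = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = RECORD_START.match(line)
        if match:
            # A new timestamp closes the previous record.
            if parts:
                yield level, " ".join(parts)
            level, parts = match.group(1), [line]
        elif parts:
            parts.append(line)
    if parts:
        yield level, " ".join(parts)


def problems(lines, correlation_id, levels=("WARN", "ERROR")):
    """Return records at the given levels that mention the correlation ID."""
    return [text for level, text in iter_records(lines)
            if level in levels and correlation_id in text]


# Two records copied from the excerpt above.
sample = """\
2020-12-15 10:09:57,393Z ERROR
[org.ovirt.engine.core.utils.pm.VdsFenceOptions] (default task-13)
[fa61ae72-bc0c-4487-aeec-2b847877b6b5] Cannot find fence agent named
'fence_xvm' in fence option mapping
2020-12-15 10:09:57,519Z INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (default
task-13) [fa61ae72-bc0c-4487-aeec-2b847877b6b5] FINISH, FenceVdsVDSCommand,
return: FenceOperationResult:{status='ERROR', powerStatus='UNKNOWN',
message='Internal JSON-RPC error'}, log id: dc98f7c
""".splitlines()

for text in problems(sample, "fa61ae72-bc0c-4487-aeec-2b847877b6b5"):
    print(text)
```

The filter goes by the log level at the start of each record, so the INFO record is skipped even though its payload contains status='ERROR'.
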
On the engine I set the fence agent mapping as below (and restarted the
ovirt-engine service):
engine-config -g CustomFenceAgentMapping
CustomFenceAgentMapping: fence_xvm=fence_xvm version: general
Let me know if you need more logs.
I am running oVirt 4.3.10.
Thanks,
Martin
On Tue, Dec 15, 2020 at 10:23 AM Alex K <rightkicktech(a)gmail.com> wrote:
>
>
> On Tue, Dec 15, 2020 at 11:07 AM Alex K <rightkicktech(a)gmail.com> wrote:
>
>>
>>
>> On Mon, Dec 14, 2020 at 8:59 PM Strahil Nikolov <hunter86_bg(a)yahoo.com>
>> wrote:
>>
>>> fence_xvm requires that a key be deployed on both the host and the VMs in
>>> order to succeed. What happens when you use the CLI on any of the VMs?
>>> Also, the VMs require an open TCP port to receive the necessary output
>>> of each request.
>>
>> I deployed keys on the physical host and the virtual hosts, as per
>> https://github.com/rightkick/Notes/blob/master/Ovirt-fence_xmv.md
>> I can get the VM status from the virtual hosts:
>>
>> [root@kvm1 cluster]# fence_xvm -a 225.0.0.12 -k
>> /etc/cluster/fence_xvm.key -H ovirt-node0 -o status
>> Status: ON
>> You have new mail in /var/spool/mail/root
>> [root@kvm1 cluster]# fence_xvm -a 225.0.0.12 -k
>> /etc/cluster/fence_xvm.key -H ovirt-node1 -o status
>> Status: ON
>>
>> kvm0 and kvm1 are the hostnames of the virtual hosts, while ovirt-node0
>> and ovirt-node1 are the domain names of the same virtual hosts as defined
>> in virsh.
>>
>
> I am also passing the port/domain option in the GUI, but judging from the
> logs it seems to be ignored, as the engine does not log it.
>
> [image: image.png]
> I also tried domain=ovirt-node0, with the same results.
>
>
>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Monday, 14 December 2020, 10:57:11 GMT+2, Alex K <
>>> rightkicktech(a)gmail.com> wrote:
>>>
>>>
>>>
>>>
>>>
>>> Hi friends,
>>>
>>> I was wondering what is needed to set up fence_xvm for power management
>>> in nested virtual environments for testing purposes.
>>>
>>> I followed the steps at:
>>> https://github.com/rightkick/Notes/blob/master/Ovirt-fence_xmv.md
>>>
>>> I also tried engine-config -s
>>> CustomFenceAgentMapping="fence_xvm=_fence_xvm"
>>> From the command line everything seems fine and I can get the status of
>>> the host VMs, but I could not find what is needed to set this up in the
>>> engine UI:
>>>
>>>
>>> For the username and password I just filled in dummy values, as they
>>> should not be needed for fence_xvm.
>>> I always get an error in the GUI, while the engine logs give:
>>>
>>>
>>> 2020-12-14 08:53:48,343Z WARN
>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>> (default task-4) [07c1d540-6d8d-419c-affb-181495d75759] EVENT_ID:
>>> VDS_ALERT_FENCE_TEST_FAILED(9,001), Power Management test failed for Host
>>> kvm0.lab.local.Internal JSON-RPC error
>>> 2020-12-14 08:53:48,343Z INFO
>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (default
>>> task-4) [07c1d540-6d8d-419c-affb-181495d75759] FINISH, FenceVdsVDSCommand,
>>> return: FenceOperationResult:{status='ERROR', powerStatus='UNKNOWN',
>>> message='Internal JSON-RPC error'}, log id: 2437b13c
>>> 2020-12-14 08:53:48,400Z WARN
>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>> (default task-4) [07c1d540-6d8d-419c-affb-181495d75759] EVENT_ID:
>>> FENCE_OPERATION_USING_AGENT_AND_PROXY_FAILED(9,021), Execution of power
>>> management status on Host kvm0.lab.local using Proxy Host kvm1.lab.local
>>> and Fence Agent fence_xvm:225.0.0.12 failed.
>>> 2020-12-14 08:53:48,400Z WARN
>>> [org.ovirt.engine.core.bll.pm.FenceAgentExecutor] (default task-4)
>>> [07c1d540-6d8d-419c-affb-181495d75759] Fence action failed using proxy host
>>> 'kvm1.lab.local', trying another proxy
>>> 2020-12-14 08:53:48,485Z ERROR
>>> [org.ovirt.engine.core.bll.pm.FenceProxyLocator]
>>> (default task-4) [07c1d540-6d8d-419c-affb-181495d75759] Can not run fence
>>> action on host 'kvm0.lab.local', no suitable proxy host was found.
>>> 2020-12-14 08:53:48,486Z WARN
>>> [org.ovirt.engine.core.bll.pm.FenceAgentExecutor] (default task-4)
>>> [07c1d540-6d8d-419c-affb-181495d75759] Failed to find another proxy to
>>> re-run failed fence action, retrying with the same proxy 'kvm1.lab.local'
>>> 2020-12-14 08:53:48,582Z WARN
>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>> (default task-4) [07c1d540-6d8d-419c-affb-181495d75759] EVENT_ID:
>>> VDS_ALERT_FENCE_TEST_FAILED(9,001), Power Management test failed for Host
>>> kvm0.lab.local.Internal JSON-RPC error
>>> 2020-12-14 08:53:48,582Z INFO
>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (default
>>> task-4) [07c1d540-6d8d-419c-affb-181495d75759] FINISH, FenceVdsVDSCommand,
>>> return: FenceOperationResult:{status='ERROR', powerStatus='UNKNOWN',
>>> message='Internal JSON-RPC error'}, log id: 8607bc9
>>> 2020-12-14 08:53:48,637Z WARN
>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>> (default task-4) [07c1d540-6d8d-419c-affb-181495d75759] EVENT_ID:
>>> FENCE_OPERATION_USING_AGENT_AND_PROXY_FAILED(9,021), Execution of power
>>> management status on Host kvm0.lab.local using Proxy Host kvm1.lab.local
>>> and Fence Agent fence_xvm:225.0.0.12 failed.
>>>
>>>
>>> Any idea?
>>>
>>> Thanx,
>>> Alex
>>>
>>>
>>> _______________________________________________
>>> Users mailing list -- users(a)ovirt.org
>>> To unsubscribe send an email to users-leave(a)ovirt.org
>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/B7IHC4MYY5L...
>>>
--
Martin Perina
Manager, Software Engineering
Red Hat Czech s.r.o.