
Hello,
I'm configuring a virtual RHCS cluster and would like to use the fence_rhevm agent for STONITH.
The VMs composing the 4-node cluster run CentOS 7.4 with fence-agents-rhevm-4.0.11-66.el7_4.4.x86_64; I see that in 7.5 the agent is fence-agents-rhevm-4.0.11-86.el7_5.2.x86_64, but with no modification to the /usr/sbin/fence_rhevm Python script apart from the BUILD_DATE line.

Some questions:

- it seems it still uses API v3 even though it is deprecated; any particular reason? Is it possible to update to v4? Can I create an RFE Bugzilla?

This is in fact what I get in the engine events when the connection is registered:

Client from address "10.4.4.68" is using version 3 of the API, which has been deprecated since version 4.0 of the engine, and will no longer be supported starting with version 4.3. Make sure to update that client to use a supported versions of the API and the SDKs, before upgrading to version 4.3 of the engine. 7/7/18 4:56:11 PM
User fenceuser@internal-authz connecting from '10.4.4.68' using session 'OVgdzMofRFDS4ZKSdL83mRyGUFEdc+++onJHzGiAfpYuS07xa/EbBqFEPtztpwEeRzCn9mBOTGXE69rBbHlhXQ==' logged in. 7/7/18 4:56:11 PM

- UserRole ok

Back in April 2017, for version 4.1.1, I had problems and it seems I had to set superuser privileges for the "fencing user"; see the thread here:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FS5YFU5ZXIYDC5...
Now instead I only set UserRole for the defined "fenceuser" on the virtual cluster VMs and it works OK.

- for a VM with permissions already set up:
[root@cl1 ~]# fence_rhevm -a 10.4.192.49 -l "fenceuser@internal" -S /usr/local/bin/pwd_ovmgr01.sh -z --ssl-insecure -o status --shell-timeout=20 --power-wait=10 -n cl1
Status: ON

- for a VM still without permissions:
[root@cl1 ~]# fence_rhevm -a 10.4.192.49 -l "fenceuser@internal" -S /usr/local/bin/pwd_ovmgr01.sh -z --ssl-insecure -o status --shell-timeout=20 --power-wait=10 -n cl2
Failed: Unable to obtain correct plug status or plug is not available

- for a VM with permissions already set up, I'm able to power off / power on the VM.

Thanks,
Gianluca
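For reference, fence_rhevm's -S/--password-script option runs the given executable and reads the password from its standard output, so a minimal script along these lines is enough (a sketch with a placeholder value; the actual content of pwd_ovmgr01.sh is not shown in this thread):

#!/bin/bash
# Print the fencing user's password on stdout; fence_rhevm reads it from here.
# The value below is a placeholder, not the real password.
echo 'REPLACE_WITH_FENCEUSER_PASSWORD'

The power off / power on tests mentioned above would use the same flags as the status check, only with a different -o action (a sketch, assuming the same invocation style):

# Power off the VM backing cluster node cl1, then power it back on
fence_rhevm -a 10.4.192.49 -l "fenceuser@internal" -S /usr/local/bin/pwd_ovmgr01.sh -z --ssl-insecure -o off --shell-timeout=20 --power-wait=10 -n cl1
fence_rhevm -a 10.4.192.49 -l "fenceuser@internal" -S /usr/local/bin/pwd_ovmgr01.sh -z --ssl-insecure -o on --shell-timeout=20 --power-wait=10 -n cl1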

On Sat, Jul 7, 2018 at 5:14 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
Hello, I'm configuring a virtual RHCS cluster and would like to use the fence_rhevm agent for STONITH. The VMs composing the 4-node cluster run CentOS 7.4 with fence-agents-rhevm-4.0.11-66.el7_4.4.x86_64; I see that in 7.5 the agent is fence-agents-rhevm-4.0.11-86.el7_5.2.x86_64, but with no modification to the /usr/sbin/fence_rhevm Python script apart from the BUILD_DATE line.
Some questions:
- it seems it still uses API v3 even though it is deprecated; any particular reason? Is it possible to update to v4? Can I create an RFE Bugzilla?
We already have a bug for that: https://bugzilla.redhat.com/show_bug.cgi?id=1402862
Let's hope it will be delivered in CentOS 7.6.
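In the meantime, the v3 vs v4 difference is visible directly with curl: oVirt 4.x serves API v4 at /ovirt-engine/api and keeps a v3 compatibility endpoint at /ovirt-engine/api/v3, which is what the current agent talks to. A sketch using the engine address and user from the commands above (the password placeholder and the example query are assumptions):

# VM status lookup through API v4, the default endpoint on oVirt 4.x;
# -k mirrors the --ssl-insecure flag passed to fence_rhevm.
curl -k -u 'fenceuser@internal:PASSWORD' \
    'https://10.4.192.49/ovirt-engine/api/vms?search=name%3Dcl1'

# The same lookup through the deprecated v3 compatibility endpoint,
# which is what triggers the deprecation event quoted below.
curl -k -u 'fenceuser@internal:PASSWORD' \
    'https://10.4.192.49/ovirt-engine/api/v3/vms?search=name%3Dcl1'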
This is in fact what I get in the engine events when the connection is registered:
Client from address "10.4.4.68" is using version 3 of the API, which has been deprecated since version 4.0 of the engine, and will no longer be supported starting with version 4.3. Make sure to update that client to use a supported versions of the API and the SDKs, before upgrading to version 4.3 of the engine. 7/7/18 4:56:11 PM
We need to get this message fixed; we already know that API v3 will not be removed in oVirt 4.3. I've created https://bugzilla.redhat.com/1599054 to track that.
User fenceuser@internal-authz connecting from '10.4.4.68' using session 'OVgdzMofRFDS4ZKSdL83mRyGUFEdc+++onJHzGiAfpYuS07xa/EbBqFEPtztpwEeRzCn9mBOTGXE69rBbHlhXQ==' logged in. 7/7/18 4:56:11 PM
- UserRole ok
Back in April 2017, for version 4.1.1, I had problems and it seems I had to set superuser privileges for the "fencing user"; see the thread here:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FS5YFU5ZXIYDC5SWQY4MZS65UDKSX7JS/
Now instead I only set UserRole for the defined "fenceuser" on the virtual cluster VMs and it works OK.
- for a VM with permissions already set up:
[root@cl1 ~]# fence_rhevm -a 10.4.192.49 -l "fenceuser@internal" -S /usr/local/bin/pwd_ovmgr01.sh -z --ssl-insecure -o status --shell-timeout=20 --power-wait=10 -n cl1
Status: ON
- for a VM still without permissions:
[root@cl1 ~]# fence_rhevm -a 10.4.192.49 -l "fenceuser@internal" -S /usr/local/bin/pwd_ovmgr01.sh -z --ssl-insecure -o status --shell-timeout=20 --power-wait=10 -n cl2
Failed: Unable to obtain correct plug status or plug is not available
- for a VM with permissions already set up, I'm able to power off / power on the VM.
Eli, please take a look at the above; that might be the issue you saw with fence_rhevm.
Thanks, Gianluca
--
Martin Perina
Associate Manager, Software Engineering
Red Hat Czech s.r.o.

On Sun, Jul 8, 2018 at 1:16 PM, Martin Perina <mperina@redhat.com> wrote:

Hi, thanks for your previous answers.

- UserRole ok
Back in April 2017, for version 4.1.1, I had problems and it seems I had to set superuser privileges for the "fencing user"; see the thread here:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FS5YFU5ZXIYDC5SWQY4MZS65UDKSX7JS/
Now instead I only set UserRole for the defined "fenceuser" on the virtual cluster VMs and it works OK.
- for a VM with permissions already set up:
[root@cl1 ~]# fence_rhevm -a 10.4.192.49 -l "fenceuser@internal" -S /usr/local/bin/pwd_ovmgr01.sh -z --ssl-insecure -o status --shell-timeout=20 --power-wait=10 -n cl1
Status: ON
- for a VM still without permissions:
[root@cl1 ~]# fence_rhevm -a 10.4.192.49 -l "fenceuser@internal" -S /usr/local/bin/pwd_ovmgr01.sh -z --ssl-insecure -o status --shell-timeout=20 --power-wait=10 -n cl2
Failed: Unable to obtain correct plug status or plug is not available
- for a VM with permissions already set up, I'm able to power off / power on the VM.
Eli, please take a look at the above; that might be the issue you saw with fence_rhevm.
Perhaps I didn't explain clearly enough. My note was to confirm that in my current tests it is sufficient to create a user in the internal domain and assign it the UserRole permission on the cluster VMs to be able to use that user for fencing the VMs, while back in 4.1.1 I was forced to create a user with superuser rights. Or did you mean something else when asking for Eli's input?

In my case, to have the fencing agent work in my 4-virtual-node CentOS 7.4 cluster, I created the fence device this way:

pcs stonith create vmfence fence_rhevm pcmk_host_map="intracl1:cl1;intracl2:cl2;intracl3:cl3;intracl4:cl4" \
ipaddr=10.4.192.49 ssl=1 ssl_insecure=1 login="fenceuser@internal" passwd_script="/usr/local/bin/pwd_ovmgr01.sh" \
shell_timeout=10 power_wait=10

So that now I have:

[root@cl1 ~]# pcs stonith show vmfence
 Resource: vmfence (class=stonith type=fence_rhevm)
  Attributes: ipaddr=10.4.192.49 login=fenceuser@internal passwd_script=/usr/local/bin/pwd_ovmgr01.sh pcmk_host_map=intracl1:cl1;intracl2:cl2;intracl3:cl3;intracl4:cl4 power_wait=10 shell_timeout=10 ssl=1 ssl_insecure=1
  Operations: monitor interval=60s (vmfence-monitor-interval-60s)
[root@cl1 ~]#

I forced a panic on a node that was providing a cluster group of resources, and it was correctly fenced (powered off / powered on), with the service relocated to another node after the power-off action completed:

[root@cl1 ~]# echo 1 > /proc/sys/kernel/sysrq
[root@cl1 ~]# echo c > /proc/sysrq-trigger

This is the chain of events I see in the meantime. The VM cl1 is indeed powered off and then on, the service is relocated to cl2 in this case, and finally cl1 rejoins the cluster after the power-on phase completes. I don't know whether the error messages I see every time fencing takes action are related to more than one node trying to fence at the same time and conflicting, or something else...

Jul 8, 2018, 3:23:02 PM  User fenceuser@internal-authz connecting from '10.4.4.69' using session 'FC9UfrgCj9BM/CW5o6iymyMhqBXUDnNJWD20QGEqLccCMC/qYlsv4vC0SBRSlbNrtfdRCx2QmoipOdNk0UrsHQ==' logged in.
Jul 8, 2018, 3:22:37 PM  VM cl1 started on Host ov200
Jul 8, 2018, 3:22:21 PM  User fenceuser@internal-authz connecting from '10.4.4.63' using session '6nMvcZNYs+aRBxifeA2aBaMsWVMCehCx1LLQdV5AEyM/Zrx/YihERxfLPc2KZPrvivy86rS+ml1Ic6BqnIKBNw==' logged in.
Jul 8, 2018, 3:22:11 PM  VM cl1 was started by fenceuser@internal-authz (Host: ov200).
Jul 8, 2018, 3:22:10 PM  VM cl1 is down. Exit message: Admin shut down from the engine
Jul 8, 2018, 3:22:10 PM  User fenceuser@internal-authz connecting from '10.4.4.63' using session 'YCvbpVTy9fWAl+UB2g6hlJgqECvCWZYT0cvMxlgBTzcO2LosBh8oGPxsXBP/Y8TN0x7tYSfjxKr4al1g246nnA==' logged in.
Jul 8, 2018, 3:22:10 PM  Failed to power off VM cl1 (Host: ov200, User: fenceuser@internal-authz).
Jul 8, 2018, 3:22:10 PM  Failed to power off VM cl1 (Host: ov200, User: fenceuser@internal-authz).
Jul 8, 2018, 3:22:10 PM  VDSM ov200 command DestroyVDS failed: Virtual machine does not exist: {'vmId': u'ff45a524-3363-4266-89fe-dfdbb87a8256'}
Jul 8, 2018, 3:22:10 PM  VM cl1 powered off by fenceuser@internal-authz (Host: ov200).
Jul 8, 2018, 3:21:59 PM  User fenceuser@internal-authz connecting from '10.4.4.63' using session 'tVmcRoYhzWOMgh/1smPidV0wFme2f5jWx3wdWDlpG7xHeUhTu4QJJk+l7u8zW2S73LK3lzai/kPtreSanBmAIA==' logged in.
Jul 8, 2018, 3:21:48 PM  User fenceuser@internal-authz connecting from '10.4.4.69' using session 'iFBOa5vdfJBu4aqVdqJJbwm

BTW: the monitoring action every 60 seconds generates many lines in the events list. Is there perhaps a way to avoid this?

Gianluca
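Two related commands, as a sketch: a manual fencing test can be run with pcs against one of the node names from the pcmk_host_map, and on the side question about event noise, lengthening the stonith monitor interval should reduce the periodic fenceuser logins (assuming pcs on CentOS 7 accepts op updates on a stonith resource the same way it does for regular resources):

# Fence one node by hand to verify the vmfence device end to end
# (intracl2 is one of the node names from the pcmk_host_map above)
pcs stonith fence intracl2

# Lengthen the monitor interval so the engine records fewer login events
# (assumption: pcs applies op updates to stonith resources as it does to regular ones)
pcs resource update vmfence op monitor interval=600s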