VM Unknown Status

One of our nodes suddenly became non-responsive and some VMs are stuck in Unknown status. I am trying to change their status but I am unable to log in to the database: after "su - postgres", running "psql engine" fails with a "psql: command not found" error. Can someone help me resolve this? Thanks, Ankit Sharma
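A possible explanation, offered as an assumption rather than a confirmed diagnosis: on an EL7-based oVirt 4.3 engine (the vdsm-4.30 packages mentioned later in this thread suggest that release), PostgreSQL is typically shipped as a software collection, so psql is not on the default PATH. The sketch below only prints a command to try. The helper name engine_psql_command is hypothetical, and the collection name rh-postgresql10 is a guess that should be checked with "scl --list" on the engine machine.

```shell
#!/bin/sh
# Hypothetical helper: print a command that opens the "engine" database.
# Assumption: when psql is not on PATH, PostgreSQL comes from a software
# collection. "rh-postgresql10" is a guess; verify it with: scl --list
engine_psql_command() {
    collection="${1:-rh-postgresql10}"
    if command -v psql >/dev/null 2>&1; then
        # psql is directly available, no collection needed
        echo "psql engine"
    else
        # run psql inside the software collection environment
        echo "scl enable ${collection} -- psql engine"
    fi
}

# Print only; nothing is executed against the database here.
engine_psql_command
```

The printed command would then be run as the postgres user, for example: su - postgres -c "scl enable rh-postgresql10 -- psql engine".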

Hi, here are my instructions: https://www.youtube.com/watch?v=vm55caHxRj8
In reply to the message from ankit--- via Users <users@ovirt.org> of 10 Jan 2024, 19:31.
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/CROU2NZCAILXRACSDFHFSPNK4ARB3H2B/

Hi Andrei,
Many thanks. I was able to change the status of the Unknown VMs to Down, and after that I could move them to another node.
1. The non-responsive node still cannot be put into Maintenance mode. It gives the error "Error while executing action: Cannot switch Host to Maintenance mode. Host still has running VMs on it and is in Non-Responsive state."
2. nodectl check
Status: OK
Bootloader ... OK
Layer boot entries ... OK
Valid boot entries ... OK
Mount points ... OK
Separate /var ... OK
Discard is used ... OK
Basic storage ... OK
Initialized VG ... OK
Initialized Thin Pool ... OK
Initialized LVs ... OK
Thin storage ... OK
Checking available space in thinpool ... OK
Checking thinpool auto-extend ... OK
vdsmd ... OK
3. service vdsmd status
Redirecting to /bin/systemctl status vdsmd.service
Unit vdsmd.service could not be found.
4. I compared the vdsm packages installed on a working node and on the non-working node and found that the following packages are somehow missing:
vdsm-4.30.46-1.el7.x86_64
vdsm-gluster-4.30.46-1.el7.x86_64
vdsm-hook-ethtool-options-4.30.46-1.el7.noarch
vdsm-hook-vmfex-dev-4.30.46-1.el7.noarch
vdsm-hook-fcoe-4.30.46-1.el7.noarch
5. When I try to install the package "vdsm-4.30.46-1.el7.x86_64", I get the error "Error: Package: vdsm-4.30.46-1.el7.x86_64 (/vdsm-4.30.46-1.el7.x86_64) Requires: vdsm-hook-vmfex-dev = 4.30.46-1.el7", and when I try to install vdsm-hook-vmfex-dev = 4.30.46-1.el7, I get the error "Error: Package: vdsm-hook-fcoe-4.30.46-1.el7.noarch (/vdsm-hook-fcoe-4.30.46-1.el7.noarch) Requires: vdsm = 4.30.46-1.el7".
So, how can I either reinstall vdsm, or remove the node and add it back successfully?
Regards, Ankit Sharma
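The two errors in point 5 describe a circular dependency: vdsm requires its hook packages, and the hooks require vdsm. One common way around this with yum is to name all of the packages in a single transaction, so the dependencies can be resolved together. The following is only a sketch under that assumption. It uses the package names listed in point 4, the helper name build_install_command is hypothetical, and it prints the command instead of running it. If the packages are installed from downloaded files, the .rpm paths would be listed in the same command instead, and further dependencies from the repositories may still be needed.

```shell
#!/bin/sh
# Sketch: build one yum command that installs all interdependent vdsm
# packages in a single transaction. Package names are the ones reported
# missing in the message above; the version is taken from the same list.
VERSION="4.30.46-1.el7"

build_install_command() {
    pkgs=""
    # architecture-specific packages
    for name in vdsm vdsm-gluster; do
        pkgs="$pkgs ${name}-${VERSION}.x86_64"
    done
    # noarch hook packages
    for name in vdsm-hook-ethtool-options vdsm-hook-vmfex-dev vdsm-hook-fcoe; do
        pkgs="$pkgs ${name}-${VERSION}.noarch"
    done
    echo "yum install$pkgs"
}

# Print only; review the command before running it as root on the host.
build_install_command
```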

Hi, can you just restart the node? Did you manually delete some packages on the problematic node? If so, that was not a good idea.
In reply to the message from ankit--- via Users <users@ovirt.org> of 11 Jan 2024, 12:38.

Hi Andrei,
FYI, this node rebooted by itself on 4 Jan 2024.
1. Yes, I have already restarted the node.
2. I tried to remove SNMP because it was timing out intermittently. I then noticed a vdsm warning, compared the installed packages with those on a working node, and found that some packages were missing. After the reboot, however, the vdsm status is OK.
3. What is the table name for hosts (nodes) in the database? The node is in Non-Responsive state, and when I try to put it into Maintenance mode I get an error saying that some VMs are still running on this host. Host > Virtual Machines shows 7 VMs, but no VM is actually running on the host. So I tried to change the status in the database, but I am unable to find the table where I need to make the update. Can you please suggest where I need to update the Host > Virtual Machines status?
Regards, Ankit Sharma
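The package comparison described in point 2 can be sketched as follows, assuming the package list of each node has been saved to a text file with one package per line (for example with rpm -qa 'vdsm*' redirected to a file on each node). The function name missing_packages and the file names in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch: print packages that appear in the working node's list ($1)
# but not in the problem node's list ($2). Each input file holds one
# package name per line.
missing_packages() {
    good_sorted=$(mktemp) || return 1
    bad_sorted=$(mktemp) || return 1
    # comm needs both inputs sorted the same way
    sort "$1" > "$good_sorted"
    sort "$2" > "$bad_sorted"
    # -23 suppresses lines unique to the second file and lines common
    # to both, leaving only lines unique to the first file
    comm -23 "$good_sorted" "$bad_sorted"
    rm -f "$good_sorted" "$bad_sorted"
}
```

Usage would look like: missing_packages working-node.txt problem-node.txt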

Hi, you will have to ask Sandro or Arik on this list; I don't know. PS: Can you just shut down this node, force remove it, and reinstall it?
In reply to the message from ankit--- via Users <users@ovirt.org> of 12 Jan 2024, 11:17.

Hi Andrei, I found the host table and changed the VM count there: select vm_count from vds_dynamic; Let me try to reinstall the node, and I will get back to you with my findings. Regards, Ankit Sharma
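For readers who want to see what such a change might look like, here is a hedged sketch that only generates SQL for review. The table vds_dynamic and the column vm_count come from the message above. The vds_static table, the vds_name and vds_id columns, and the target value 0 are assumptions about the engine schema and about what was changed, and the helper name host_vm_count_sql is hypothetical. Editing the engine database directly is risky, so taking an engine backup first would be prudent.

```shell
#!/bin/sh
# Sketch: print SQL that inspects and resets the VM counter of one host.
# Assumptions (not confirmed by the thread): vds_static holds the host
# name in vds_name, and both tables share the vds_id key.
# The host name is inserted verbatim, so pass only a trusted value.
host_vm_count_sql() {
    host="$1"
    cat <<EOF
BEGIN;
-- inspect the current value first
SELECT s.vds_name, d.vm_count
  FROM vds_static s
  JOIN vds_dynamic d ON d.vds_id = s.vds_id
 WHERE s.vds_name = '${host}';
-- reset the stale counter for this host only
UPDATE vds_dynamic
   SET vm_count = 0
 WHERE vds_id = (SELECT vds_id FROM vds_static WHERE vds_name = '${host}');
COMMIT;
EOF
}
```

The output can be read through and then pasted into a psql session on the engine database. Nothing in the function connects to the database itself.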
Participants (2): Andrei Verovski, ankit@eurus.net