Can't install oVirt from cockpit - ERROR: cannot retrieve iSCSI LUNs
by Patrick Lomakin
Latest version of oVirt - 4.4.3
Hi everybody. I have tried to install the hosted engine from the oVirt cockpit, but when I select the iSCSI target I always get the same error: "Error: Cannot retrieve iSCSI LUNs." After that I installed the hosted engine from the console with the command "hosted-engine --deploy", and it installed successfully. From the console I was able to retrieve the iSCSI LUNs and select one there. Can anyone else try to deploy oVirt from cockpit?
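For what it's worth, a quick sanity check that the host itself can reach and discover the targets outside of cockpit (a sketch; the portal address is a placeholder, and it assumes iscsiadm is installed on the host):

# confirm the host can discover targets on the iSCSI portal
iscsiadm -m discovery -t sendtargets -p 192.168.1.10:3260
# the hosted-engine deployment logs written by the cockpit wizard usually land here
ls -ltr /var/log/ovirt-hosted-engine-setup/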
4 years
Guest OS Memory Free/Cached/Buffered: Not Configured
by Stefan Seifried
Hi,
I'm quite new to oVirt, so my apologies if I'm asking something dead obvious:
I noticed that there is an item in the 'General' tab of each VM that says 'Guest OS Memory Free/Cached/Buffered', and on all my VMs it says 'Not Configured'. Right now I'm trying to figure out how to enable this feature. I assume it gives me the equivalent of running 'free' in a shell on a Linux guest.
Googling and digging around the VMM guide did not give me any pointers so far.
Thanks in advance,
Stefan
PS: A little background info: I have one 'client' who keeps nagging me to increase the RAM on his VM because it's constantly operating at 95% memory load (as shown on the VM dashboard). After a quick investigation with 'free' I could see that Linux had built up the disk cache to 12G (out of 16G total, no swapping occurred). My intention is to make the real memory load visible to him, since he already has access to the VM portal for shutdown/restart/etc.
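In case it is useful to others hitting the same question: as far as I understand, this field is populated from guest-agent data, so a first thing to check is whether the agent is installed and running inside the Linux guest (a sketch, assuming an EL-based guest where qemu-guest-agent provides the memory statistics):

# check that the guest agent is installed and running inside the VM
rpm -q qemu-guest-agent
systemctl status qemu-guest-agent
# compare with what the guest itself reports
free -h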
4 years
Re: Constantly XFS in memory corruption inside VMs
by Strahil Nikolov
Are you using the "nobarrier" mount option in the VMs?
If yes, can you try removing the "nobarrier" option?
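A quick way to check this from inside the guest (a sketch):

# list XFS mounts and their options inside the VM; look for "nobarrier"
findmnt -t xfs -o TARGET,OPTIONS
grep nobarrier /proc/mounts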
Best Regards,
Strahil Nikolov
On Saturday, November 28, 2020, 19:25:48 GMT+2, Vinícius Ferrão <ferrao(a)versatushpc.com.br> wrote:
Hi Strahil,
I moved a running VM to another host, rebooted it, and no corruption was found. If there's any corruption it may be silent corruption... I've had cases where the VM was new, just installed; I ran dnf -y update to get the updated packages, rebooted, and boom, XFS corruption. So perhaps the migration process isn't the one to blame.
That said, I do remember a VM going down during a migration and being corrupted when I rebooted it. But that may not be related; it was perhaps already in an inconsistent state.
Anyway, here are the mount options:
Host1:
192.168.10.14:/mnt/pool0/ovirt/vm on /rhev/data-center/mnt/192.168.10.14:_mnt_pool0_ovirt_vm type nfs4 (rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,soft,nosharecache,proto=tcp,timeo=100,retrans=3,sec=sys,clientaddr=192.168.10.1,local_lock=none,addr=192.168.10.14)
Host2:
192.168.10.14:/mnt/pool0/ovirt/vm on /rhev/data-center/mnt/192.168.10.14:_mnt_pool0_ovirt_vm type nfs4 (rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,soft,nosharecache,proto=tcp,timeo=100,retrans=3,sec=sys,clientaddr=192.168.10.1,local_lock=none,addr=192.168.10.14)
The options are the default ones. I haven't changed anything when configuring this cluster.
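For reference, the effective NFS options per storage domain can be dumped directly on each host (a sketch):

# list the NFS mounts oVirt created on this host and their effective options
findmnt -t nfs4 -o SOURCE,TARGET,OPTIONS
nfsstat -m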
Thanks.
-----Original Message-----
From: Strahil Nikolov <hunter86_bg(a)yahoo.com>
Sent: Saturday, November 28, 2020 1:54 PM
To: users <users(a)ovirt.org>; Vinícius Ferrão <ferrao(a)versatushpc.com.br>
Subject: Re: [ovirt-users] Constantly XFS in memory corruption inside VMs
Can you check with a test VM whether this happens after a Virtual Machine migration?
What are your mount options for the storage domain?
Best Regards,
Strahil Nikolov
On Saturday, November 28, 2020, 18:25:15 GMT+2, Vinícius Ferrão via Users <users(a)ovirt.org> wrote:
Hello,
I'm trying to discover why an oVirt 4.4.3 cluster with two hosts and NFS shared storage on TrueNAS 12.0 is constantly getting XFS corruption inside the VMs.
For seemingly random reasons VMs get corrupted, sometimes halting, sometimes corrupting silently, and after a reboot the system is unable to boot due to "corruption of in-memory data detected". Sometimes the corrupted data is all zeroes, sometimes there's data there. In extreme cases XFS superblock 0 gets corrupted and the system cannot even detect an XFS partition anymore, since the XFS magic number on the first blocks of the virtual disk is corrupted.
This has been happening for a month now. We had to roll back some backups, and I no longer trust the state of the VMs.
Using xfs_db I can see that some VMs have corrupted superblocks even while the VM is up. One in particular had sb0 corrupted, so I knew the machine would be gone as soon as a reboot kicked in, and that's exactly what happened.
Another day I was just installing a new CentOS 8 VM, and after running dnf -y update and rebooting, the VM was corrupted and needed an XFS repair. That was an extreme case.
So I've looked at the TrueNAS logs, and there's apparently nothing wrong with the system. No errors logged in dmesg, nothing in /var/log/messages and no errors on the zpools, not even after scrub operations. On the switch, a Catalyst 2960X, we've been monitoring all of its interfaces. There are no "up and down" events and zero errors on any interface (we have a 4x port LACP on the TrueNAS side and a 2x port LACP on each host), so everything seems to be fine. The only metric I was unable to get is dropped packets, but I don't know whether that could be an issue or not.
Finally, on oVirt, I can't find anything either. I looked in /var/log/messages and /var/log/sanlock.log but found nothing suspicious.
Is anyone out there experiencing this? Our VMs are mainly CentOS 7/8 with XFS; there are 3 Windows VMs that do not seem to be affected, but everything else is affected.
Thanks all.
4 years
Re: Constantly XFS in memory corruption inside VMs
by Strahil Nikolov
Can you check with a test VM whether this happens after a Virtual Machine migration?
What are your mount options for the storage domain?
Best Regards,
Strahil Nikolov
On Saturday, November 28, 2020, 18:25:15 GMT+2, Vinícius Ferrão via Users <users(a)ovirt.org> wrote:
Hello,
I'm trying to discover why an oVirt 4.4.3 cluster with two hosts and NFS shared storage on TrueNAS 12.0 is constantly getting XFS corruption inside the VMs.
For seemingly random reasons VMs get corrupted, sometimes halting, sometimes corrupting silently, and after a reboot the system is unable to boot due to "corruption of in-memory data detected". Sometimes the corrupted data is all zeroes, sometimes there's data there. In extreme cases XFS superblock 0 gets corrupted and the system cannot even detect an XFS partition anymore, since the XFS magic number on the first blocks of the virtual disk is corrupted.
This has been happening for a month now. We had to roll back some backups, and I no longer trust the state of the VMs.
Using xfs_db I can see that some VMs have corrupted superblocks even while the VM is up. One in particular had sb0 corrupted, so I knew the machine would be gone as soon as a reboot kicked in, and that's exactly what happened.
Another day I was just installing a new CentOS 8 VM, and after running dnf -y update and rebooting, the VM was corrupted and needed an XFS repair. That was an extreme case.
So I've looked at the TrueNAS logs, and there's apparently nothing wrong with the system. No errors logged in dmesg, nothing in /var/log/messages and no errors on the zpools, not even after scrub operations. On the switch, a Catalyst 2960X, we've been monitoring all of its interfaces. There are no "up and down" events and zero errors on any interface (we have a 4x port LACP on the TrueNAS side and a 2x port LACP on each host), so everything seems to be fine. The only metric I was unable to get is dropped packets, but I don't know whether that could be an issue or not.
Finally, on oVirt, I can't find anything either. I looked in /var/log/messages and /var/log/sanlock.log but found nothing suspicious.
Is anyone out there experiencing this? Our VMs are mainly CentOS 7/8 with XFS; there are 3 Windows VMs that do not seem to be affected, but everything else is affected.
Thanks all.
4 years
Constantly XFS in memory corruption inside VMs
by Vinícius Ferrão
Hello,
I'm trying to discover why an oVirt 4.4.3 cluster with two hosts and NFS shared storage on TrueNAS 12.0 is constantly getting XFS corruption inside the VMs.
For seemingly random reasons VMs get corrupted, sometimes halting, sometimes corrupting silently, and after a reboot the system is unable to boot due to "corruption of in-memory data detected". Sometimes the corrupted data is all zeroes, sometimes there's data there. In extreme cases XFS superblock 0 gets corrupted and the system cannot even detect an XFS partition anymore, since the XFS magic number on the first blocks of the virtual disk is corrupted.
This has been happening for a month now. We had to roll back some backups, and I no longer trust the state of the VMs.
Using xfs_db I can see that some VMs have corrupted superblocks even while the VM is up. One in particular had sb0 corrupted, so I knew the machine would be gone as soon as a reboot kicked in, and that's exactly what happened.
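For anyone who wants to reproduce the check, something along these lines can be used to inspect the primary superblock and run a read-only consistency pass (a sketch; the device path is a placeholder, and xfs_repair -n expects the filesystem to be unmounted):

# print superblock 0 of the guest's XFS partition (read-only)
xfs_db -r -c "sb 0" -c "p" /dev/vdb1
# dry run: report problems without modifying anything
xfs_repair -n /dev/vdb1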
Another day I was just installing a new CentOS 8 VM, and after running dnf -y update and rebooting, the VM was corrupted and needed an XFS repair. That was an extreme case.
So I've looked at the TrueNAS logs, and there's apparently nothing wrong with the system. No errors logged in dmesg, nothing in /var/log/messages and no errors on the zpools, not even after scrub operations. On the switch, a Catalyst 2960X, we've been monitoring all of its interfaces. There are no "up and down" events and zero errors on any interface (we have a 4x port LACP on the TrueNAS side and a 2x port LACP on each host), so everything seems to be fine. The only metric I was unable to get is dropped packets, but I don't know whether that could be an issue or not.
Finally, on oVirt, I can't find anything either. I looked in /var/log/messages and /var/log/sanlock.log but found nothing suspicious.
Is anyone out there experiencing this? Our VMs are mainly CentOS 7/8 with XFS; there are 3 Windows VMs that do not seem to be affected, but everything else is affected.
Thanks all.
4 years
oVirt 4.4.3 - Unable to start hosted engine
by Marco Marino
Hi,
I have an oVirt 4.4.3 setup with 2 clusters, a hosted engine and iSCSI storage.
The first cluster, composed of 2 servers (host1 and host2), is dedicated to the hosted engine; the second cluster is for VMs. Furthermore, there is a SAN with 3 LUNs: one for the hosted engine storage, one for VMs and one unused. My SAN is built on top of a pacemaker/drbd cluster with 2 nodes, with a virtual IP used as the iSCSI portal IP. Starting today, after a failover of the iSCSI cluster, I'm unable to start the hosted engine. It seems there is some problem with the storage.
Currently I have only one node (host1) running in the cluster. There may be some lock on the LVs, but I'm not sure of this (a few read-only checks are sketched after the logs below).
Here are some details about the problem:
1. iscsiadm -m session
iSCSI Transport Class version 2.0-870
version 6.2.0.878-2
Target: iqn.2003-01.org.linux-iscsi.s1-node1.x8664:sn.2a734f67d5b1
(non-flash)
Current Portal: 10.3.8.8:3260,1
Persistent Portal: 10.3.8.8:3260,1
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:4b668221d9a9
Iface IPaddress: 10.3.8.10
Iface HWaddress: default
Iface Netdev: default
SID: 1
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
*********
Timeouts:
*********
Recovery Timeout: 5
Target Reset Timeout: 30
LUN Reset Timeout: 30
Abort Timeout: 15
*****
CHAP:
*****
username: <empty>
password: ********
username_in: <empty>
password_in: ********
************************
Negotiated iSCSI params:
************************
HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 262144
MaxXmitDataSegmentLength: 262144
FirstBurstLength: 65536
MaxBurstLength: 262144
ImmediateData: Yes
InitialR2T: Yes
MaxOutstandingR2T: 1
************************
Attached SCSI devices:
************************
Host Number: 7 State: running
scsi7 Channel 00 Id 0 Lun: 0
Attached scsi disk sdb State: running
scsi7 Channel 00 Id 0 Lun: 1
Attached scsi disk sdc State: running
2. vdsm.log errors:
2020-11-27 18:37:16,786+0100 INFO (jsonrpc/0) [api] FINISH getStats
error=Virtual machine does not exist: {'vmId':
'f3a1194d-0632-43c6-8e12-7f22518cff87'} (api:129)
.....
2020-11-27 18:37:52,864+0100 INFO (jsonrpc/4) [vdsm.api] FINISH
getVolumeInfo error=(-223, 'Sanlock resource read failure', 'Lease does not
exist on storage') from=::1,60880,
task_id=138a3615-d537-4e5f-a39c-335269ad0917 (api:52)
2020-11-27 18:37:52,864+0100 ERROR (jsonrpc/4) [storage.TaskManager.Task]
(Task='138a3615-d537-4e5f-a39c-335269ad0917') Unexpected error (task:880)
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 887,
in _run
return fn(*args, **kargs)
File "<decorator-gen-159>", line 2, in getVolumeInfo
File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in
method
ret = func(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 3142,
in getVolumeInfo
info = self._produce_volume(sdUUID, imgUUID, volUUID).getInfo()
File "/usr/lib/python3.6/site-packages/vdsm/storage/volume.py", line 258,
in getInfo
leasestatus = self.getLeaseStatus()
File "/usr/lib/python3.6/site-packages/vdsm/storage/volume.py", line 203,
in getLeaseStatus
self.volUUID)
File "/usr/lib/python3.6/site-packages/vdsm/storage/sd.py", line 549, in
inquireVolumeLease
return self._domainLock.inquire(lease)
File "/usr/lib/python3.6/site-packages/vdsm/storage/clusterlock.py", line
464, in inquire
sector=self._block_size)
sanlock.SanlockException: (-223, 'Sanlock resource read failure', 'Lease
does not exist on storage')
2020-11-27 18:37:52,865+0100 INFO (jsonrpc/4) [storage.TaskManager.Task]
(Task='138a3615-d537-4e5f-a39c-335269ad0917') aborting: Task is aborted:
"value=(-223, 'Sanlock resource read failure', 'Lease does not exist on
storage') abortedcode=100" (task:1190)
3. supervdsm.log
MainProcess|monitor/de4645f::DEBUG::2020-11-27
18:41:25,286::commands::153::common.commands::(start) /usr/bin/taskset
--cpu-list 0-11 /usr/sbin/dmsetup remove
de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9
(cwd None)
MainProcess|monitor/de4645f::DEBUG::2020-11-27
18:41:25,293::commands::98::common.commands::(run) FAILED: <err> =
b'device-mapper: remove ioctl on
de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9
failed: Device or resource busy\nCommand failed.\n'; <rc> = 1
MainProcess|monitor/de4645f::ERROR::2020-11-27
18:41:25,294::supervdsm_server::97::SuperVdsm.ServerCallback::(wrapper)
Error in devicemapper_removeMapping
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/storage/devicemapper.py",
line 141, in removeMapping
commands.run(cmd)
File "/usr/lib/python3.6/site-packages/vdsm/common/commands.py", line
101, in run
raise cmdutils.Error(args, p.returncode, out, err)
vdsm.common.cmdutils.Error: Command ['/usr/sbin/dmsetup', 'remove',
'de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9']
failed with rc=1 out=b'' err=b'device-mapper: remove ioctl on
de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9
failed: Device or resource busy\nCommand failed.\n'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/supervdsm_server.py", line
95, in wrapper
res = func(*args, **kwargs)
File
"/usr/lib/python3.6/site-packages/vdsm/supervdsm_api/devicemapper.py", line
29, in devicemapper_removeMapping
return devicemapper.removeMapping(deviceName)
File "/usr/lib/python3.6/site-packages/vdsm/storage/devicemapper.py",
line 143, in removeMapping
raise Error("Could not remove mapping: {}".format(e))
vdsm.storage.devicemapper.Error: Could not remove mapping: Command
['/usr/sbin/dmsetup', 'remove',
'de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9']
failed with rc=1 out=b'' err=b'device-mapper: remove ioctl on
de4645fc--f379--4837--916b--a0c2b89927d9-dfa4e933--2b9c--4057--a4c5--aa4485b070e9
failed: Device or resource busy\nCommand failed.\n'
4. /var/log/messages:
...
Nov 27 18:43:46 host1 journal[20573]: ovirt-ha-agent
ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to
start necessary monitors
Nov 27 18:43:46 host1 journal[20573]: ovirt-ha-agent
ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call
last):#012 File
"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
line 85, in start_monitor#012 response = self._proxy.start_monitor(type,
options)#012 File "/usr/lib64/python3.6/xmlrpc/client.py", line 1112, in
__call__#012 return self.__send(self.__name, args)#012 File
"/usr/lib64/python3.6/xmlrpc/client.py", line 1452, in __request#012
verbose=self.__verbose#012 File "/usr/lib64/python3.6/xmlrpc/client.py",
line 1154, in request#012 return self.single_request(host, handler,
request_body, verbose)#012 File "/usr/lib64/python3.6/xmlrpc/client.py",
line 1166, in single_request#012 http_conn = self.send_request(host,
handler, request_body, verbose)#012 File
"/usr/lib64/python3.6/xmlrpc/client.py", line 1279, in send_request#012
self.send_content(connection, request_body)#012 File
"/usr/lib64/python3.6/xmlrpc/client.py", line 1309, in send_content#012
connection.endheaders(request_body)#012 File
"/usr/lib64/python3.6/http/client.py", line 1249, in endheaders#012
self._send_output(message_body, encode_chunked=encode_chunked)#012 File
"/usr/lib64/python3.6/http/client.py", line 1036, in _send_output#012
self.send(msg)#012 File "/usr/lib64/python3.6/http/client.py", line 974,
in send#012 self.connect()#012 File
"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py",
line 74, in connect#012
self.sock.connect(base64.b16decode(self.host))#012FileNotFoundError:
[Errno 2] No such file or directory#012#012During handling of the above
exception, another exception occurred:#012#012Traceback (most recent call
last):#012 File
"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
line 131, in _run_agent#012 return action(he)#012 File
"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
line 55, in action_proper#012 return he.start_monitoring()#012 File
"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
line 437, in start_monitoring#012 self._initialize_broker()#012 File
"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
line 561, in _initialize_broker#012 m.get('options', {}))#012 File
"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
line 91, in start_monitor#012 ).format(t=type, o=options,
e=e)#012ovirt_hosted_engine_ha.lib.exceptions.RequestError: brokerlink -
failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or
directory, [monitor: 'network', options: {'addr': '10.3.7.1',
'network_test': 'dns', 'tcp_t_address': '', 'tcp_t_port': ''}]
Nov 27 18:43:46 host1 journal[20573]: ovirt-ha-agent
ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
Nov 27 18:43:46 host1 systemd[1]: ovirt-ha-agent.service: Main process
exited, code=exited, status=157/n/a
Nov 27 18:43:46 host1 systemd[1]: ovirt-ha-agent.service: Failed with
result 'exit-code'.
....
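A few read-only commands that might help confirm whether sanlock or LVM locking is actually involved (a sketch; the VG name is a placeholder for the storage domain's volume group):

# current sanlock lockspaces and resources on the host
sanlock client status
# hosted-engine's own view of the storage and lockspace
hosted-engine --vm-status
# LVs of the hosted-engine storage domain, with attributes and tags
lvs -o +lv_tags <storage-domain-vg>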
Please let me know if other logs are needed.
Thank you
4 years
Can not connect to gluster storage
by Stefan Wolf
Hello,
I have a host that cannot connect to the gluster storage.
It has worked since I set up the environment, but today it stopped working.
These are the error messages in the web UI:
The error message for connection kvm380.durchhalten.intern:/data returned by VDSM was: Failed to fetch Gluster Volume List
Failed to connect Host kvm380.durchhalten.intern to the Storage Domains data.
Failed to connect Host kvm380.durchhalten.intern to the Storage Domains hosted_storage.
and here is the vdsm.log:
StorageDomainDoesNotExist: Storage domain does not exist: (u'36663740-576a-4498-b28e-0a402628c6a7',)
2020-11-27 12:59:07,665+0000 INFO (jsonrpc/2) [storage.TaskManager.Task] (Task='8bed48b8-0696-4d3f-966a-119219f3b013') aborting: Task is aborted: "Storage domain does not exist: (u'36663740-576a-4498-b28e-0a402628c6a7',)" - code 358 (task:1181)
2020-11-27 12:59:07,665+0000 ERROR (jsonrpc/2) [storage.Dispatcher] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'36663740-576a-4498-b28e-0a402628c6a7',) (dispatcher:83)
2020-11-27 12:59:07,666+0000 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call StorageDomain.getInfo failed (error 358) in 0.38 seconds (__init__:312)
2020-11-27 12:59:07,698+0000 INFO (jsonrpc/7) [vdsm.api] START connectStorageServer(domType=7, spUUID=u'00000000-0000-0000-0000-000000000000', conList=[{u'id': u'e29cf818-5ee5-46e1-85c1-8aeefa33e95d', u'vfs_type': u'glusterfs', u'connection': u'kvm380.durchhalten.intern:/engine', u'user': u'kvm'}], options=None) from=::1,40964, task_id=3a3eeb80-50ef-4710-a4f4-9d35da2ff281 (api:48)
2020-11-27 12:59:07,871+0000 ERROR (jsonrpc/7) [storage.HSM] Could not connect to storageServer (hsm:2420)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2417, in connectStorageServer
conObj.connect()
File "/usr/lib/python2.7/site-packages/vdsm/storage/storageServer.py", line 167, in connect
self.validate()
File "/usr/lib/python2.7/site-packages/vdsm/storage/storageServer.py", line 297, in validate
if not self.volinfo:
File "/usr/lib/python2.7/site-packages/vdsm/storage/storageServer.py", line 284, in volinfo
self._volinfo = self._get_gluster_volinfo()
File "/usr/lib/python2.7/site-packages/vdsm/storage/storageServer.py", line 329, in _get_gluster_volinfo
self._volfileserver)
File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 56, in __call__
return callMethod()
File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 54, in <lambda>
**kwargs)
File "<string>", line 2, in glusterVolumeInfo
File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
raise convert_to_error(kind, result)
GlusterVolumesListFailedException: Volume list failed: rc=30806 out=() err=['Volume does not exist']
2020-11-27 12:59:07,871+0000 INFO (jsonrpc/7) [vdsm.api] FINISH connectStorageServer return={'statuslist': [{'status': 4149, 'id': u'e29cf818-5ee5-46e1-85c1-8aeefa33e95d'}]} from=::1,40964, task_id=3a3eeb80-50ef-4710-a4f4-9d35da2ff281 (api:54)
2020-11-27 12:59:07,871+0000 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call StoragePool.connectStorageServer succeeded in 0.18 seconds (__init__:312)
2020-11-27 12:59:08,474+0000 INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:40966 (protocoldetector:61)
2020-11-27 12:59:08,484+0000 INFO (Reactor thread) [ProtocolDetector.Detector] Detected protocol stomp from ::1:40966 (protocoldetector:125)
2020-11-27 12:59:08,484+0000 INFO (Reactor thread) [Broker.StompAdapter] Processing CONNECT request (stompserver:95)
2020-11-27 12:59:08,485+0000 INFO (JsonRpc (StompReactor)) [Broker.StompAdapter] Subscribe command received (stompserver:124)
2020-11-27 12:59:08,525+0000 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.ping2 succeeded in 0.00 seconds (__init__:312)
2020-11-27 12:59:08,529+0000 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call Host.ping2 succeeded in 0.00 seconds (__init__:312)
2020-11-27 12:59:08,533+0000 INFO (jsonrpc/6) [vdsm.api] START getStorageDomainInfo(sdUUID=u'36663740-576a-4498-b28e-0a402628c6a7', options=None) from=::1,40966, task_id=ee3ac98e-6a93-4cb2-a626-5533c8fb78ad (api:48)
2020-11-27 12:59:08,909+0000 INFO (jsonrpc/6) [vdsm.api] FINISH getStorageDomainInfo error=Storage domain does not exist: (u'36663740-576a-4498-b28e-0a402628c6a7',) from=::1,40966, task_id=ee3ac98e-6a93-4cb2-a626-5533c8fb78ad (api:52)
2020-11-27 12:59:08,910+0000 ERROR (jsonrpc/6) [storage.TaskManager.Task] (Task='ee3ac98e-6a93-4cb2-a626-5533c8fb78ad') Unexpected error (task:875)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
return fn(*args, **kargs)
File "<string>", line 2, in getStorageDomainInfo
File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
ret = func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2753, in getStorageDomainInfo
dom = self.validateSdUUID(sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 305, in validateSdUUID
sdDom = sdCache.produce(sdUUID=sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 110, in produce
domain.getRealDomain()
File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 51, in getRealDomain
return self._cache._realProduce(self._sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 134, in _realProduce
domain = self._findDomain(sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 151, in _findDomain
return findMethod(sdUUID)
File "/usr/lib/python2.7/site-packages/vdsm/storage/sdc.py", line 176, in _findUnfetchedDomain
raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'36663740-576a-4498-b28e-0a402628c6a7',)
Can somebody tell me where the problem is and how it can be solved?
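A quick way to confirm whether the volumes are visible from this host at all (a sketch, assuming the gluster CLI is available on the host):

# ask the gluster server directly for its volume list
gluster --remote-host=kvm380.durchhalten.intern volume list
# if this host is itself part of the gluster cluster, check peer and volume status locally
gluster peer status
gluster volume status engine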
4 years
[ANN] oVirt 4.4.4 Third Release Candidate is now available for testing
by Sandro Bonazzola
oVirt 4.4.4 Third Release Candidate is now available for testing
The oVirt Project is pleased to announce the availability of oVirt 4.4.4
Third Release Candidate for testing, as of November 26th, 2020.
This update is the fourth in a series of stabilization updates to the 4.4
series.
How to prevent hosts entering emergency mode after upgrade from oVirt 4.4.1
Note: Upgrading from 4.4.2 GA or later should not require re-doing these
steps, if already performed while upgrading from 4.4.1 to 4.4.2 GA. These
are only required to be done once.
Due to Bug 1837864 <https://bugzilla.redhat.com/show_bug.cgi?id=1837864> -
Host enter emergency mode after upgrading to latest build
If you have your root file system on a multipath device on your hosts you
should be aware that after upgrading from 4.4.1 to 4.4.4 you may get your
host entering emergency mode.
In order to prevent this be sure to upgrade oVirt Engine first, then on
your hosts:
1. Remove the current lvm filter while still on 4.4.1, or in emergency mode (if rebooted); see the sketch after this list.
2. Reboot.
3. Upgrade to 4.4.4 (redeploy in case of already being on 4.4.4).
4. Run vdsm-tool config-lvm-filter to confirm there is a new filter in place.
5. Only if not using oVirt Node: run "dracut --force --add multipath" to rebuild the initramfs with the correct filter configuration.
6. Reboot.
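For step 1, the filter in question lives in /etc/lvm/lvm.conf; a minimal sketch of checking for it and regenerating a correct one afterwards (verify the exact path and syntax on your own hosts):

# show the currently configured LVM filter, if any
grep -n '^[[:space:]]*filter' /etc/lvm/lvm.conf
# comment out (or delete) that filter line, reboot, and upgrade, then
# let vdsm generate the proper filter again
vdsm-tool config-lvm-filter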
Documentation
- If you want to try oVirt as quickly as possible, follow the instructions on the Download <https://ovirt.org/download/> page.
- For complete installation, administration, and usage instructions, see the oVirt Documentation <https://ovirt.org/documentation/>.
- For upgrading from a previous version, see the oVirt Upgrade Guide <https://ovirt.org/documentation/upgrade_guide/>.
- For a general overview of oVirt, see About oVirt <https://ovirt.org/community/about.html>.
Important notes before you try it
Please note this is a pre-release build.
The oVirt Project makes no guarantees as to its suitability or usefulness.
This pre-release must not be used in production.
Installation instructions
For installation instructions and additional information please refer to:
https://ovirt.org/documentation/
This release is available now on x86_64 architecture for:
* Red Hat Enterprise Linux 8.2 or newer
* CentOS Linux (or similar) 8.2 or newer
This release supports Hypervisor Hosts on x86_64 and ppc64le architectures
for:
* Red Hat Enterprise Linux 8.2 or newer
* CentOS Linux (or similar) 8.2 or newer
* oVirt Node 4.4 based on CentOS Linux 8.2 (available for x86_64 only)
See the release notes [1] for installation instructions and a list of new
features and bugs fixed.
Notes:
- oVirt Appliance is already available for CentOS Linux 8
- oVirt Node NG is already available for CentOS Linux 8
Additional Resources:
* Read more about the oVirt 4.4.4 release highlights:
http://www.ovirt.org/release/4.4.4/
* Get more oVirt project updates on Twitter: https://twitter.com/ovirt
* Check out the latest project news on the oVirt blog:
http://www.ovirt.org/blog/
[1] http://www.ovirt.org/release/4.4.4/
[2] http://resources.ovirt.org/pub/ovirt-4.4-pre/iso/
--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA <https://www.redhat.com/>
sbonazzo(a)redhat.com
*Red Hat respects your work life balance. Therefore there is no need to
answer this email out of your office hours.*
4 years
"gluster-ansible-roles is not installed on Host" error on Cockpit
by Hesham Ahmed
On a new 4.3.1 oVirt Node installation, when trying to deploy HCI (and also when trying to add a new gluster volume to an existing cluster) using Cockpit, an error is displayed: "gluster-ansible-roles is not installed on Host. To continue deployment, please install gluster-ansible-roles on Host and try again". There is no package named gluster-ansible-roles in the repositories:
[root@localhost ~]# yum install gluster-ansible-roles
Loaded plugins: enabled_repos_upload, fastestmirror, imgbased-persist,
package_upload, product-id, search-disabled-repos,
subscription-manager, vdsmupgrade
This system is not registered with an entitlement server. You can use
subscription-manager to register.
Loading mirror speeds from cached hostfile
* ovirt-4.3-epel: mirror.horizon.vn
No package gluster-ansible-roles available.
Error: Nothing to do
Uploading Enabled Repositories Report
Cannot upload enabled repos report, is this client registered?
This is due to a check introduced here:
https://gerrit.ovirt.org/#/c/98023/1/dashboard/src/helpers/AnsibleUtil.js
Changing the line from:
[ "rpm", "-qa", "gluster-ansible-roles" ], { "superuser":"require" }
to
[ "rpm", "-qa", "gluster-ansible" ], { "superuser":"require" }
resolves the issue. The above code snippet is installed at
/usr/share/cockpit/ovirt-dashboard/app.js on oVirt node and can be
patched by running "sed -i 's/gluster-ansible-roles/gluster-ansible/g'
/usr/share/cockpit/ovirt-dashboard/app.js && systemctl restart
cockpit"
4 years
VM memory decrease
by Erez Zarum
I have an 8-node cluster running oVirt 4.4.2, and I have noticed lately that some VMs have started to have their memory decrease.
For example, a VM that was configured with 32GB of memory had its memory decrease to about 4GB without any notice; if I restart the VM it comes up with the correct memory, but shortly afterwards it decreases again.
The VM configuration has:
Memory Size: 32768 MB
Maximum Memory: 131072 MB
Physical Memory Guaranteed: 1024 MB
It's worth mentioning that these are Linux VMs provisioned through Foreman. Windows VMs are not provisioned from Foreman and their "Memory Size" is identical to "Physical Memory Guaranteed"; so far I have not observed a Windows VM with this behaviour, though there might be some that I haven't caught.
Is it possible that the cause of this issue is having "Physical Memory Guaranteed" set this low?
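If memory ballooning is the suspect (the large gap between Memory Size and the 1024 MB Physical Memory Guaranteed gives the balloon a lot of room to inflate under host memory pressure), a couple of read-only checks might confirm it (a sketch; the VM name is a placeholder):

# on the host running the VM: current vs. maximum balloon size as seen by libvirt
virsh -r domstats myvm --balloon
# inside the guest: total memory currently visible to the OS
grep MemTotal /proc/meminfo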
4 years