Email notification test
by Alex K
Hi all,
Does oVirt provide a way to test the SMTP settings used for email
notifications (apart from telnet)?
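So far the only extra thing I could come up with is a manual check of the SMTP server with curl from the engine host (the server, port and addresses below are placeholders):
printf 'Subject: oVirt notifier test\n\ntest\n' > /tmp/msg.txt
curl --url smtp://smtp.example.com:25 --mail-from engine@example.com --mail-rcpt admin@example.com --upload-file /tmp/msg.txt
But that only proves the server works, not that the notifier settings themselves are right, hence the question.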
Thanx,
Alex
6 years
Gluster hooks
by Alex K
Hi all,
I see the following Gluster hooks enabled on the oVirt cluster:
[image: image.png]
What is their purpose? Do they need to be all enabled?
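For reference, the hook scripts themselves seem to live under the standard Gluster hooks directory on the nodes, which I can list with:
ls -R /var/lib/glusterd/hooks/1/
but that doesn't tell me which of them oVirt actually relies on.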
Thanx,
Alex
6 years
Try to add Host to cluster: Command returned failure code 1 during SSH session
by Stefan Wolf
Hello,
I am trying to add a newly installed oVirt node and I get the following
error message while adding it to the cluster:
Host kvm380 installation failed. Command returned failure code 1 during SSH
session 'root(a)kvm380.durchhalten.intern'.
Maybe someone can help me.
Thx
Here is the log file:
2018-12-09 14:48:23,262+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS(509), Installing Host kvm380. Stage: Initializing.
2018-12-09 14:48:23,414+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS(509), Installing Host kvm380. Stage: Environment
setup.
2018-12-09 14:48:23,479+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS(509), Installing Host kvm380. Stage: Environment
packages setup.
2018-12-09 14:48:26,194+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS(509), Installing Host kvm380. Stage: Programs
detection.
2018-12-09 14:48:26,373+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS(509), Installing Host kvm380. Stage: Environment
customization.
2018-12-09 14:48:26,842+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS(509), Installing Host kvm380. Kdump supported.
2018-12-09 14:48:27,123+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS(509), Installing Host kvm380. Logs at host located
at: '/tmp/ovirt-host-deploy-20181209144822-hbil3q.log'.
2018-12-09 14:48:27,188+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS(509), Installing Host kvm380. Host is hypervisor.
2018-12-09 14:48:27,192+01 INFO
[org.ovirt.engine.core.bll.hostdeploy.VdsDeployVdsmUnit] (VdsDeploy)
[f097f8b0-ee35-4c08-a416-1f0427dd2e9e] Host kvm380.durchhalten.intern
reports unique id 31323436-3530-5a43-3233-303430374a4e
2018-12-09 14:48:27,208+01 INFO
[org.ovirt.engine.core.bll.hostdeploy.VdsDeployVdsmUnit] (VdsDeploy)
[f097f8b0-ee35-4c08-a416-1f0427dd2e9e] Assigning unique id
31323436-3530-5a43-3233-303430374a4e to Host kvm380.durchhalten.intern
2018-12-09 14:48:27,436+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS(509), Installing Host kvm380. Stage: Setup
validation.
2018-12-09 14:48:27,825+01 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS_ERROR(511), An error has occurred during
installation of Host kvm380: Failed to execute stage 'Setup validation':
Cannot resolve kdump destination address 'ovirt.durchhalten.intern'.
2018-12-09 14:48:27,827+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS(509), Installing Host kvm380. Stage: Clean up.
2018-12-09 14:48:27,830+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS(509), Installing Host kvm380. Stage:
Pre-termination.
2018-12-09 14:48:27,874+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS(509), Installing Host kvm380. Retrieving
installation logs to:
'/var/log/ovirt-engine/host-deploy/ovirt-host-deploy-20181209144827-kvm380.d
urchhalten.intern-f097f8b0-ee35-4c08-a416-1f0427dd2e9e.log'.
2018-12-09 14:48:28,295+01 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(VdsDeploy) [f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID:
VDS_INSTALL_IN_PROGRESS(509), Installing Host kvm380. Stage: Termination.
2018-12-09 14:48:28,389+01 ERROR
[org.ovirt.engine.core.uutils.ssh.SSHDialog]
(EE-ManagedThreadFactory-engine-Thread-1942)
[f097f8b0-ee35-4c08-a416-1f0427dd2e9e] SSH error running command
root(a)kvm380.durchhalten.intern:'umask 0077;
MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap
"chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" >
/dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x &&
"${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine
DIALOG/customization=bool:True': IOException: Command returned failure code
1 during SSH session 'root(a)kvm380.durchhalten.intern'
2018-12-09 14:48:28,389+01 ERROR
[org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
(EE-ManagedThreadFactory-engine-Thread-1942)
[f097f8b0-ee35-4c08-a416-1f0427dd2e9e] Error during host
kvm380.durchhalten.intern install
2018-12-09 14:48:28,391+01 ERROR
[org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand]
(EE-ManagedThreadFactory-engine-Thread-1942)
[f097f8b0-ee35-4c08-a416-1f0427dd2e9e] Host installation failed for host
'c06e2a1b-7d94-45f7-aaab-21da6703d6fc', 'kvm380': Command returned failure
code 1 during SSH session 'root(a)kvm380.durchhalten.intern'
2018-12-09 14:48:28,393+01 INFO
[org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-1942)
[f097f8b0-ee35-4c08-a416-1f0427dd2e9e] START,
SetVdsStatusVDSCommand(HostName = kvm380,
SetVdsStatusVDSCommandParameters:{hostId='c06e2a1b-7d94-45f7-aaab-21da6703d6
fc', status='InstallFailed', nonOperationalReason='NONE',
stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 29ddff69
2018-12-09 14:48:28,395+01 INFO
[org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand]
(EE-ManagedThreadFactory-engine-Thread-1942)
[f097f8b0-ee35-4c08-a416-1f0427dd2e9e] FINISH, SetVdsStatusVDSCommand, log
id: 29ddff69
2018-12-09 14:48:28,414+01 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engine-Thread-1942)
[f097f8b0-ee35-4c08-a416-1f0427dd2e9e] EVENT_ID: VDS_INSTALL_FAILED(505),
Host kvm380 installation failed. Command returned failure code 1 during SSH
session 'root(a)kvm380.durchhalten.intern'.
2018-12-09 14:48:28,418+01 INFO
[org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand]
(EE-ManagedThreadFactory-engine-Thread-1942)
[f097f8b0-ee35-4c08-a416-1f0427dd2e9e] Lock freed to object
'EngineLock:{exclusiveLocks='[c06e2a1b-7d94-45f7-aaab-21da6703d6fc=VDS]',
sharedLocks=''}'
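The only ERROR I can see before the failure is the 'Setup validation' one about kdump: "Cannot resolve kdump destination address 'ovirt.durchhalten.intern'". If that is really the cause, I guess I have to make the engine FQDN resolvable from the node, e.g. check on kvm380 with:
getent hosts ovirt.durchhalten.intern    # if this prints nothing, add the engine to DNS or /etc/hosts
or, if possible, disable the Kdump integration option for the host when adding it. Or is something else going wrong?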
6 years
Re: ovirt 4.2.7.1 fails to deploy hosted engine on GlusterFS
by Johannes Jurgens Loots
I had this issue this week as well.
When asked about the GlusterFS volume that you self-provisioned, you stated "ovirt1.localdomain:/gluster_bricks/engine".
I am new to Gluster, but as a Gluster client you can only refer to a volume by its volume name:
host:/<volume name>
Hence maybe try ovirt1.localdomain:/engine.
That did the trick for me. Hope this helps. Cost me a week.
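A quick sanity check before re-running the deployment (the mount point below is only an example):
gluster volume list    # on one of the gluster nodes, to confirm the volume name
mount -t glusterfs ovirt1.localdomain:/engine /mnt/test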
Jurie
6 years
Strange migration issue
by Demeter Tibor
Dear All,
I've been using oVirt since 3.4 and we have a strange, rarely occurring problem with live migrations.
- In most cases it is impossible to live migrate VMs that have a lot of memory (16 GB and more, up to 40 GB).
- We have to try again and again until the migration eventually succeeds.
- It is impossible to understand why it works in one case and not in another.
- It always happens when there is a bigger load on the VMs, but only on large-memory VMs.
- The small VMs (2-4-8 GB) can migrate without problems.
- In very rare cases the migration succeeds on the first try.
- We use the memory ballooning function on all VMs.
We use a dedicated migration network with separate VLANs, but on the same physical LAN (1 GbE) as ovirtmgmt.
At the moment we don't use QoS to dedicate bandwidth.
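My only theory so far (rough numbers, just a back-of-envelope guess): a 40 GB VM over a shared 1 GbE link gets at best around 100 MB/s, so one full copy of the memory already takes roughly 400 seconds, and if the guest dirties pages faster than that under load, the pre-copy migration can never converge, which would explain why only the big, busy VMs fail. Does that sound plausible?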
What can I do?
Thanks in advance,
Regards,
Tibor
6 years
Removing stale unused disks from Ovirt Storage domains?
by Jacob Green
Hello, after migrating our Fibre Channel storage to a new oVirt 4.2
environment, we have, over the years and through experimenting and mistakes,
ended up with some stale LVs/disks on our Fibre Channel storage
that oVirt is no longer able to manage correctly or is unaware of. I am
looking for a reliable way to do a few different things.
The first is figuring out precisely which LV IDs belong to a VM and are
in use by that VM.
The second is figuring out whether an LV I have found on the storage domain
is being used at all by any VM, or whether oVirt is even aware of it.
I have fumbled around a bit, and using a combination of the following I
have been able to figure out some of them. But now I am finding
information that does not match, may not be correct, or that I am
interpreting wrongly. Anyway, this is a big deal, because we want
to remove the stale unused LVs and it would obviously be disastrous if
I deleted the wrong LV from the FC storage.
I know it's not recommended, but since I am not actually telling
vdsm-client to do anything other than get information, I figure it's
harmless. Here is what I have found so far.
*vdsm-client Host getVMFullList vmname=<VM_NAME_HERE> | grep volumeID*
*lvs | grep <noted volumeID here>*
With the above I have had some success verifying which LVs a VM is
using. However, I am now having trouble figuring out the LV IDs of a
particular Windows server VM.
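The closest I have come to scripting the check is something like this rough (untested) sketch, where the VM names and the storage domain VG name are placeholders, and where, as far as I can tell, getVMFullList only reports VMs currently running on the host being queried:
for vm in vm1 vm2 vm3; do
    vdsm-client Host getVMFullList vmname="$vm" | grep -o '"volumeID": "[^"]*"' | cut -d'"' -f4
done | sort -u > /tmp/volumes-in-use
lvs --noheadings -o lv_name <storage_domain_VG> | tr -d ' ' | sort > /tmp/all-lvs
comm -13 /tmp/volumes-in-use /tmp/all-lvs    # LVs with no match in any running VM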
What I want to know is: is there a better way?
Also, two more things.
You could call these feature requests, but it would be nice if there
were a way to see all the unused LVs on a storage domain that are not
tied to a VM. It would also be nice to be able to remove un-imported
VMs that reside on a storage domain without importing them.
Anyway, trying to get rid of un-imported VMs and unused LVs has been a
chore. I wish there were an easier way.
--
Jacob Green
Systems Admin
American Alloy Steel
713-300-5690
6 years
VM ramdomly unresponsive
by fsoyer
Hi all,
I am still trying to understand my problem between (I suppose) oVirt and Gluster.
After my recent posts titled 'VMs unexpectidly restarted', which did not lead to a solution or even an idea of where to look, I submit another (related?) problem to you.
In parallel with the problem of VMs going down (which has not reproduced since Oct 16), I randomly get events in the GUI saying "VM xxxxx is not responding." For example, VM "patjoub1" on 2018-11-11 at 14:34. Never at the same hour, not every day, often this VM patjoub1 but not always: I have had it on two others as well. All VM disks are on the volume DATA02 (with leases on the same volume).
Searching in engine.log, I found:
2018-11-11 14:34:32,953+01 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-28) [] VM '6116fb07-096b-4c7e-97fe-01ecc9a6bd9b'(patjoub1) moved from 'Up' --> 'NotResponding'
2018-11-11 14:34:33,116+01 WARN [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedThreadFactory-engineScheduled-Thread-1) [] Invalid or unknown guest architecture type '' received from guest agent
2018-11-11 14:34:33,176+01 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-28) [] EVENT_ID: VM_NOT_RESPONDING(126), VM patjoub1 is not responding.
...
...
2018-11-11 14:34:48,278+01 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-48) [] VM '6116fb07-096b-4c7e-97fe-01ecc9a6bd9b'(patjoub1) moved from 'NotResponding' --> 'Up'
So it comes back Up 15 seconds later, and the VM (and the monitoring) sees no downtime.
At the same time, I see this in the vdsm.log of the nodes:
2018-11-11 14:33:49,450+0100 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/glusterSD/victor.local.systea.fr:_DATA02/ffc53fd8-c5d1-4070-ae51-2e91835cd937/dom_md/metadata (monitor:498)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/storage/monitor.py", line 496, in _pathChecked
delay = result.delay()
File "/usr/lib/python2.7/site-packages/vdsm/storage/check.py", line 391, in delay
raise exception.MiscFileReadException(self.path, self.rc, self.err)
MiscFileReadException: Internal file read failure: (u'/rhev/data-center/mnt/glusterSD/victor.local.systea.fr:_DATA02/ffc53fd8-c5d1-4070-ae51-2e91835cd937/dom_md/metadata', 1, 'Read timeout')
2018-11-11 14:33:49,450+0100 INFO (check/loop) [storage.Monitor] Domain ffc53fd8-c5d1-4070-ae51-2e91835cd937 became INVALID (monitor:469)
2018-11-11 14:33:59,451+0100 WARN (check/loop) [storage.check] Checker u'/rhev/data-center/mnt/glusterSD/victor.local.systea.fr:_DATA02/ffc53fd8-c5d1-4070-ae51-2e91835cd937/dom_md/metadata' is blocked for 20.00 seconds (check:282)
2018-11-11 14:34:09,480+0100 INFO (event/37) [storage.StoragePool] Linking /rhev/data-center/mnt/glusterSD/victor.local.systea.fr:_DATA02/ffc53fd8-c5d1-4070-ae51-2e91835cd937 to /rhev/data-center/6efda7f8-b62f-11e8-9d16-00163e263d21/ffc53fd8-c5d1-4070-ae51-2e91835cd937 (sp:1230)
OK: so DATA02 is marked as blocked for 20 seconds? Then I definitely have a problem with Gluster? And I'll inevitably find the reason in the Gluster logs? Uh: not at all.
Please see gluster logs here :
https://seafile.systea.fr/d/65df86cca9d34061a1e4/
Unfortunately I discovered this morning that I do not have the sanlock.log for this date. I don't understand why; the log rotation seems OK with "rotate 3", but I have no backup files :(.
But, luckily in my bad luck, the same event occurred again this morning! Same VM patjoub1, 2018-11-13 08:01:37. So I have added today's sanlock.log, maybe it can help.
IMPORTANT NOTE: don't forget that the Gluster logs have a one-hour shift. For the event at 14:34, search at 13:34 in the Gluster logs.
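In case it helps reproduce the issue, here is the kind of direct read I run on the node to mimic (as far as I understand it) what the storage monitor does on the metadata file; the exact flags are my guess:
time dd if=/rhev/data-center/mnt/glusterSD/victor.local.systea.fr:_DATA02/ffc53fd8-c5d1-4070-ae51-2e91835cd937/dom_md/metadata of=/dev/null bs=4096 count=1 iflag=direct
If that sometimes takes more than a few seconds, it would match the 'Read timeout' above.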
To recall my configuration:
Gluster 3.12.13
oVirt 4.2.3
3 nodes where the third is arbiter (volumes in replica 2)
The nodes are never overloaded (CPU average 5%, no peak detected at the time of the event, 128 GB of memory used at 15%, only 10 VMs on this cluster). The network is underused; Gluster is on a separate network on a bond (2 NICs, 1+1 Gb, mode 4 = 2 Gb), used at 10% at peak.
Here is the configuration for the given volume :
# gluster volume status DATA02
Status of volume: DATA02
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick victorstorage.local.systea.fr:/home/data02/data02/brick     49158   0     Y   4990
Brick gingerstorage.local.systea.fr:/home/data02/data02/brick     49153   0     Y   8460
Brick eskarinastorage.local.systea.fr:/home/data01/data02/brick   49158   0     Y   2470
Self-heal Daemon on localhost                                     N/A     N/A   Y   8771
Self-heal Daemon on eskarinastorage.local.systea.fr               N/A     N/A   Y   11745
Self-heal Daemon on victorstorage.local.systea.fr                 N/A     N/A   Y   17055
Task Status of Volume DATA02
------------------------------------------------------------------------------
There are no active volume tasks
# gluster volume info DATA02
Volume Name: DATA02
Type: Replicate
Volume ID: 48bf5871-339b-4f39-bea5-9b5848809c83
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: victorstorage.local.systea.fr:/home/data02/data02/brick
Brick2: gingerstorage.local.systea.fr:/home/data02/data02/brick
Brick3: eskarinastorage.local.systea.fr:/home/data01/data02/brick (arbiter)
Options Reconfigured:
network.ping-timeout: 30
server.allow-insecure: on
cluster.granular-entry-heal: enable
features.shard-block-size: 64MB
performance.stat-prefetch: on
server.event-threads: 3
client.event-threads: 8
performance.io-thread-count: 32
storage.owner-gid: 36
storage.owner-uid: 36
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: enable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.server-quorum-ratio: 51%
So: is there anyone around willing to help me understand what is happening? Pleeease :/
--
Regards,
Frank
6 years
Re: Hypconverged Gluster Setup hangs after TLS/SSL
by Sahina Bose
On Fri, 7 Dec 2018 at 5:35 PM, Matthias Barmeier <
matthias.barmeier(a)sourcepark.de> wrote:
> I don't think so because the known_hosts contain all host keys of all
> gluster nodes before start of setup.
Also, what about the FQDN that is used to add the hosts to the engine? Is that also in
known_hosts?
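(You can check from the host with something like: ssh-keygen -F <the exact FQDN you entered in the engine>, which prints the matching known_hosts entry, if any.)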
>
> Or do I miss something ?
>
>
> Am Freitag, den 07.12.2018, 12:17 +0530 schrieb Sahina Bose:
> > I think you may be running into
> > https://bugzilla.redhat.com/show_bug.cgi?id=1651516
> >
> > On Thu, Dec 6, 2018 at 7:30 PM <matthias.barmeier(a)sourcepark.de>
> > wrote:
> > >
> > > Hi,
> > >
> > > tried to setup hyperconverged with glusterfs. I used three i7 with
> > > two 1TB Disks and two NICs. Everythin worked fine till:
> > >
> > > [ INFO ] TASK [Set Engine public key as authorized key without
> > > validating the TLS/SSL certificates]
> > >
> > > appears in the wizards window. Does anyone has a hint on what to
> > > do?
> > > _______________________________________________
> > > Users mailing list -- users(a)ovirt.org
> > > To unsubscribe send an email to users-leave(a)ovirt.org
> > > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> > > oVirt Code of Conduct: https://www.ovirt.org/community/about/commun
> > > ity-guidelines/
> > > List Archives: https://lists.ovirt.org/archives/list/users@ovirt.or
> > > g/message/BI6Z576SKSTNQZRBCWLYMVFYTRZVZOCE/
>
6 years
DATA Domain with 1GB limit
by mohamedbehbity@gmail.com
Once I attached the domain to a data center, the real size of the DATA domain is displayed as 1 GB.
6 years
Hypconverged Gluster Setup hangs after TLS/SSL
by matthias.barmeier@sourcepark.de
Hi,
I tried to set up a hyperconverged deployment with GlusterFS. I used three i7 machines with two 1 TB disks and two NICs each. Everything worked fine until:
[ INFO ] TASK [Set Engine public key as authorized key without validating the TLS/SSL certificates]
appeared in the wizard's window. Does anyone have a hint on what to do?
6 years