losing ib0 connection after activating host
by Douglas Duckworth
Hi
I keep losing the ib0 connection on a hypervisor after adding the host to the
engine. This makes the host essentially unusable, since NFS storage is mounted
over ib0. I don't understand why this occurs.
OS:
[root@ovirt-hv2 ~]# cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)
Here's the network script:
[root@ovirt-hv2 ~]# cat /etc/sysconfig/network-scripts/ifcfg-ib0
DEVICE=ib0
BOOTPROTO=static
IPADDR=172.16.0.207
NETMASK=255.255.255.0
ONBOOT=yes
ZONE=public
When I try "ifup":
[root@ovirt-hv2 ~]# ifup ib0
Error: Connection activation failed: No suitable device found for this
connection.
The error in syslog:
Aug 22 11:31:50 ovirt-hv2 kernel: IPv4: martian source 172.16.0.87 from
172.16.0.49, on dev ib0
Aug 22 11:31:53 ovirt-hv2 NetworkManager[1070]: <info> [1534951913.7486]
audit: op="connection-activate" uuid="2ab4abde-b8a5-6cbc-19b1-2bfb193e4e89"
name="System ib0" result="fail" reason="No suitable device found for this
connection.
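For anyone debugging the same thing, a quick way to see whether NetworkManager
considers ib0 managed at all, and which profile (if any) it matched -- generic
commands, not output from this host:
nmcli -f DEVICE,TYPE,STATE device status
nmcli connection show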
As you can see, the link state is up:
[root@ovirt-hv2 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovirtmgmt state UP group default qlen 1000
    link/ether 50:9a:4c:89:d3:81 brd ff:ff:ff:ff:ff:ff
3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 50:9a:4c:89:d3:82 brd ff:ff:ff:ff:ff:ff
4: p1p1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether b4:96:91:13:ea:68 brd ff:ff:ff:ff:ff:ff
5: p1p2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether b4:96:91:13:ea:6a brd ff:ff:ff:ff:ff:ff
6: idrac: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
    link/ether 50:9a:4c:89:d3:84 brd ff:ff:ff:ff:ff:ff
    inet 169.254.0.2/16 brd 169.254.255.255 scope global idrac
       valid_lft forever preferred_lft forever
7: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256
    link/infiniband a0:00:02:08:fe:80:00:00:00:00:00:00:ec:0d:9a:03:00:1d:13:41 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
8: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 12:b4:30:22:39:5b brd ff:ff:ff:ff:ff:ff
9: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 3e:32:e6:66:98:49 brd ff:ff:ff:ff:ff:ff
25: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 50:9a:4c:89:d3:81 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.183/16 brd 10.0.255.255 scope global ovirtmgmt
       valid_lft forever preferred_lft forever
26: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether aa:32:82:1b:01:d9 brd ff:ff:ff:ff:ff:ff
27: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 32:ff:5d:b8:c2:b4 brd ff:ff:ff:ff:ff:ff
The card is FDR:
[root@ovirt-hv2 ~]# lspci -v | grep Mellanox
01:00.0 Network controller: Mellanox Technologies MT27500 Family
[ConnectX-3]
Subsystem: Mellanox Technologies Device 0051
The latest OFED driver is loaded:
[root@ovirt-hv2 ~]# /etc/init.d/openibd status
HCA driver loaded
Configured IPoIB devices:
ib0
Currently active IPoIB devices:
ib0
Configured Mellanox EN devices:
Currently active Mellanox devices:
ib0
The following OFED modules are loaded:
rdma_ucm
rdma_cm
ib_ipoib
mlx4_core
mlx4_ib
mlx4_en
mlx5_core
mlx5_ib
ib_uverbs
ib_umad
ib_ucm
ib_cm
ib_core
mlxfw
mlx5_fpga_tools
I can add an IP to ib0 using "ip addr", but I need NetworkManager to work
with ib0.
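My guess, and it is only an untested assumption, is that NetworkManager cannot
match the "System ib0" profile to the device because the ifcfg file never
declares its type. A sketch of a possible fix for ifcfg-ib0:
DEVICE=ib0
TYPE=InfiniBand
BOOTPROTO=static
IPADDR=172.16.0.207
NETMASK=255.255.255.0
ONBOOT=yes
ZONE=public
followed by "nmcli connection reload" and another "ifup ib0".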
Thanks,
Douglas Duckworth, MSc, LFCS
HPC System Administrator
Scientific Computing Unit
Weill Cornell Medicine
1300 York - LC-502
E: doug(a)med.cornell.edu
O: 212-746-6305
F: 212-746-8690
6 years, 3 months
ovirt-node-ng-image-update 4.2.4 to 4.2.5.1 fails
by Glenn Farmer
yum update ends with:
warning: %post(ovirt-node-ng-image-update-4.2.5.1-1.el7.noarch) scriptlet failed, exit status 1
Non-fatal POSTIN scriptlet failure in rpm package ovirt-node-ng-image-update-4.2.5.1-1.el7.noarch
It creates the layers:
ovirt-node-ng-4.2.5.1-0.20180731.0 onn Vri---tz-k 6.00g pool00
ovirt-node-ng-4.2.5.1-0.20180731.0+1 onn Vwi-a-tz-- 6.00g pool00 ovirt-node-ng-4.2.5.1-0.20180731.0
But no grub2 boot entry is created.
nodectl info:
layers:
ovirt-node-ng-4.2.4-0.20180626.0:
ovirt-node-ng-4.2.4-0.20180626.0+1
ovirt-node-ng-4.2.5.1-0.20180731.0:
ovirt-node-ng-4.2.5.1-0.20180731.0+1
ovirt-node-ng-4.2.2-0.20180405.0:
ovirt-node-ng-4.2.2-0.20180405.0+1
bootloader:
default: ovirt-node-ng-4.2.4-0.20180626.0+1
entries:
ovirt-node-ng-4.2.2-0.20180405.0+1:
index: 1
title: ovirt-node-ng-4.2.2-0.20180405.0+1
kernel: /boot/ovirt-node-ng-4.2.2-0.20180405.0+1/vmlinuz-3.10.0-693.21.1.el7.x86_64
args: "ro crashkernel=auto rd.lvm.lv=onn/ovirt-node-ng-4.2.2-0.20180405.0+1 img.bootid=ovirt-node-ng-4.2.2-0.20180405.0+1"
initrd: /boot/ovirt-node-ng-4.2.2-0.20180405.0+1/initramfs-3.10.0-693.21.1.el7.x86_64.img
root: /dev/onn/ovirt-node-ng-4.2.2-0.20180405.0+1
ovirt-node-ng-4.2.4-0.20180626.0+1:
index: 0
title: ovirt-node-ng-4.2.4-0.20180626.0+1
kernel: /boot/ovirt-node-ng-4.2.4-0.20180626.0+1/vmlinuz-3.10.0-862.3.3.el7.x86_64
args: "ro crashkernel=auto rd.lvm.lv=onn/ovirt-node-ng-4.2.4-0.20180626.0+1 img.bootid=ovirt-node-ng-4.2.4-0.20180626.0+1"
initrd: /boot/ovirt-node-ng-4.2.4-0.20180626.0+1/initramfs-3.10.0-862.3.3.el7.x86_64.img
root: /dev/onn/ovirt-node-ng-4.2.4-0.20180626.0+1
current_layer: ovirt-node-ng-4.2.4-0.20180626.0+1
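A generic recovery sketch, assuming the 4.2.5.1 layers themselves are intact
(paths and steps from memory, so verify before relying on them):
# See why the %post scriptlet failed; imgbased keeps its own log
less /var/log/imgbased.log
# Re-run the scriptlet by reinstalling the update package
yum reinstall ovirt-node-ng-image-update
# Confirm a boot entry now exists for 4.2.5.1
nodectl info
grubby --info=ALL | grep 4.2.5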
Just posting for others that might have the same issue.
6 years, 3 months
HostedEngine cannot migrate to added hosts
by Ariez Ahito
I followed the instructions for adding hosts to a hosted-engine deployment,
but when I try to migrate the HostedEngine virtual machine it says "no
available host to migrate to".
Upon checking the added host:
Hosted Engine HA: not active
systemctl status vdsmd
● vdsmd.service - Virtual Desktop Server Manager
Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2018-08-20 01:14:19 EDT; 56min ago
Process: 26795 ExecStopPost=/usr/libexec/vdsm/vdsmd_init_common.sh --post-stop (code=exited, status=0/SUCCESS)
Process: 26799 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
Main PID: 26879 (vdsmd)
Tasks: 46
CGroup: /system.slice/vdsmd.service
└─26879 /usr/bin/python2 /usr/share/vdsm/vdsmd
Aug 20 02:09:42 ovirt-node5 vdsm[26879]: ERROR failed to retrieve Hosted Engine HA score '[Errno 2] No such file or directory'Is the Hosted Engine setup finished?
Aug 20 02:09:43 ovirt-node5 vdsm[26879]: ERROR failed to retrieve Hosted Engine HA score '[Errno 2] No such file or directory'Is the Hosted Engine setup finished?
vdsm.log
2018-08-20 02:11:28,548-0400 ERROR (periodic/0) [root] failed to retrieve Hosted Engine HA score '[Errno 2] No such file or directory'Is the Hosted Engine setup finished? (api:196)
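The error suggests the hosted-engine HA services were never deployed on this
host. A generic way to check, assuming stock service names:
# The HA score comes from ovirt-ha-agent/ovirt-ha-broker; without them vdsm
# finds no state file to read and logs the ENOENT above
systemctl status ovirt-ha-agent ovirt-ha-broker
hosted-engine --vm-status
# If they are missing, reinstall the host from the engine UI with the
# hosted-engine deployment action set to "Deploy"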
6 years, 3 months
Anyone have any luck with Bacchus?
by Wesley Stewart
I am trying to get Bacchus installed:
https://github.com/openbacchus/bacchus
It seems the author has switched to installing via an Ansible playbook. I can
get it to run all the way through, but NGINX then either responds with a
gateway error or serves the default nginx landing page. I have tried with
RHEL 7 and CentOS 7 VMs, as well as in a CentOS/systemd Docker container, all
with about the same amount of luck. I was just curious whether anyone else
has gotten this to work recently.
6 years, 3 months
Re: Can't start the VM ... There are no hosts to use. Check that the cluster contains at least one host in Up state.
by Arik Hadas
On Wed, Aug 22, 2018 at 5:28 PM Daniel Renaud <daniel(a)djj-consultants.com>
wrote:
> No more VMs in trouble since I removed the CPU type, etc. These values came
> back the same as they were, but there is no more problem starting the VMs.
>
I see.
That's a pity, because we cannot figure out what actually led to the NPE that
I saw in the engine log.
But glad to see that it works for you now. Thanks.
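For anyone who later wants the database route mentioned in the quoted exchange
below, a sketch (table and column names from memory, so verify them against
your engine schema first; 'my-vm' is a placeholder):
sudo -u postgres psql engine -c "
  SELECT device_id, device, type, is_managed
  FROM vm_device
  WHERE vm_id = (SELECT vm_guid FROM vm_static WHERE vm_name = 'my-vm')
    AND is_managed = false;"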
>
> 2018-08-22 9:58 GMT-04:00 Arik Hadas <ahadas(a)redhat.com>:
>
>>
>>
>> On Wed, Aug 22, 2018 at 4:51 PM Daniel Renaud <daniel(a)djj-consultants.com>
>> wrote:
>>
>>> I checked for the unmanaged interfaces and they were all OK in the 3 VMs
>>>
>>
>> And do you still have a VM that experiences such failures?
>>
>>
>>>
>>> --
>>> Daniel Renaud
>>> Analyste Sécurité & Système
>>> DJJ Consultants, http://www.djj-consultants.com
>>> Téléphone : 418-907-9530, poste 200
>>> GPG key : 0x7189C9D0
>>>
>>> Le 22 août 2018 à 09:43, Arik Hadas <ahadas(a)redhat.com> a écrit :
>>>
>>>
>>>
>>> On Wed, Aug 22, 2018 at 3:07 PM Daniel Renaud <
>>> daniel(a)djj-consultants.com> wrote:
>>>
>>>> Hi
>>>>
>>>> I didn't upgrade, and Run Once didn't solve the issue at first, but when
>>>> I used Run Once again I cleared (in the System tab) Custom Emulated
>>>> Machine and Custom CPU Type (left them blank) and the Run Once worked.
>>>> After that, I can shut down and boot without any problem. If I go back
>>>> and look at the emulated machine and CPU type, they are back to the same
>>>> values, but now I can boot.
>>>>
>>>
>>> Please reply to the list.
>>> You said that you have 3 VMs in that state - could you please check if
>>> you have unmanaged interfaces for the other 2 before fixing them that way?
>>>
>>>
>>>>
>>>> 2018-08-22 7:44 GMT-04:00 Arik Hadas <ahadas(a)redhat.com>:
>>>>
>>>>> Do you remember to what version of oVirt you upgraded the 4.1/4.0/3.6
>>>>> cluster?
>>>>> It sounds like a bug we had (a relatively long time ago) that led to
>>>>> having duplicated unmanaged devices.
>>>>> Can you please check in the VM Devices tab whether you have unmanaged
>>>>> network interfaces? If you do, it was reported that running the VM in
>>>>> run-once mode clears those devices (but you can also remove them from
>>>>> the database).
>>>>>
>>>>> On Wed, Aug 22, 2018 at 1:38 PM <daniel(a)djj-consultants.com> wrote:
>>>>>
>>>>>> I also tried to export (works) and re-import (works) the same VM and
>>>>>> start the imported one, with the same result: it didn't start.
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Daniel Renaud
>>>> Analyste Sécurité & Système
>>>> DJJ Consultants, http://www.djj-consultants.com
>>>> Téléphone : 418-907-9530, poste 200
>>>> GPG key : 0x7189C9D0
>>>>
>>>
>
>
> --
> Daniel Renaud
> Analyste Sécurité & Système
> DJJ Consultants, http://www.djj-consultants.com
> Téléphone : 418-907-9530, poste 200
> GPG key : 0x7189C9D0
>
6 years, 3 months
SSSD on Hosted Engine
by Douglas Duckworth
Hi
I am trying to configure sssd on my hosted engine. Essentially, we control
host access in LDAP, so I want sssd to read that and thus allow my coworkers
to log in to the hosted engine VM.
For some reason sssd reported the backend offline even though it is
resolvable and pingable, with ports open. It turned out to be an SELinux
issue: after switching SELinux to permissive mode, sssd works.
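Rather than staying permissive, the exact denial can usually be pinned down
and fixed, for example (generic steps, not specific to this engine VM):
# Show the AVC denial sssd hit while enforcing
ausearch -m avc -ts recent | grep -i sssd
# If LDAP listens on a non-standard port, label it so sssd may connect
# (the port number here is a placeholder)
semanage port -a -t ldap_port_t -p tcp 3890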
To have the system read the sssd database, I set the hosts line in /etc/nsswitch.conf to:
hosts files sss
Though it seems that I did something bad to /etc/nsswitch.conf, as yum, ping,
etc. no longer work.
Could someone suggest how to restore this file, or could anyone share theirs?
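One way to recover, stated as assumptions rather than verified facts: yum and
ping most likely broke because "dns" was dropped from the hosts line, and
/etc/nsswitch.conf is owned by glibc on CentOS 7, so a pristine copy can be
pulled from the package:
rpm -qf /etc/nsswitch.conf              # should report glibc
cd /tmp && yumdownloader glibc          # yumdownloader comes from yum-utils
rpm2cpio glibc-*.rpm | cpio -idmv ./etc/nsswitch.conf
cp /tmp/etc/nsswitch.conf /etc/nsswitch.conf
The stock hosts line is "hosts: files dns myhostname"; with sssd in the mix it
would become "hosts: files sss dns myhostname".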
Thanks,
Douglas Duckworth, MSc, LFCS
HPC System Administrator
Scientific Computing Unit
Weill Cornell Medicine
1300 York - LC-502
E: doug(a)med.cornell.edu
O: 212-746-6305
F: 212-746-8690
6 years, 3 months
Fwd: Re: Single node single node SelfHosted Hyperconverged
by Leo David
---------- Forwarded message ----------
From: Leo David <leoalex(a)gmail.com>
Date: Tue, Jun 12, 2018 at 7:57 PM
Subject: Re: [ovirt-users] Re: Single node single node SelfHosted
Hyperconverged
To: femi adegoke <ovirt(a)fateknollogee.com>
Thank you very much for your response; now it feels like I can barely see the
light!
So:
multipath -ll
3614187705c01820022b002b00c52f72e dm-1 DELL ,PERC H730P Mini
size=931G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
`- 0:2:0:0 sda 8:0 active ready running
lsblk
NAME                                                   MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                                                      8:0    0   931G  0 disk
├─sda1                                                   8:1    0     1G  0 part
├─sda2                                                   8:2    0   930G  0 part
└─3614187705c01820022b002b00c52f72e                    253:1    0   931G  0 mpath
  ├─3614187705c01820022b002b00c52f72e1                 253:3    0     1G  0 part  /boot
  └─3614187705c01820022b002b00c52f72e2                 253:4    0   930G  0 part
    ├─onn-pool00_tmeta                                 253:6    0     1G  0 lvm
    │ └─onn-pool00-tpool                               253:8    0 825.2G  0 lvm
    │   ├─onn-ovirt--node--ng--4.2.3.1--0.20180530.0+1 253:9    0 798.2G  0 lvm   /
    │   ├─onn-pool00                                   253:12   0 825.2G  0 lvm
    │   ├─onn-var_log_audit                            253:13   0     2G  0 lvm   /var/log/audit
    │   ├─onn-var_log                                  253:14   0     8G  0 lvm   /var/log
    │   ├─onn-var                                      253:15   0    15G  0 lvm   /var
    │   ├─onn-tmp                                      253:16   0     1G  0 lvm   /tmp
    │   ├─onn-home                                     253:17   0     1G  0 lvm   /home
    │   └─onn-var_crash                                253:20   0    10G  0 lvm   /var/crash
    ├─onn-pool00_tdata                                 253:7    0 825.2G  0 lvm
    │ └─onn-pool00-tpool                               253:8    0 825.2G  0 lvm
    │   ├─onn-ovirt--node--ng--4.2.3.1--0.20180530.0+1 253:9    0 798.2G  0 lvm   /
    │   ├─onn-pool00                                   253:12   0 825.2G  0 lvm
    │   ├─onn-var_log_audit                            253:13   0     2G  0 lvm   /var/log/audit
    │   ├─onn-var_log                                  253:14   0     8G  0 lvm   /var/log
    │   ├─onn-var                                      253:15   0    15G  0 lvm   /var
    │   ├─onn-tmp                                      253:16   0     1G  0 lvm   /tmp
    │   ├─onn-home                                     253:17   0     1G  0 lvm   /home
    │   └─onn-var_crash                                253:20   0    10G  0 lvm   /var/crash
    └─onn-swap                                         253:10   0     4G  0 lvm   [SWAP]
sdb                                                      8:16   0   931G  0 disk
└─sdb1                                                   8:17   0   931G  0 part
sdc                                                      8:32   0   4.6T  0 disk
└─sdc1                                                   8:33   0   4.6T  0 part
nvme0n1                                                259:0    0   1.1T  0 disk
So the multipath device "3614187705c01820022b002b00c52f72e" shown in the
error is actually the root filesystem, which was created at node installation
(from the ISO).
Is it OK that this mpath is activated on sda?
What should I do in this situation?
Thank you!
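A common approach in single-host hyperconverged setups is to blacklist the
local RAID volume from multipath, using the WWID shown above. A sketch of the
addition to /etc/multipath.conf (whether your vdsm version honours the
PRIVATE marker should be verified):
# VDSM PRIVATE        <- add under the existing "# VDSM REVISION" first line
blacklist {
    wwid 3614187705c01820022b002b00c52f72e
}
Since this map holds the root filesystem it cannot be flushed live; expect to
rebuild the initramfs and reboot for the blacklist to take effect.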
On Tue, Jun 12, 2018 at 5:38 PM, femi adegoke <ovirt(a)fateknollogee.com>
wrote:
> Are your disks "multipathing"?
>
> What's your output if you run the command multipath -ll
>
> For comparison sake, here is my gdeploy.conf (used for a single host
> gluster install) - lv1 was changed to 62gb
> **Credit for that pastebin to Squeakz on the IRC channel
> https://pastebin.com/LTRQ78aJ
> _______________________________________________
> Users mailing list -- users(a)ovirt.org
> To unsubscribe send an email to users-leave(a)ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: https://www.ovirt.org/communit
> y/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archiv
> es/list/users(a)ovirt.org/message/EEIE4PWUFCXHHTT6PGP2EPFQXIWL6H5P/
>
--
Best regards, Leo David
--
Best regards, Leo David
6 years, 3 months
Re: Can't start the VM ... There are no hosts to use. Check that the cluster contains at least one host in Up state.
by Arik Hadas
On Wed, Aug 22, 2018 at 3:07 PM Daniel Renaud <daniel(a)djj-consultants.com>
wrote:
> Hi
>
> I didn't upgrade, and Run Once didn't solve the issue at first, but when I
> used Run Once again I cleared (in the System tab) Custom Emulated Machine
> and Custom CPU Type (left them blank) and the Run Once worked. After that,
> I can shut down and boot without any problem. If I go back and look at the
> emulated machine and CPU type, they are back to the same values, but now I
> can boot.
>
Please reply to the list.
You said that you have 3 VMs in that state - could you please check if you
have unmanaged interfaces for the other 2 before fixing them that way?
>
> 2018-08-22 7:44 GMT-04:00 Arik Hadas <ahadas(a)redhat.com>:
>
>> Do you remember to what version of oVirt you upgraded the 4.1/4.0/3.6
>> cluster?
>> It sounds like a bug we had (a relatively long time ago) that led to
>> having duplicated unmanaged devices.
>> Can you please check in the VM Devices tab whether you have unmanaged
>> network interfaces? If you do, it was reported that running the VM in
>> run-once mode clears those devices (but you can also remove them from the
>> database).
>>
>> On Wed, Aug 22, 2018 at 1:38 PM <daniel(a)djj-consultants.com> wrote:
>>
>>> I also tried to export (works) and re-import (works) the same VM and
>>> start the imported one, with the same result: it didn't start.
>>>
>>
>
>
> --
> Daniel Renaud
> Analyste Sécurité & Système
> DJJ Consultants, http://www.djj-consultants.com
> Téléphone : 418-907-9530, poste 200
> GPG key : 0x7189C9D0
>
6 years, 3 months
Re: live migration of hosted engine between two hosts
by Douglas Duckworth
Thanks!
I am downgrading to an older kernel to see if kdump will work. My other
hypervisor works fine with 3.10.0-862.9.1.el7 but this host has a newer one.
Sorry for the disorganized nature of this thread.
Can you tell me how to get ib0 set as migration network?
Thanks,
Douglas Duckworth, MSc, LFCS
HPC System Administrator
Scientific Computing Unit
Weill Cornell Medicine
1300 York - LC-502
E: doug(a)med.cornell.edu
O: 212-746-6305
F: 212-746-8690
On Wed, Aug 22, 2018 at 9:38 AM, Simone Tiraboschi <stirabos(a)redhat.com>
wrote:
>
>
> On Wed, Aug 22, 2018 at 3:13 PM Douglas Duckworth <dod2014(a)med.cornell.edu>
> wrote:
>
>> OK, after a while this host goes into a bad state due to "HA score".
>> Could this be due to kdump not being enabled?
>>
>
> We have an open bug on that:
> https://bugzilla.redhat.com/show_bug.cgi?id=1619365
> it's not kdump related.
>
> The host will come back to a reasonable HA score by itself after about 10
> minutes.
>
>
>>
>> Thanks,
>>
>> Douglas Duckworth, MSc, LFCS
>> HPC System Administrator
>> Scientific Computing Unit
>> Weill Cornell Medicine
>> 1300 York - LC-502
>> E: doug(a)med.cornell.edu
>> O: 212-746-6305
>> F: 212-746-8690
>>
>>
>> On Wed, Aug 22, 2018 at 7:53 AM, Douglas Duckworth <
>> dod2014(a)med.cornell.edu> wrote:
>>
>>> Thanks everyone!
>>>
>>> During reinstall ran into this error:
>>>
>>> Job for kdump.service failed because the control process exited with
>>> error code. See "systemctl status kdump.service" and "journalctl -xe" for
>>> details.
>>>
>>> 2018-08-22 07:40:57,846-0400 DEBUG otopi.plugins.ovirt_host_deploy.kdump.packages
>>> packages._closeup:291 kdump service failed
>>> Traceback (most recent call last):
>>> File "/tmp/ovirt-yFwX5RcnDb/otopi-plugins/ovirt-host-deploy/kdump/packages.py",
>>> line 289, in _closeup
>>> self.services.state('kdump', state)
>>> File "/tmp/ovirt-yFwX5RcnDb/otopi-plugins/otopi/services/systemd.py",
>>> line 141, in state
>>> service=name,
>>> RuntimeError: Failed to start service 'kdump'
>>> 2018-08-22 07:40:57,846-0400 DEBUG otopi.context
>>> context._executeMethod:143 method exception
>>> Traceback (most recent call last):
>>> File "/tmp/ovirt-yFwX5RcnDb/pythonlib/otopi/context.py", line 133, in
>>> _executeMethod
>>> method['method']()
>>> File "/tmp/ovirt-yFwX5RcnDb/otopi-plugins/ovirt-host-deploy/kdump/packages.py",
>>> line 294, in _closeup
>>> 'kdump service restart failed. Please either redeploy '
>>> RuntimeError: kdump service restart failed. Please either redeploy with
>>> Kdump Integration disabled or fix kdump configuration manually and redeploy
>>> the host
>>> 2018-08-22 07:40:57,847-0400 ERROR otopi.context
>>> context._executeMethod:152 Failed to execute stage 'Closing up': kdump
>>> service restart failed. Please either redeploy with Kdump Integration
>>> disabled or fix kdump configuration manually and redeploy the host
>>>
>>>
>>>
>>> I was then able to successfully re-install hypervisor after disabling
>>> kdump.
>>>
>>> Migration now works!!!!
>>>
>>> So both hosts have Infiniband. How can I set that up as Migration
>>> network? I would rather that ovirtmgmt not be the migration network since
>>> it's only 1Gb vs 56Gb FDR.
>>>
>>> Thanks,
>>>
>>> Douglas Duckworth, MSc, LFCS
>>> HPC System Administrator
>>> Scientific Computing Unit
>>> Weill Cornell Medicine
>>> 1300 York - LC-502
>>> E: doug(a)med.cornell.edu
>>> O: 212-746-6305
>>> F: 212-746-8690
>>>
>>>
>>> On Wed, Aug 22, 2018 at 3:58 AM, Maton, Brett <matonb(a)ltresources.co.uk>
>>> wrote:
>>>
>>>> What used to catch me out here, is that you need to set 'Choose hosted
>>>> engine deployment action' to deploy when adding a new physical host.
>>>>
>>>> On 22 August 2018 at 08:51, Simone Tiraboschi <stirabos(a)redhat.com>
>>>> wrote:
>>>>
>>>>> The hosts that are eligible for running the engine VM should be
>>>>> flagged with a silver crown, and the host that runs the engine VM with
>>>>> a gold crown.
>>>>> In your screenshot I see the gold crown but not the silver one on the
>>>>> second host so you have to double check if hosted-engine stuff has been
>>>>> correctly deployed there.
>>>>>
>>>>>
>>>>> On Wed, Aug 22, 2018 at 4:53 AM Yihui Zhao <yzhao(a)redhat.com> wrote:
>>>>>
>>>>>> When adding the second host, whether to select the
>>>>>> hostedengine deployment?
>>>>>>
>>>>>> On Wed, Aug 22, 2018 at 5:43 AM, Douglas Duckworth <
>>>>>> dod2014(a)med.cornell.edu> wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> I am trying to live migrate my hosted engine between two hosts.
>>>>>>>
>>>>>>> Both hosts are now up.
>>>>>>>
>>>>>>> The hosted engine exists on shared NFS storage mounted on both
>>>>>>> hypervisors.
>>>>>>>
>>>>>>> Though when I tried to migrate the VM I am told that's not possible.
>>>>>>>
>>>>>>> Could this be since I never defined migration network? If so I
>>>>>>> tried doing that in the oVirt UI as described at
>>>>>>> https://ovirt.org/documentation/admin-guide/chap-Logical_Networks/
>>>>>>> though many of these options have changed.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Douglas Duckworth, MSc, LFCS
>>>>>>> HPC System Administrator
>>>>>>> Scientific Computing Unit
>>>>>>> Weill Cornell Medicine
>>>>>>> 1300 York - LC-502
>>>>>>> E: doug(a)med.cornell.edu
>>>>>>> O: 212-746-6305
>>>>>>> F: 212-746-8690
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
6 years, 3 months
Re: live migration of hosted engine between two hosts
by Douglas Duckworth
Thanks everyone!
During the reinstall I ran into this error:
Job for kdump.service failed because the control process exited with error
code. See "systemctl status kdump.service" and "journalctl -xe" for details.
2018-08-22 07:40:57,846-0400 DEBUG
otopi.plugins.ovirt_host_deploy.kdump.packages packages._closeup:291 kdump
service failed
Traceback (most recent call last):
File
"/tmp/ovirt-yFwX5RcnDb/otopi-plugins/ovirt-host-deploy/kdump/packages.py",
line 289, in _closeup
self.services.state('kdump', state)
File "/tmp/ovirt-yFwX5RcnDb/otopi-plugins/otopi/services/systemd.py",
line 141, in state
service=name,
RuntimeError: Failed to start service 'kdump'
2018-08-22 07:40:57,846-0400 DEBUG otopi.context context._executeMethod:143
method exception
Traceback (most recent call last):
File "/tmp/ovirt-yFwX5RcnDb/pythonlib/otopi/context.py", line 133, in
_executeMethod
method['method']()
File
"/tmp/ovirt-yFwX5RcnDb/otopi-plugins/ovirt-host-deploy/kdump/packages.py",
line 294, in _closeup
'kdump service restart failed. Please either redeploy '
RuntimeError: kdump service restart failed. Please either redeploy with
Kdump Integration disabled or fix kdump configuration manually and redeploy
the host
2018-08-22 07:40:57,847-0400 ERROR otopi.context context._executeMethod:152
Failed to execute stage 'Closing up': kdump service restart failed. Please
either redeploy with Kdump Integration disabled or fix kdump configuration
manually and redeploy the host
I was then able to successfully re-install hypervisor after disabling kdump.
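For completeness, a generic kdump triage that might avoid disabling it
altogether (stock CentOS 7 commands, not verified on this host):
systemctl status kdump -l            # the actual failure reason
grep crashkernel /proc/cmdline       # is memory reserved for the crash kernel?
kdumpctl restart                     # retry after fixing /etc/kdump.conf or the cmdline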
Migration now works!!!!
So both hosts have Infiniband. How can I set that up as Migration
network? I would rather that ovirtmgmt not be the migration network since
it's only 1Gb vs 56Gb FDR.
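If memory serves, the migration role is assigned per cluster: create a
logical network for ib0, tick "Migration Network" for it under Compute ->
Clusters -> Logical Networks -> Manage Networks, then attach it to ib0 on
each host via Setup Host Networks. The same thing over the REST API would
look roughly like this (the hostname, password and IDs are placeholders;
verify the payload against your engine's API reference):
curl -k -u 'admin@internal:PASSWORD' -X PUT \
  -H 'Content-Type: application/xml' \
  -d '<network><usages><usage>migration</usage><usage>vm</usage></usages></network>' \
  https://engine.example/ovirt-engine/api/clusters/CLUSTER_ID/networks/NETWORK_ID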
Thanks,
Douglas Duckworth, MSc, LFCS
HPC System Administrator
Scientific Computing Unit
Weill Cornell Medicine
1300 York - LC-502
E: doug(a)med.cornell.edu
O: 212-746-6305
F: 212-746-8690
On Wed, Aug 22, 2018 at 3:58 AM, Maton, Brett <matonb(a)ltresources.co.uk>
wrote:
> What used to catch me out here, is that you need to set 'Choose hosted
> engine deployment action' to deploy when adding a new physical host.
>
> On 22 August 2018 at 08:51, Simone Tiraboschi <stirabos(a)redhat.com> wrote:
>
>> The hosts that are eligible for running the engine VM should be flagged
>> with a silver crown, and the host that runs the engine VM with a gold crown.
>> In your screenshot I see the gold crown but not the silver one on the
>> second host so you have to double check if hosted-engine stuff has been
>> correctly deployed there.
>>
>>
>> On Wed, Aug 22, 2018 at 4:53 AM Yihui Zhao <yzhao(a)redhat.com> wrote:
>>
>>> When adding the second host, whether to select the
>>> hostedengine deployment?
>>>
>>> On Wed, Aug 22, 2018 at 5:43 AM, Douglas Duckworth <
>>> dod2014(a)med.cornell.edu> wrote:
>>>
>>>> Hi
>>>>
>>>> I am trying to live migrate my hosted engine between two hosts.
>>>>
>>>> Both hosts are now up.
>>>>
>>>> The hosted engine exists on shared NFS storage mounted on both
>>>> hypervisors.
>>>>
>>>> Though when I tried to migrate the VM I am told that's not possible.
>>>>
>>>> Could this be since I never defined migration network? If so I tried
>>>> doing that in the oVirt UI as described at
>>>> https://ovirt.org/documentation/admin-guide/chap-Logical_Networks/
>>>> though many of these options have changed.
>>>>
>>>> Thanks,
>>>>
>>>> Douglas Duckworth, MSc, LFCS
>>>> HPC System Administrator
>>>> Scientific Computing Unit
>>>> Weill Cornell Medicine
>>>> 1300 York - LC-502
>>>> E: doug(a)med.cornell.edu
>>>> O: 212-746-6305
>>>> F: 212-746-8690
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>
6 years, 3 months