On Tue, Oct 16, 2018 at 3:23 PM Gianluca Cecchi <gianluca.cecchi(a)gmail.com>
wrote:
Hello,
I am sending a message with a dedicated subject on this topic (second attempt,
because the first one does not seem to be present in the oVirt archive).
Also, the reply to my other related message does not seem to be visible on the
list archive page for some reason.
It seems this nasty problem with nested virtualization, using the
pc-i440fx-rhel7.X.0 machine type with X >= 3, affects not only vSphere as the
outer hypervisor for nested KVM but also other hypervisors (Hyper-V) and other
machine types, and it could be due to a bug in KVM, i.e. in the kernel, if I
understood correctly.
In the meantime I'm trying to use a workaround to set up a 3-host HCI
environment using 3 VMs inside ESXi.
My approach:
- cp -p /usr/libexec/qemu-kvm /usr/libexec/qemu-kvm.orig
- rm /usr/libexec/qemu-kvm
- create a new /usr/libexec/qemu-kvm that is a wrapper:
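#!/bin/bash
# Drop any -machine option (and its value) passed by the caller, then run
# the real binary with the machine type forced to pc-i440fx-rhel7.2.0.
args=()
while [ $# -gt 0 ]; do
    case "$1" in
        -machine)
            # Skip the option and, if present, its value. A bare "shift 2"
            # fails without shifting when only one argument is left.
            shift
            [ $# -gt 0 ] && shift
            ;;
        *)
            args+=("$1")
            shift
            ;;
    esac
done
exec /usr/libexec/qemu-kvm.orig -machine pc-i440fx-rhel7.2.0 "${args[@]}"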
- chmod 755 /usr/libexec/qemu-kvm
- chcon system_u:object_r:qemu_exec_t:s0 /usr/libexec/qemu-kvm
- chcon system_u:object_r:qemu_exec_t:s0 /usr/libexec/qemu-kvm.orig
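The argument filtering done by the wrapper can be checked in isolation before
replacing the real binary. The following is only a sketch: filter_machine is a
hypothetical helper name, not something shipped with qemu or oVirt, and it
prints the command line the wrapper would exec (one argument per line) instead
of running it.

```shell
#!/bin/bash
# Hypothetical test helper (assumption, not part of the deployed wrapper):
# applies the same -machine filtering and prints the resulting command line.
filter_machine() {
    local args=()
    while [ $# -gt 0 ]; do
        case "$1" in
            -machine)
                # Drop the option and, if present, its value.
                shift
                [ $# -gt 0 ] && shift
                ;;
            *)
                args+=("$1")
                shift
                ;;
        esac
    done
    # Same final command as the wrapper, printed rather than exec'ed.
    printf '%s\n' /usr/libexec/qemu-kvm.orig -machine pc-i440fx-rhel7.2.0 "${args[@]}"
}

# Example call with a made-up machine type value.
filter_machine -machine pc-i440fx-rhel7.5.0 -cpu host -m 500
```

If the printed list contains exactly one -machine entry, followed by
pc-i440fx-rhel7.2.0, the filtering behaves as intended.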
And then I proceed with my setup from cockpit.
All goes well: the local hosted engine VM is created from the appliance,
engine-setup completes, the host is added, and the storage domain for the
engine is created. Then the deployment reaches a step where guestfish comes
into play, and I get the error below.
Running ps just before guestfish fails, I see:
[root@ovirt01 ~]# ps -ef|grep guestf
root 28812 28807 5 16:55 pts/1 00:00:00 guestfish -a
/var/tmp/localvmxmSf0U/images/65f7f081-4d9e-43ae-926f-25807f075f1d/a0a00e73-d3ea-4b9b-bd26-06fe189931f2
--rw -i copy-in /var/tmp/localvmxmSf0U/ifcfg-eth0
/etc/sysconfig/network-scripts : selinux-relabel
/etc/selinux/targeted/contexts/files/file_contexts
/etc/sysconfig/network-scripts/ifcfg-eth0 force:true
root 28833 28812 33 16:55 pts/1 00:00:00
/usr/libexec/qemu-kvm.orig -machine pc-i440fx-rhel7.2.0 -global
virtio-blk-pci.scsi=off -nodefconfig -enable-fips -nodefaults -display none
-cpu host -m 500 -no-reboot -rtc driftfix=slew -no-hpet -global
kvm-pit.lost_tick_policy=discard -kernel
/var/tmp/.guestfs-0/appliance.d/kernel -initrd
/var/tmp/.guestfs-0/appliance.d/initrd -object
rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0
-device virtio-scsi-pci,id=scsi -drive
file=/var/tmp/localvmxmSf0U/images/65f7f081-4d9e-43ae-926f-25807f075f1d/a0a00e73-d3ea-4b9b-bd26-06fe189931f2,cache=writeback,id=hd0,if=none
-device scsi-hd,drive=hd0 -drive
file=/var/tmp/.guestfs-0/appliance.d/root,snapshot=on,id=appliance,cache=unsafe,if=none,format=raw
-device scsi-hd,drive=appliance -device virtio-serial-pci -serial stdio
-chardev socket,path=/tmp/libguestfsAdBLA9/guestfsd.sock,id=channel0
-device virtserialport,chardev=channel0,name=org.libguestfs.channel.0
-append panic=1 console=ttyS0 edd=off udevtimeout=6000
udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory
usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb
selinux=0 quiet TERM=xterm-256color
root 28834 28812 0 16:55 pts/1 00:00:00 guestfish -a
/var/tmp/localvmxmSf0U/images/65f7f081-4d9e-43ae-926f-25807f075f1d/a0a00e73-d3ea-4b9b-bd26-06fe189931f2
--rw -i copy-in /var/tmp/localvmxmSf0U/ifcfg-eth0
/etc/sysconfig/network-scripts : selinux-relabel
/etc/selinux/targeted/contexts/files/file_contexts
/etc/sysconfig/network-scripts/ifcfg-eth0 force:true
But then I get this in the GUI:
libguestfs: error: appliance closed the connection unexpectedly.\nThis
usually means the libguestfs appliance crashed
Complete output:
. . .
[ INFO ] TASK [Copy configuration files to the right location on host]
[ INFO ] TASK [Copy configuration archive to storage]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Initialize metadata volume]
[ INFO ] changed: [localhost]
[ INFO ] TASK [include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Find the local appliance image]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Set local_vm_disk_path]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Generate DHCP network configuration for the engine VM]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [Generate static network configuration for the engine VM]
[ INFO ] changed: [localhost]
[ INFO ] TASK [Inject network configuration with guestfish]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": true, "cmd":
["guestfish", "-a",
"/var/tmp/localvmxmSf0U/images/65f7f081-4d9e-43ae-926f-25807f075f1d/a0a00e73-d3ea-4b9b-bd26-06fe189931f2",
"--rw", "-i", "copy-in",
"/var/tmp/localvmxmSf0U/ifcfg-eth0",
"/etc/sysconfig/network-scripts", ":", "selinux-relabel",
"/etc/selinux/targeted/contexts/files/file_contexts",
"/etc/sysconfig/network-scripts/ifcfg-eth0", "force:true"],
"delta":
"0:00:01.821590", "end": "2018-10-16 16:55:12.044900",
"msg": "non-zero
return code", "rc": 1, "start": "2018-10-16
16:55:10.223310", "stderr":
"libguestfs: error: appliance closed the connection unexpectedly.\nThis
usually means the libguestfs appliance crashed.\nDo:\n export
LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1\nand run the command again. For
further information, read:\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\nYou can also
run 'libguestfs-test-tool' and post the *complete* output\ninto a bug
report or message to the libguestfs mailing list.\nlibguestfs: error:
/usr/libexec/qemu-kvm killed by signal 6 (Aborted).\nTo see full error
messages you may need to enable debugging.\nDo:\n export LIBGUESTFS_DEBUG=1
LIBGUESTFS_TRACE=1\nand run the command again. For further information,
read:\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\nYou
can also run 'libguestfs-test-tool' and post the *complete* output\ninto a
bug report or message to the libguestfs mailing list.\nlibguestfs: error:
guestfs_launch failed.\nThis usually means the libguestfs appliance failed
to start or crashed.\nDo:\n export LIBGUESTFS_DEBUG=1
LIBGUESTFS_TRACE=1\nand run the command again. For further information,
read:\n
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\nYou
can also run 'libguestfs-test-tool' and post the *complete* output\ninto a
bug report or message to the libguestfs mailing list.", "stderr_lines":
["libguestfs: error: appliance closed the connection unexpectedly.", "This
usually means the libguestfs appliance crashed.", "Do:", " export
LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1", "and run the command again. For
further information, read:", "
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs", "You can
also run 'libguestfs-test-tool' and post the *complete* output", "into
a
bug report or message to the libguestfs mailing list.", "libguestfs: error:
/usr/libexec/qemu-kvm killed by signal 6 (Aborted).", "To see full error
messages you may need to enable debugging.", "Do:", " export
LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1", "and run the command again. For
further information, read:", "
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs", "You can
also run 'libguestfs-test-tool' and post the *complete* output", "into
a
bug report or message to the libguestfs mailing list.", "libguestfs: error:
guestfs_launch failed.", "This usually means the libguestfs appliance
failed to start or crashed.", "Do:", " export LIBGUESTFS_DEBUG=1
LIBGUESTFS_TRACE=1", "and run the command again. For further information,
read:", "
http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs",
"You can also run 'libguestfs-test-tool' and post the *complete*
output",
"into a bug report or message to the libguestfs mailing list."],
"stdout":
"", "stdout_lines": []}
Any hint on how to debug the guestfish problem, i.e. where to put the
suggested debug environment variables so that cockpit picks them up, or on how
to understand whether the problem is related to being nested inside ESXi?
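One option I can think of, untested here, is to apply the same wrapper idea
to guestfish itself so that the two variables named in the error text are set
no matter who launches it. This is only a sketch under assumptions: the
function name run_guestfish_debug, the GUESTFISH_REAL override, and the
/usr/bin/guestfish.orig location are all hypothetical and not taken from the
oVirt deployment code.

```shell
#!/bin/bash
# Sketch of a debug wrapper for guestfish (assumption: the real binary has
# been moved to /usr/bin/guestfish.orig; GUESTFISH_REAL is an invented
# override used only for illustration).
run_guestfish_debug() {
    local real="${GUESTFISH_REAL:-/usr/bin/guestfish.orig}"
    # These are the variables the libguestfs error message asks for.
    LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1 "$real" "$@"
}

# Usage inside a wrapper script would be:
#   run_guestfish_debug "$@"
```

With such a wrapper in place the extra libguestfs output should end up in the
stderr field of the failed Ansible task. The error text also suggests running
libguestfs-test-tool, which can be started by hand on the host.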
The nodes are ovirt-ng-nodes based on ovirt-node-ng-4.2.6.1-0.20180913.0
Thanks,
Gianluca