On Tue, Dec 13, 2016 at 1:56 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:


On 13/12/2016 12:38, Gianluca Cecchi wrote:
> flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
> pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb
> rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
> nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx
> est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt
> tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch
> ida arat epb pln pts dtherm hwp hwp_noitfy hwp_act_window hwp_epp
> tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep
> bmi2 erms invpcid mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1
> xsaves
> bogomips: 3600.06
> clflush size: 64
> cache_alignment: 64
> address sizes: 39 bits physical, 48 bits virtual
> power management:
>
> . . .
>
> What is the flag to check?

It's erms, which is there.  But it's not the culprit.
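
For anyone repeating this check, the flag can be looked up directly instead of reading the list by eye. This is only a minimal sketch, assuming the usual Linux /proc/cpuinfo layout with one or more "flags" lines. The function name has_cpu_flag is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# has_cpu_flag FLAG [FILE]
# Succeeds if FLAG appears as a whole word on a "flags" line of FILE
# (default: /proc/cpuinfo). The function name is hypothetical.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # Put every flag on its own line, then require an exact, fixed-string
    # match so that e.g. "sma" does not match "smap".
    grep '^flags' "$file" | tr ' \t' '\n\n' | grep -qxF -- "$flag"
}
```

On the host quoted above, `has_cpu_flag erms && echo present` would be expected to print `present`, since erms is in the quoted flags list.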

Sorry if you have already tested it, but have you tried using 7.2 kernel
with QEMU 2.6, and then 7.3 kernel with QEMU 2.3?  That would allow
finding the guilty component more easily.

Thanks,


No problem.

- 7.3 kernel with QEMU 2.3 seems OK.
This is the configuration that was used to deploy the self-hosted engine VM.

[root@ovirt41 ~]# rpm -q qemu-kvm-ev
qemu-kvm-ev-2.3.0-31.el7_2.21.1.x86_64

[root@ovirt41 ~]# uname -r
3.10.0-514.el7.x86_64
[root@ovirt41 ~]#
(This appears to be the 7.3 kernel, based on https://access.redhat.com/articles/3078)
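
To keep track of which kernel series a host is running during this kind of bisection, the mapping can be written down as a small helper. This is a sketch only. It covers just the two kernel series that appear in this thread (3.10.0-327 treated as 7.2, 3.10.0-514 as 7.3), and the function name el7_minor_for_kernel is hypothetical.

```shell
#!/bin/sh
# el7_minor_for_kernel RELEASE
# Prints the EL7 minor release that a kernel build belongs to.
# Only the two series seen in this thread are mapped; anything else
# prints "unknown". The function name is hypothetical.
el7_minor_for_kernel() {
    case $1 in
        3.10.0-327.*) echo 7.2 ;;
        3.10.0-514.*) echo 7.3 ;;
        *)            echo unknown ;;
    esac
}
```

A typical call would be `el7_minor_for_kernel "$(uname -r)"`.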

[root@ovirt41 ~]# ps -ef| grep qemu-kvm
qemu      53257      1  3 Dec07 ?        05:56:52 /usr/libexec/qemu-kvm -name guest=HostedEngine,debug-threads=on -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu Broadwell,+rtm,+hle -m 6184 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 2a262cdc-9102-4061-841f-ec64333cdad2 -smbios type=1,manufacturer=oVirt,product=oVirt Node,version=7-2.1511.el7.centos.2.10,serial=564D3726-E55D-5C11-DC45-CA1A50480E83,uuid=2a262cdc-9102-4061-841f-ec64333cdad2 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-2-HostedEngine/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2016-12-07T10:16:42,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-reboot -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive if=none,id=drive-ide0-1-0,readonly=on -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/var/run/vdsm/storage/3e7d4336-c2e1-4fdc-99e7-81a0e69cf3a3/286a8fda-b77d-48b8-80a9-15b63e5321a2/63bfeca6-dc92-4145-845d-e785a18de949,format=raw,if=none,id=drive-virtio-disk0,serial=286a8fda-b77d-48b8-80a9-15b63e5321a2,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=32 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:16:3e:08:cc:5a,bus=pci.0,addr=0x3 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/2a262cdc-9102-4061-841f-ec64333cdad2.com.redhat.rhevm.vdsm,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/2a262cdc-9102-4061-841f-ec64333cdad2.org.qemu.guest_agent.0,server,nowait -device 
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev socket,id=charchannel2,path=/var/lib/libvirt/qemu/channels/2a262cdc-9102-4061-841f-ec64333cdad2.org.ovirt.hosted-engine-setup.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=org.ovirt.hosted-engine-setup.0 -chardev pty,id=charconsole0 -device virtconsole,chardev=charconsole0,id=console0 -vnc 0:0,password -device VGA,id=video0,vgamem_mb=32,bus=pci.0,addr=0x2 -object rng-random,id=objrng0,filename=/dev/random -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x7 -msg timestamp=on


The VM is OK; I can ssh to it:

[root@ovirt41 ~]# ssh ovirt41she.localdomain.local
The authenticity of host 'ovirt41she.localdomain.local (192.168.150.122)' can't be established.
ECDSA key fingerprint is 24:fc:fa:07:14:4e:b3:ea:3e:9b:bc:8a:6a:3e:a7:76.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ovirt41she.localdomain.local,192.168.150.122' (ECDSA) to the list of known hosts.
root@ovirt41she.localdomain.local's password:
[root@ovirt41she ~]#


- 7.2 kernel with QEMU 2.6

This seems to work too; see below.

QEMU 2.6 is not yet available in my repos:
[root@ovirt41 ~]# yum update qemu-kvm-ev
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
 * base: repo.de.bigstepcloud.com
 * epel: ftp.nluug.nl
 * extras: mirror.crazynetwork.it
 * ovirt-4.1: ftp.nluug.nl
 * ovirt-4.1-epel: ftp.nluug.nl
No packages marked for update

What I tested, based on Sandro's initial mail in the thread, is the following, which I have now reproduced:

- put oVirt in maintenance and shut down the self-hosted engine VM
[root@ovirt41 ~]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : ovirt41.localdomain.local
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 8d9d58c2
Host timestamp                     : 609397
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=609397 (Tue Dec 13 17:53:41 2016)
        host-id=1
        score=3400
        maintenance=False
        state=EngineUp
        stopped=False
[root@ovirt41 ~]#

[root@ovirt41 ~]# hosted-engine --set-maintenance --mode=global

ssh to the VM and shut it down

[root@ovirt41she ~]# shutdown -h now


Back on the host:
[root@ovirt41 ~]# ps -ef|grep qemu
root     101759   1459  0 17:56 pts/0    00:00:00 grep --color=auto qemu
[root@ovirt41 ~]#

[root@ovirt41 ~]# hosted-engine --vm-status


!! Cluster is in GLOBAL MAINTENANCE mode !!



--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : ovirt41.localdomain.local
Host ID                            : 1
Engine status                      : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "up"}
Score                              : 3000
stopped                            : False
Local maintenance                  : False
crc32                              : a246ec87
Host timestamp                     : 609544
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=609544 (Tue Dec 13 17:56:09 2016)
        host-id=1
        score=3000
        maintenance=False
        state=GlobalMaintenance
        stopped=False


!! Cluster is in GLOBAL MAINTENANCE mode !!

[root@ovirt41 ~]#

- install qemu-kvm-ev 2.6

[root@ovirt41 ~]# yum update http://buildlogs.centos.org/centos/7/virt/x86_64/kvm-common/qemu-kvm-ev-2.6.0-27.1.el7.x86_64.rpm http://buildlogs.centos.org/centos/7/virt/x86_64/kvm-common/qemu-img-ev-2.6.0-27.1.el7.x86_64.rpm http://buildlogs.centos.org/centos/7/virt/x86_64/kvm-common/qemu-kvm-common-ev-2.6.0-27.1.el7.x86_64.rpm http://buildlogs.centos.org/centos/7/virt/x86_64/kvm-common/qemu-kvm-tools-ev-2.6.0-27.1.el7.x86_64.rpm

- set kernel 3.10.0-327.36.3.el7 as the default boot entry

* Mon Oct 24 2016 CentOS Sources <bugs@centos.org> - 3.10.0-327.36.3.el7
- Apply debranding changes

[root@ovirt41 ~]# grub2-editenv list
saved_entry=CentOS Linux (3.10.0-514.el7.x86_64) 7 (Core)
[root@ovirt41 ~]#

[root@ovirt41 ~]# awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux (3.10.0-514.el7.x86_64) 7 (Core)
CentOS Linux (3.10.0-327.36.3.el7.x86_64) 7 (Core)
CentOS Linux (3.10.0-327.36.1.el7.x86_64) 7 (Core)
CentOS Linux (3.10.0-327.22.2.el7.x86_64) 7 (Core)
CentOS Linux (3.10.0-327.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-65496e25d5a842b58090b6a9f4246e68) 7 (Core)
[root@ovirt41 ~]#

[root@ovirt41 ~]# grub2-set-default 'CentOS Linux (3.10.0-327.36.3.el7.x86_64) 7 (Core)'
[root@ovirt41 ~]# grub2-editenv list
saved_entry=CentOS Linux (3.10.0-327.36.3.el7.x86_64) 7 (Core)
[root@ovirt41 ~]#
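
The menu entry title passed to grub2-set-default above was copied by hand from the awk listing. A small helper can pick it out by kernel version instead. This is a sketch that reuses the same awk field split as above. The function name grub_entry_for_kernel is hypothetical, and it assumes menuentry lines start at column 0 with the title in single quotes, as in the listing above.

```shell
#!/bin/sh
# grub_entry_for_kernel VERSION [CFG]
# Prints the title of the first menuentry in CFG (default: /etc/grub2.cfg)
# whose title contains VERSION. Pass the full version as printed by
# "uname -r" to avoid matching a sibling build of the same series.
# The function name is hypothetical.
grub_entry_for_kernel() {
    version=$1
    cfg=${2:-/etc/grub2.cfg}
    # Split on single quotes: $1 is "menuentry ", $2 is the entry title.
    awk -F\' -v ver="$version" \
        '$1 == "menuentry " && index($2, ver) { print $2; exit }' "$cfg"
}
```

It could then be combined with the commands already used here, for example `grub2-set-default "$(grub_entry_for_kernel 3.10.0-327.36.3.el7.x86_64)"` followed by `grub2-editenv list` to confirm the saved entry.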

[root@ovirt41 ~]# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-3.10.0-514.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-514.el7.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-327.36.3.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-327.36.3.el7.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-327.36.1.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-327.36.1.el7.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-327.22.2.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-327.22.2.el7.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-327.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-327.el7.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-65496e25d5a842b58090b6a9f4246e68
Found initrd image: /boot/initramfs-0-rescue-65496e25d5a842b58090b6a9f4246e68.img
done
[root@ovirt41 ~]#


[root@ovirt41 ~]# hosted-engine --set-maintenance --mode=local
[root@ovirt41 ~]#

- reboot host

- after reboot

[root@ovirt41 ~]# uname -r
3.10.0-327.36.3.el7.x86_64
[root@ovirt41 ~]#

- exit from maintenance and see whether the hosted engine VM starts

[root@ovirt41 ~]# hosted-engine --set-maintenance --mode=none
[root@ovirt41 ~]#

The VM seems to start:

[root@ovirt41 qemu]# ps -ef|grep qemu
qemu       3485      1 59 18:21 ?        00:00:41 /usr/libexec/qemu-kvm -name guest=HostedEngine,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-HostedEngine/master-key.aes -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu Broadwell,+rtm,+hle -m 6184 -realtime mlock=off -smp 1,maxcpus=16,sockets=16,cores=1,threads=1 -uuid 2a262cdc-9102-4061-841f-ec64333cdad2 -smbios type=1,manufacturer=oVirt,product=oVirt Node,version=7-2.1511.el7.centos.2.10,serial=564D3726-E55D-5C11-DC45-CA1A50480E83,uuid=2a262cdc-9102-4061-841f-ec64333cdad2 -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-1-HostedEngine/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2016-12-13T17:21:33,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-reboot -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x3 -drive file=/var/run/vdsm/storage/3e7d4336-c2e1-4fdc-99e7-81a0e69cf3a3/286a8fda-b77d-48b8-80a9-15b63e5321a2/63bfeca6-dc92-4145-845d-e785a18de949,format=raw,if=none,id=drive-virtio-disk0,serial=286a8fda-b77d-48b8-80a9-15b63e5321a2,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive if=none,id=drive-ide0-1-0,readonly=on -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=30,id=hostnet0,vhost=on,vhostfd=32 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:16:3e:08:cc:5a,bus=pci.0,addr=0x2 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/2a262cdc-9102-4061-841f-ec64333cdad2.com.redhat.rhevm.vdsm,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev 
socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/2a262cdc-9102-4061-841f-ec64333cdad2.org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev socket,id=charchannel2,path=/var/lib/libvirt/qemu/channels/2a262cdc-9102-4061-841f-ec64333cdad2.org.ovirt.hosted-engine-setup.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=org.ovirt.hosted-engine-setup.0 -msg timestamp=on
root       3845   1943  0 18:22 pts/0    00:00:00 grep --color=auto qemu

I am able to connect to it via ssh too:

[root@ovirt41 qemu]# ssh ovirt41she.localdomain.local
root@ovirt41she.localdomain.local's password:
Last login: Tue Dec 13 17:55:37 2016 from ovirt41.localdomain.local
[root@ovirt41she ~]#

- So the remaining combination to try is the 7.3 kernel with QEMU 2.6, correct?

Perhaps the problem occurred only during installation and no longer happens now that the VM has been deployed?
Gianluca