[ovirt-users] Help! My hosted engine lost his nic!

Simone Tiraboschi stirabos at redhat.com
Thu Dec 1 17:11:32 UTC 2016


On Thu, Dec 1, 2016 at 5:16 PM, Cristian Mammoli <c.mammoli at apra.it> wrote:

> Here it is: http://cloud.apra.it/index.php/s/4cdcde8cafdb7a1c2c2374b02dce118e
>
> I tarred all the agent.log on both servers.
>
> The engine was running on kvm01 and was shut down there around 10:35 AM
> on 29 November. But I think that's not the problem: it is supposed to shut
> down if the host can't reach the gateway. Probably the nic problem was
> already there and only showed up on reboot.
>
> Btw I kept digging: I extracted the ovf from which vm.conf is generated:
>
> ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
> OVF_STORE volume path: /rhev/data-center/mnt/blockSD/2c3585cc-b7bc-4881-85b3-aa6514991a26/images/9c5e2121-f1a3-4886-964c-c74fdfbbb3c1/ff765055-09c5-4b05-9cc7-5277b15c5d08
>
> # tar xvf /rhev/data-center/mnt/blockSD/2c3585cc-b7bc-4881-85b3-aa6514991a26/images/9c5e2121-f1a3-4886-964c-c74fdfbbb3c1/ff765055-09c5-4b05-9cc7-5277b15c5d08
> 497f5e4a-0c76-441a-b72e-724d7092d07e.ovf
> info.json
>
> In the ovf file there is no Nic section...
>
>
Ciao Cristian,
do you see any interface for the engine VM in the engine admin portal?

Could you please execute this on the engine VM and share its output?
    sudo -u postgres psql engine -c "select * from vm_device where type='interface' and vm_id='497f5e4a-0c76-441a-b72e-724d7092d07e'"
    sudo -u postgres psql engine -c "select * from vms where vm_guid='497f5e4a-0c76-441a-b72e-724d7092d07e'"
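
If you want to double-check the extracted OVF as well, here is a minimal Python sketch. It is not part of ovirt-hosted-engine-ha and the function names are made up for illustration. It assumes that NICs appear in the OVF as Item elements whose ResourceType is 10 (the CIM value for an Ethernet adapter), so please compare the result with what the file really contains.

```python
import xml.etree.ElementTree as ET

# Assumption: NICs are exported as <Item> elements with ResourceType 10
# (CIM "Ethernet Adapter"). Verify this against a known-good OVF.
NIC_RESOURCE_TYPE = "10"


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def find_nic_items(ovf_text):
    """Return the Item elements that have a ResourceType child equal to 10."""
    root = ET.fromstring(ovf_text)
    nics = []
    for elem in root.iter():
        if local_name(elem.tag) != "Item":
            continue
        for child in elem:
            is_resource_type = local_name(child.tag) == "ResourceType"
            if is_resource_type and (child.text or "").strip() == NIC_RESOURCE_TYPE:
                nics.append(elem)
                break
    return nics
```

Reading 497f5e4a-0c76-441a-b72e-724d7092d07e.ovf into a string and passing it to find_nic_items() should return an empty list if the OVF really carries no NIC.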

thanks



> I uploaded the ovf on the same share as the logs
>
> Ty
>
>
> Il 01/12/2016 15:26, Yedidyah Bar David ha scritto:
>
>> On Thu, Dec 1, 2016 at 1:08 PM, Cristian Mammoli <c.mammoli at apra.it>
>> wrote:
>>
>>> Hi, I upgraded an oVirt installation a month ago to the latest 3.6.7.
>>> Before it was 3.6.0 if I remember correctly.
>>> Everything went fine for a month or so.
>>>
>>> A couple of days ago the default gateway got rebooted and the physical
>>> server hosting the HE decided to shut down the vm because it could not
>>> ping the gateway.
>>> The other host restarted the hevm but it now has *no nic*.
>>> As a workaround I attached a virtio nic via virsh but every time the vm
>>> gets restarted the nic gets lost.
>>>
>>> After a bit of troubleshooting and digging this is what I found:
>>>
>>> This is the /var/run/ovirt-hosted-engine-ha/vm.conf which, as far as I
>>> understand, gets extracted from the HE storage domain
>>>
>>> emulatedMachine=pc
>>> vmId=497f5e4a-0c76-441a-b72e-724d7092d07e
>>> smp=2
>>> memSize=6144
>>> spiceSecureChannels=smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir
>>> vmName=HostedEngine
>>> display=vnc
>>> devices={index:0,iface:virtio,format:raw,bootOrder:1,address:{slot:0x06,bus:0x00,domain:0x0000,type:pci,function:0x0},volumeID:bb3218ba-cbe9-4cd0-b50b-931deae992f7,imageID:d65b82e2-2ad1-4f4f-bfad-0277c37f2808,readonly:false,domainID:2c3585cc-b7bc-4881-85b3-aa6514991a26,deviceId:d65b82e2-2ad1-4f4f-bfad-0277c37f2808,poolID:00000000-0000-0000-0000-000000000000,device:disk,shared:exclusive,propagateErrors:off,type:disk}
>>> devices={index:2,iface:ide,shared:false,readonly:true,deviceId:8c3179ac-b322-4f5c-9449-c52e3665e0ae,address:{controller:0,target:0,unit:0,bus:1,type:drive},device:cdrom,path:,type:disk}
>>> devices={device:cirrus,alias:video0,type:video,deviceId:a99468b6-02d4-4a77-8f94-e5df806030f6,address:{slot:0x02,bus:0x00,domain:0x0000,type:pci,function:0x0}}
>>> devices={device:virtio-serial,type:controller,deviceId:b7580676-19fb-462f-a61e-677b65ad920a,address:{slot:0x03,bus:0x00,domain:0x0000,type:pci,function:0x0}}
>>> devices={device:usb,type:controller,deviceId:c63092b3-7bd8-4b54-bcd3-51f34dce478a,address:{slot:0x01,bus:0x00,domain:0x0000,type:pci,function:0x2}}
>>> devices={device:ide,type:controller,deviceId:c77c2c01-6ccc-404b-b8d6-5a7f0631a52f,address:{slot:0x01,bus:0x00,domain:0x0000,type:pci,function:0x1}}
>>>
>>> As you can see there is no nic, and there is no nic in the qemu-kvm
>>> command-line:
>>> qemu     23290     1 14 00:23 ?        01:44:26 /usr/libexec/qemu-kvm
>>> -name HostedEngine
>>> -S
>>> -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off
>>> -cpu qemu64,-svm
>>> -m 6144
>>> -realtime mlock=off
>>> -smp 2,sockets=2,cores=1,threads=1
>>> -uuid 497f5e4a-0c76-441a-b72e-724d7092d07e
>>> -smbios type=1,manufacturer=oVirt,product=oVirt Node,version=7-2.1511.el7.centos.2.10,serial=4C4C4544-004B-5710-8044-B9C04F5A3732,uuid=497f5e4a-0c76-441a-b72e-724d7092d07e
>>> -no-user-config
>>> -nodefaults
>>> -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-HostedEngine/monitor.sock,server,nowait
>>> -mon chardev=charmonitor,id=monitor,mode=control
>>> -rtc base=2016-11-30T23:23:26,driftfix=slew
>>> -global kvm-pit.lost_tick_policy=discard
>>> -no-hpet
>>> -no-reboot
>>> -boot strict=on
>>> -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
>>> -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x3
>>> -drive file=/var/run/vdsm/storage/2c3585cc-b7bc-4881-85b3-aa6514991a26/d65b82e2-2ad1-4f4f-bfad-0277c37f2808/bb3218ba-cbe9-4cd0-b50b-931deae992f7,if=none,id=drive-virtio-disk0,format=raw,serial=d65b82e2-2ad1-4f4f-bfad-0277c37f2808,cache=none,werror=stop,rerror=stop,aio=native
>>> -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>>> -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw
>>> -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
>>> -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/497f5e4a-0c76-441a-b72e-724d7092d07e.com.redhat.rhevm.vdsm,server,nowait
>>> -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm
>>> -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/497f5e4a-0c76-441a-b72e-724d7092d07e.org.qemu.guest_agent.0,server,nowait
>>> -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
>>> -chardev socket,id=charchannel2,path=/var/lib/libvirt/qemu/channels/497f5e4a-0c76-441a-b72e-724d7092d07e.org.ovirt.hosted-engine-setup.0,server,nowait
>>> -device virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=org.ovirt.hosted-engine-setup.0
>>> -vnc 0:0,password
>>> -device cirrus-vga,id=video0,bus=pci.0,addr=0x2
>>> -msg timestamp=on
>>>
>>> I extracted the vm.conf from the storage domain and the nic is there:
>>> vmId=497f5e4a-0c76-441a-b72e-724d7092d07e
>>> memSize=6144
>>> display=vnc
>>> devices={index:2,iface:ide,address:{ controller:0, target:0,unit:0, bus:1, type:drive},specParams:{},readonly:true,deviceId:857b98b3-cf43-4c2d-8061-e7f105234a65,path:,device:cdrom,shared:false,type:disk}
>>> devices={index:0,iface:virtio,format:raw,poolID:00000000-0000-0000-0000-000000000000,volumeID:bb3218ba-cbe9-4cd0-b50b-931deae992f7,imageID:d65b82e2-2ad1-4f4f-bfad-0277c37f2808,specParams:{},readonly:false,domainID:2c3585cc-b7bc-4881-85b3-aa6514991a26,optional:false,deviceId:d65b82e2-2ad1-4f4f-bfad-0277c37f2808,address:{bus:0x00, slot:0x06, domain:0x0000, type:pci, function:0x0},device:disk,shared:exclusive,propagateErrors:off,type:disk,bootOrder:1}
>>> devices={device:scsi,model:virtio-scsi,type:controller}
>>> devices={nicModel:pv,macAddr:00:16:3e:7d:d8:27,linkActive:true,network:ovirtmgmt,filter:vdsm-no-mac-spoofing,specParams:{},deviceId:5be8a089-9f51-46dc-a8bd-28422985aa35,address:{bus:0x00, slot:0x03, domain:0x0000, type:pci, function:0x0},device:bridge,type:interface}
>>> devices={device:console,specParams:{},type:console,deviceId:1644f556-a4ff-4c93-8945-5aa165de2a85,alias:console0}
>>> vmName=HostedEngine
>>> spiceSecureChannels=smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir
>>> smp=2
>>> cpuType=SandyBridge
>>> emulatedMachine=pc
>>>
>>> The local vm.conf gets continuously overwritten but for some reason the
>>> nic line gets lost in the process.
>>>
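
The comparison between the two vm.conf copies can be done mechanically. The following is only a rough Python sketch, not something taken from the HA agent: the parsing rules (one "devices={...}" entry per line, comma-separated key:value pairs, nested braces for the address) are inferred from the dumps quoted here, and the function names are hypothetical.

```python
def top_level_fields(body):
    """Split 'k:v,k:{a:b,c:d},k:v' on commas that are not inside braces."""
    fields, depth, current = [], 0, []
    for ch in body:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        fields.append("".join(current))
    return fields


def device_summary(vm_conf_text):
    """Return the set of (type, device) pairs found in devices={...} lines."""
    summary = set()
    for line in vm_conf_text.splitlines():
        line = line.strip()
        if not (line.startswith("devices={") and line.endswith("}")):
            continue
        body = line[len("devices={"):-1]  # drop only the outer braces
        attrs = {}
        for field in top_level_fields(body):
            key, _, value = field.partition(":")
            attrs[key.strip()] = value.strip()
        summary.add((attrs.get("type"), attrs.get("device")))
    return summary


def missing_devices(stored_text, local_text):
    """Devices present in the stored vm.conf but absent from the local one."""
    return device_summary(stored_text) - device_summary(local_text)
```

Run against the two dumps quoted here, stored copy first, the result should include the ("interface", "bridge") entry that the local file lacks.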
>> Can you please check/share /var/log/ovirt-hosted-engine-ha/agent.log?
>> Preferably all of it (including backups)? Thanks.
>>
>
> --
> Mammoli Cristian
> System administrator
> T. +39 0731 22911
> Via Brodolini 6 | 60035 Jesi (an)
>