----- Original Message -----
>
> > On 11 Feb 2016, at 17:02, Johannes Tiefenbacher <jojo(a)linbit.com> wrote:
> >
> > Hi,
> > finally I am posting something to this list :) I have been reading it for
> > quite some time now and have been an oVirt user since 3.0.
>
> Hi,
> welcome:)
>
> >
> >
> > I updated an engine installation from 3.2 to 3.6 (stepwise of course, and
> > yes I know that's pretty outdated ;-). Then I updated vdsm on the associated
> > CentOS 6 hosts as well, from 3.10.x to 3.16.30. I also set my cluster
> > compatibility level to 3.5 (the 3.6 level is only possible with EL7 hosts,
> > if I understood correctly).
> >
> > After my first failover test a VM could not be restarted, although the
> > host where it was running could be fenced correctly.
> >
> > The reason according to engine's log was this:
> >
> > VM xxxxxxxx is down with error. Exit message: internal error process
> > exited
> > while connecting to monitor: qemu-kvm: -device
> > virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x4:
> > Duplicate ID 'virtio-serial0' for device
> >
> >
> > I then realized that I was not able to run this VM on any host. I
> > checked the virtual hardware in the engine database and could confirm
> > that ALL my VMs had this problem: 2 devices with alias='virtio-serial0'
>
> It may very well be a bug, but it would be quite difficult to say unless it
> is reproducible. It may have been broken since an earlier release.
> Arik/Shmuel, maybe it rings a bell?
In 3.6 we changed virtio-serial to be a managed device.
The script named 03_06_0310_change_virtio_serial_to_managed_device.sql
changes unmanaged virtio-serial devices (that were all unmanaged before) to
be managed.
One flow I can think of that would cause this duplication:
1. A VM is running on a pre-3.6 engine - it has an unmanaged virtio-serial
2. The engine is upgraded to 3.6 while the VM is running - the unmanaged
virtio-serial becomes managed
3. Something changes the hash of the devices
=> the engine adds an additional unmanaged virtio-serial device
Why didn't it happen before? Because the handling of unmanaged devices was:
1. Upon change in the VM devices (their hash), ask for all the devices
(full-list)
2. Remove all previous unmanaged devices
3. Add every device that does not exist in the database
When we add an unmanaged device we generate a new ID (!) - therefore we had
to remove all the previous unmanaged devices before adding the new ones.
If the previous unmanaged virtio-serial became managed, it is not removed and
we will end up having two virtio-serial devices.
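To make the flow above concrete, here is a toy model in Python. It is only a sketch and not the engine code: the names Device and sync_devices are made up for illustration, and I am assuming that "exists in the database" means the "deviceId" reported for a device matches a stored device ID, and that a VM started before the upgrade reports its virtio-serial without such an ID.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Device:
    # Hypothetical stand-in for one row of vm_device.
    name: str
    managed: bool
    device_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def sync_devices(db, reported):
    """Toy version of the handling that runs when the device hash changes."""
    # Step 2: remove all previously stored unmanaged devices.
    kept = [d for d in db if d.managed]
    known_ids = {d.device_id for d in kept}
    # Step 3: add every reported device that does not exist in the database.
    # Assumption: matching is done by the reported "deviceId" only.
    for dev in reported:
        if dev.get("deviceId") not in known_ids:
            # A new ID is generated for every unmanaged device that is added.
            kept.append(Device(dev["device"], managed=False))
    return kept


# The running VM reports its virtio-serial without an engine-side ID.
reported = [{"device": "virtio-serial"}]

# Pre-3.6: the stored device is unmanaged, so it is removed and re-added.
before_upgrade = sync_devices([Device("virtio-serial", managed=False)], reported)

# After the upgrade script: the stored device is managed, so it is kept,
# and the reported one is added next to it as a new unmanaged device.
after_upgrade = sync_devices([Device("virtio-serial", managed=True)], reported)

print(len(before_upgrade), len(after_upgrade))
```

In this model the pre-3.6 case keeps a single device, while the post-upgrade case ends up with one managed and one unmanaged virtio-serial, which matches the two records Johannes found.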
@Johannes - is it true that the VM was running before the engine got updated
to 3.6 and wasn't powered-off since then?
I managed to simulate this.
We probably need to prevent the addition of unmanaged virtio-serial in 3.6
engine but IMO we should also use the ID reported by VDSM instead of
generating a new one to eliminate similar issues in the future.
@Eli, Omer - can you recall why we can't use the ID we get from VDSM for the
unmanaged devices?
(we can continue this discussion in devel-list or in bugzilla..)
>
> >
> > e.g.:
> >
> > ----
> > engine=# SELECT * FROM vm_device WHERE vm_device.device = 'virtio-serial'
> > AND vm_id = 'cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec' ORDER BY vm_id;
> > -[ RECORD 1 ]-------------+-------------------------------------------------------------
> > device_id | 2821d03c-ce88-4613-9095-e88eadcd3792
> > vm_id | cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec
> > type | controller
> > device | virtio-serial
> > address |
> > boot_order | 0
> > spec_params | { }
> > is_managed | t
> > is_plugged | f
> > is_readonly | f
> > _create_date | 2016-01-14 08:30:43.797161+01
> > _update_date | 2016-02-10 10:04:56.228724+01
> > alias | virtio-serial0
> > custom_properties | { }
> > snapshot_id |
> > logical_name |
> > is_using_scsi_reservation | f
> > -[ RECORD 2 ]-------------+-------------------------------------------------------------
> > device_id | 29e0805f-d836-451a-9ec3-9031baa995e6
> > vm_id | cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec
> > type | controller
> > device | virtio-serial
> > address | {bus=0x00, domain=0x0000, type=pci, slot=0x04, function=0x0}
> > boot_order | 0
> > spec_params | { }
> > is_managed | f
> > is_plugged | t
> > is_readonly | f
> > _create_date | 2016-02-11 13:47:02.69992+01
> > _update_date |
> > alias | virtio-serial0
> > custom_properties |
> > snapshot_id |
> > logical_name |
> > is_using_scsi_reservation | f
> >
> > ----
> >
> > My solution was this:
> >
> > DELETE FROM vm_device WHERE vm_id='cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec'
> > AND vm_device.device = 'virtio-serial' AND address = '';
> >
> > (just renaming one of the aliases to "virtio-serial1" did not help)
I believe this is not the right solution: the query above deletes the record
with the empty address, which is the managed device. It is better to remove
the unmanaged device instead:
1. For consistency
2. We changed the virtio-serial device to be managed in order to prevent a
problem with VM-pools where in some cases Windows OS detects an existing
virtio-serial device as a new device (and therefore pops up a dialog for
searching for an appropriate driver). By having the virtio-serial device
managed we preserve its address and eliminate this problem.
After that the VM has to be restarted, of course; otherwise the unmanaged
device will be added again the next time the devices change.
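For illustration, a small Python sketch of that cleanup: find VMs with more than one virtio-serial device and remove only the unmanaged one. It runs against an in-memory SQLite table that copies just a few columns of vm_device, so it is a stand-in and not the real engine schema (the engine database is PostgreSQL, where is_managed is a boolean). The two helper function names are made up for this example.

```python
import sqlite3

VM_ID = "cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec"

# Simplified stand-in for vm_device with the two records from the report.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vm_device ("
    "device_id TEXT PRIMARY KEY, vm_id TEXT, device TEXT, "
    "is_managed INTEGER, alias TEXT)"
)
conn.executemany(
    "INSERT INTO vm_device VALUES (?, ?, ?, ?, ?)",
    [
        ("2821d03c-ce88-4613-9095-e88eadcd3792", VM_ID,
         "virtio-serial", 1, "virtio-serial0"),
        ("29e0805f-d836-451a-9ec3-9031baa995e6", VM_ID,
         "virtio-serial", 0, "virtio-serial0"),
    ],
)


def find_duplicate_virtio_serial(conn):
    """Return (vm_id, count) for every VM with more than one virtio-serial."""
    cur = conn.execute(
        "SELECT vm_id, COUNT(*) FROM vm_device "
        "WHERE device = 'virtio-serial' "
        "GROUP BY vm_id HAVING COUNT(*) > 1"
    )
    return cur.fetchall()


def remove_unmanaged_virtio_serial(conn, vm_id):
    """Delete only the unmanaged virtio-serial device of the given VM."""
    conn.execute(
        "DELETE FROM vm_device "
        "WHERE vm_id = ? AND device = 'virtio-serial' AND is_managed = 0",
        (vm_id,),
    )


before = find_duplicate_virtio_serial(conn)
remove_unmanaged_virtio_serial(conn, VM_ID)
after = find_duplicate_virtio_serial(conn)
remaining = conn.execute(
    "SELECT device_id, is_managed FROM vm_device WHERE vm_id = ?", (VM_ID,)
).fetchall()
print(before, after, remaining)
```

Filtering on is_managed rather than on the address keeps the managed record, which is the one that preserves the device address.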
> >
> >
> >
> > Is this a known issue? Couldn't find anything so far.
> >
> > Should I also post this to the developer list? I am not subscribed there
> > yet, wanted to check out here first.
I think it would be best to track this and have it documented in Bugzilla.
Please open a bug (https://bugzilla.redhat.com).
> >
> >
> > thanks in advance and all the best
> > Jojo @ LINBIT VIE
> >
> > _______________________________________________
> > Users mailing list
> > Users(a)ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
>
>