----- Original Message -----
> On 11 Feb 2016, at 17:02, Johannes Tiefenbacher <jojo(a)linbit.com> wrote:
>
> Hi,
> finally I am posting something to this list :) I have been reading it for
> quite some time now and have been an oVirt user since 3.0.
Hi,
welcome:)
>
>
> I updated an engine installation from 3.2 to 3.6 (stepwise of course, and
> yes, I know that's pretty outdated ;-). Then I updated vdsm on the associated
> CentOS 6 hosts as well, from 3.10.x to 3.16.30. I also set my cluster
> compatibility level to 3.5 (the 3.6 level is only possible with EL7 hosts,
> if I understood correctly).
>
> After my first failover test a VM could not be restarted, although the host
> where it was running could be fenced correctly.
>
> The reason according to engine's log was this:
>
> VM xxxxxxxx is down with error. Exit message: internal error process exited
> while connecting to monitor: qemu-kvm: -device
> virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x4:
> Duplicate ID 'virtio-serial0' for device
>
>
> I then realized that I was not able to run this VM on any host. I
> checked the virtual hardware in the engine database and could confirm that
> ALL my VMs had this problem: 2 devices with alias='virtio-serial0'
It may very well be a bug, but that is difficult to say unless it is
reproducible. It may have been broken since an earlier release.
Arik/Shmuel, does this ring a bell?
In 3.6 we changed virtio-serial to be a managed device.
The script named 03_06_0310_change_virtio_serial_to_managed_device.sql changes unmanaged
virtio-serial devices (that were all unmanaged before) to be managed.
One potential flow I can think of that would cause this duplication:
1. Have a running VM in a pre-3.6 engine - it has unmanaged virtio-serial
2. Upgrade to 3.6 while the VM is running - the unmanaged virtio-serial becomes managed
3. Do something that will change the hash of the devices
=> the engine will add an additional unmanaged virtio-serial device
Why didn't it happen before? Because the handling of unmanaged devices was:
1. Upon change in the VM devices (their hash), ask for all the devices (full-list)
2. Remove all previous unmanaged devices
3. Add every device that does not exist in the database
When we add an unmanaged device we generate a new ID (!) - therefore we had to remove all
the previous unmanaged devices before adding the new ones.
If the previous unmanaged virtio-serial became managed, it is not removed and we will end
up having two virtio-serial devices.
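
To make the mechanism concrete, here is a toy Python model of the three steps above. It is
not engine code: the names (sync_devices, virtio_serial_count, the dict keys) are made up
for illustration, and it assumes the running VM reports its virtio-serial without an ID
that the engine knows about.

```python
import uuid


def sync_devices(db_rows, reported):
    """Toy model of the unmanaged-device handling described above.

    db_rows:  list of dicts with keys device_id, alias, is_managed
    reported: list of dicts with keys alias and device_id (None when the
              VM was started without an engine-assigned ID for the device)
    """
    # Step 2: remove all previously stored unmanaged devices.
    rows = [r for r in db_rows if r["is_managed"]]
    known_ids = {r["device_id"] for r in rows}
    # Step 3: add every reported device that is not in the database,
    # generating a fresh ID for it.
    for dev in reported:
        if dev["device_id"] not in known_ids:
            rows.append({"device_id": str(uuid.uuid4()),
                         "alias": dev["alias"],
                         "is_managed": False})
    return rows


def virtio_serial_count(rows):
    return sum(1 for r in rows if r["alias"] == "virtio-serial0")


# What the still-running VM reports (assumption: no engine-known ID).
reported = [{"alias": "virtio-serial0", "device_id": None}]

# Before 3.6: the stored virtio-serial is unmanaged, so it is dropped and re-added.
pre_36 = [{"device_id": "old-id", "alias": "virtio-serial0", "is_managed": False}]
pre_36_result = sync_devices(pre_36, reported)

# After the upgrade script: same row, now flagged as managed, VM still running.
post_36 = [{"device_id": "old-id", "alias": "virtio-serial0", "is_managed": True}]
post_36_result = sync_devices(post_36, reported)

print(virtio_serial_count(pre_36_result))   # 1
print(virtio_serial_count(post_36_result))  # 2
```

In this model the pre-3.6 row is replaced on every sync, so the count stays at one; once the
row is flagged as managed it survives step 2 and the reported device is added next to it.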
@Johannes - is it true that the VM was running before the engine was updated to 3.6 and
has not been powered off since then?
I managed to simulate this.
We probably need to prevent the addition of unmanaged virtio-serial in 3.6 engine but IMO
we should also use the ID reported by VDSM instead of generating a new one to eliminate
similar issues in the future.
@Eli, Omer - can you recall why we can't use the ID we get from VDSM for the unmanaged
devices?
(we can continue this discussion in devel-list or in bugzilla..)
>
> e.g.:
>
> ----
> engine=# SELECT * FROM vm_device WHERE vm_device.device = 'virtio-serial'
> AND vm_id = 'cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec' ORDER BY vm_id;
> -[ RECORD 1 ]-------------+-------------------------------------------------------------
> device_id | 2821d03c-ce88-4613-9095-e88eadcd3792
> vm_id | cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec
> type | controller
> device | virtio-serial
> address |
> boot_order | 0
> spec_params | { }
> is_managed | t
> is_plugged | f
> is_readonly | f
> _create_date | 2016-01-14 08:30:43.797161+01
> _update_date | 2016-02-10 10:04:56.228724+01
> alias | virtio-serial0
> custom_properties | { }
> snapshot_id |
> logical_name |
> is_using_scsi_reservation | f
> -[ RECORD 2 ]-------------+-------------------------------------------------------------
> device_id | 29e0805f-d836-451a-9ec3-9031baa995e6
> vm_id | cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec
> type | controller
> device | virtio-serial
> address | {bus=0x00, domain=0x0000, type=pci, slot=0x04, function=0x0}
> boot_order | 0
> spec_params | { }
> is_managed | f
> is_plugged | t
> is_readonly | f
> _create_date | 2016-02-11 13:47:02.69992+01
> _update_date |
> alias | virtio-serial0
> custom_properties |
> snapshot_id |
> logical_name |
> is_using_scsi_reservation | f
>
> ----
>
> My solution was this:
>
> DELETE FROM vm_device WHERE vm_id='cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec'
> AND vm_device.device = 'virtio-serial' AND address = '';
>
> (just renaming one of the aliases to "virtio-serial1" did not help)
I believe this is not the right solution; it is better to remove the unmanaged device:
1. For consistency
2. We changed the virtio-serial device to be managed in order to prevent a problem with
VM pools where, in some cases, a Windows OS detects an existing virtio-serial device as a new
device (and therefore pops up a dialog to search for an appropriate driver). By having
the virtio-serial device managed, we preserve its address and eliminate this problem.
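
As a rough sketch of that cleanup (not tested against a real engine database, so please take
a backup first): the Python snippet below runs the idea against an in-memory SQLite table
that only mimics a few vm_device columns from the output quoted above. The SQLite stand-in
and the FIND_DUPLICATES / DELETE_UNMANAGED names are just for illustration; the two SQL
statements are the part of interest, and they use plain boolean tests on is_managed so that
they should read the same way on the engine's PostgreSQL database.

```python
import sqlite3

# In-memory stand-in for the engine database (the real one is PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vm_device ("
    " device_id TEXT PRIMARY KEY, vm_id TEXT, device TEXT,"
    " is_managed INTEGER, alias TEXT)"
)
vm = "cbfa359f-d0b8-484b-8ec0-cf9b8e4bb3ec"
conn.executemany(
    "INSERT INTO vm_device VALUES (?, ?, ?, ?, ?)",
    [
        ("2821d03c-ce88-4613-9095-e88eadcd3792", vm, "virtio-serial", 1, "virtio-serial0"),
        ("29e0805f-d836-451a-9ec3-9031baa995e6", vm, "virtio-serial", 0, "virtio-serial0"),
    ],
)

# List every VM that has more than one virtio-serial device.
FIND_DUPLICATES = """
    SELECT vm_id, COUNT(*) AS n
    FROM vm_device
    WHERE device = 'virtio-serial'
    GROUP BY vm_id
    HAVING COUNT(*) > 1
"""

# Remove the unmanaged virtio-serial, but only for VMs that also have a managed one.
DELETE_UNMANAGED = """
    DELETE FROM vm_device
    WHERE device = 'virtio-serial'
      AND NOT is_managed
      AND vm_id IN (
          SELECT vm_id FROM vm_device
          WHERE device = 'virtio-serial' AND is_managed
      )
"""

duplicates = conn.execute(FIND_DUPLICATES).fetchall()
conn.execute(DELETE_UNMANAGED)
remaining = conn.execute(
    "SELECT device_id, is_managed FROM vm_device WHERE vm_id = ?", (vm,)
).fetchall()

print(duplicates)  # the affected VM with a count of 2
print(remaining)   # only the managed row 2821d03c-... is left
```

The extra subquery is there so that a VM which only has an unmanaged virtio-serial device
does not lose it.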
>
>
>
> Is this a known issue? Couldn't find anything so far.
>
> Should I also post this to the developer list? I am not subscribed there
> yet, wanted to check out here first.
I think it would be best to track this and have it documented in Bugzilla.
Please open a bug (https://bugzilla.redhat.com).
>
>
> thanks in advance and all the best
> Jojo @ LINBIT VIE
>