
25 Aug 2020, 10:20 a.m.
> On 25 Aug 2020, at 13:02, Vojtech Juranek <vjuranek@redhat.com> wrote:
>
> On Thursday, 20 August 2020 at 14:42:15 CEST Michal Skrivanek wrote:
>>> On 20 Aug 2020, at 14:28, Nir Soffer <nsoffer@redhat.com> wrote:
>>>
>>> On Thu, Aug 20, 2020 at 12:19 PM Vojtech Juranek <vjuranek@redhat.com>
>>> wrote:
>>>>
>>>> Hi,
>>>> as Fedor is on PTO, I was asked to take over.
>>>>
>>>> I don't think new functionality is needed; a more detailed list of
>>>> proposed flows and changes is below. TL;DR: I'd suggest to
>>>>
>>>> - change in engine: send PDIV instead of CD path from engine to vdsm
>>
>> the biggest issue here is always with older versions not having this
>> functionality yet….
>
> yes, I was wrong, this requires new functionality. Therefore Nir proposed
> to create a feature page for it. We put together flows which should cover
> all the possible issues (including VM recovery). Please review and comment
> directly under
>
> https://github.com/oVirt/ovirt-site/pull/2320
>
> (unless you think creating a feature page for it is the wrong way to go :-)

IMHO you're making it bigger than it needs to be. We already have the
ChangeCD API; I don't see how adding two different APIs makes it
significantly better. There is definitely a time when old unmaintainable
code needs to be rewritten and improved…. but it greatly increases the
"time to fix".

Either way, I still don't get "add new drivespec to VM metadata…". We're
not adding a new drive. If you mean to extend the existing disk metadata
with a new attribute, please say "extend" or something else instead.

>>>> - change in vdsm: implement a counter of ISOs being used by VMs, to
>>>>   know when we can deactivate the volume
>>>> - change in vdsm: remove the old drivespec from the VM XML when
>>>>   changing/removing a CD (and eventually deactivate the volume)
>>>>
>>>> Your comments are welcome.
>>>> Thanks
>>>> Vojta
>>>>
>>>> Flows
>>>> ===
>>>>
>>>> VM without a CD
>>>> -------------------------
>>>>
>>>> - Should not be possible to insert any CD; this option should not be
>>>>   available/active in the UI.
>>>
>>> I don't think we have such a configuration; all VMs have an empty cdrom
>>> by default:
>>>
>>> <disk type='file' device='cdrom'>
>>>   <driver name='qemu' error_policy='report'/>
>>>   <source startupPolicy='optional'/>
>>>   <target dev='sdc' bus='sata'/>
>>>   <readonly/>
>>>   <alias name='ua-d95b29f2-e8a2-4119-9ac2-5b4267e6637d'/>
>>>   <address type='drive' controller='0' bus='0' target='0' unit='2'/>
>>> </disk>
>>>
>>> But of course if we have such a configuration where adding a CD is not
>>
>> we don't. All VMs always have at least that one implicit CD (empty or
>> loaded). We have a case where there is an additional CD for
>> cloud-init/payload; that additional CD is not "changeable" nor exposed
>> anywhere in the UI.
>>
>>> possible, the menu/button should be disabled in the UI.
>>>
>>>> VM without CD, changeCD inserts new ISO
>>>> -----------------------------------------------------------------
>>
>> please update to make it really clear, there's no "VM without CD". A VM
>> always has a CD, just not loaded (empty tray), so this case doesn't exist…
>>
>>>> - add new drivespec to VM metadata
>>>
>>> How is failure handled?
>>>
>>>> - prepare new drivespec
>>>
>>> What if vdsm is restarted at this point?
>>>
>>> How should VM recovery handle this drivespec referring to an inactive
>>> volume?
>>>
>>>> - attach new device to VM
>>>> - if attaching to VM fails, tear down drivespec and remove drivespec
>>>>   from VM metadata
>>>
>>> Tearing down and removing the drivespec cannot be done atomically. Any
>>> operation may fail, and vdsm may be killed at any point.
>>>
>>> The design should suggest how vdsm recovers from all errors.
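For illustration, a minimal sketch of this insert flow with the
metadata-first ordering the questions above point at. The prepare/teardown
calls follow the naming used in this thread, but their exact signatures and
all other helper names (vm.metadata.add_drivespec, vm.change_cd_path) are
hypothetical stand-ins, not the actual vdsm API:

def insert_cd(vm, drivespec):
    # Record the new drivespec in VM metadata *before* preparing it, so
    # that recovery can find (and tear down) a half-prepared volume if
    # vdsm is killed between the next two steps.
    vm.metadata.add_drivespec(drivespec)
    try:
        # Activate the volume and resolve its path.
        path = vm.cif.prepareVolumePath(drivespec)
    except Exception:
        # Prepare failed: nothing to tear down, just undo the metadata.
        vm.metadata.remove_drivespec(drivespec)
        raise
    try:
        # Load the media into the existing cdrom device.
        vm.change_cd_path(path)
    except Exception:
        # Attach failed: tear down and undo the metadata. Each cleanup
        # step may itself fail; recovery must be able to finish the job.
        try:
            vm.cif.teardownVolumePath(drivespec)
        finally:
            vm.metadata.remove_drivespec(drivespec)
        raise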
>>>> VM with CD, changeCD removes current ISO
>>>> -------------------------------------------------------------------
>>>>
>>>> - tear down previous drivespec
>>>> - if the volume with the ISO is inactive (e.g. as a result of vdsm
>>>>   failing after inserting the drivespec into VM metadata, but before
>>>>   activating the volume), continue without error
>>>
>>> Sounds good, and may already be handled in the lvm module. You can try
>>> to deactivate an LV twice to confirm this.
>>>
>>>> - if drivespec is used by another VM, don't deactivate volume
>>>
>>> This is the tricky part: vdsm does not have a mechanism for tracking
>>> usage of block devices, and adding such a mechanism is much bigger work
>>> than supporting CDs on block devices, so it cannot be part of this work.
>>
>> We do have the list of active VMs and their xml/devices, so we do know
>> if there are other users; it's not an expensive check. Locking it
>> properly so there's no in-flight request for change CD could be a bit
>> tricky, but it doesn't sound that complex.
>>
>>>> - remove drivespec from VM metadata
>>>
>>> What if this fails, or vdsm is restarted before we try to do this?
>>>
>>> When vdsm starts it recovers running VMs; we need to handle this case
>>> - a drivespec referencing a non-existing volume in this flow.
>>
>> remove what exactly? The CD device stays there, the "tray is ejected",
>> and the device's path is updated to "".
>>
>>>> VM with CD, changeCD inserts new ISO and removes old
>>>> -------------------------------------------------------------------------
>>>>
>>>> - tear down previous drivespec
>>>> - if the volume with the ISO is inactive (e.g. as a result of vdsm
>>>>   failing after inserting the drivespec into VM metadata, but before
>>>>   activating the volume), continue without error
>>>> - if drivespec is used by another VM, don't deactivate volume
>>>> - remove previous drivespec from VM metadata
>>>> - add new drivespec to VM metadata
>>>> - prepare new drivespec
>>>> - attach new device to VM
>>>> - if attaching the new drivespec fails, tear down the new drivespec
>>>>   and attach back the previous drivespec
>>>
>>> This is too complicated. When I discussed this with Fedor, our
>>> conclusion was that we don't want to support this complexity, and it
>>> will be much easier to implement 2 APIs: one for removing a CD, and one
>>> for inserting a CD.
>>>
>>> With separate APIs, error handling is much easier, and is similar to
>>> the existing error handling in hotplugDisk and hotunplugDisk. You have
>>> a good example of how to handle all errors in these APIs.
>>
>> there's a big difference between hotplugs and change CD in terms of the
>> devices. It's supposedly an atomic change. Though having separate
>> actions for load/eject would probably be ok. But not sure if it's worth it.
>>
>>> For example, if we have an API for removing a CD:
>>>
>>> 1. engine sends VM.removeCD
>>> 2. vdsm marks drivespec for removal
>>>
>>> Possible implementation:
>>>
>>> <ovirt-vm:device devtype="disk" name="sdc" removing="yes">
>>>   ...
>>> </ovirt-vm:device>
>>>
>>> 3. vdsm detaches device from VM
>>
>> please update the flows. no device detach/attach…
>>
>>> 4. vdsm deactivates volume
>>> 5. vdsm removes drivespec from metadata
>>>
>>> Vdsm may be killed after any step. When recovering the VM, we can have
>>> these cases:
>>>
>>> 1. vdsm was killed after step 2; the CD is marked for removal, but the
>>>    device is attached to the VM, and the volume is active:
>>>    - change the metadata, un-marking the drivespec for removal.
>>>
>>> 2. vdsm was killed after step 3; the drivespec is marked for removal,
>>>    and the device is not attached to the VM, but the volume is active:
>>>    - deactivate the volume
>>>    - remove the drivespec from the metadata
>>>    Note that vdsm deactivates unused logical volumes when starting,
>>>    before recovery, so in practice we should only see case 3.
>>>
>>> 3. vdsm was killed after step 4; the drivespec is marked for removal,
>>>    the device is not attached to the VM, and the volume is not active:
>>>    - remove the drivespec from the metadata
>>>
>>> 4. vdsm was killed after step 5; everything is good on the vdsm side.
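A possible shape for the recovery step covering these four cases. Only the
five-step removeCD flow and the removing="yes" marker come from the
proposal above; the queries (vm.cdrom_has_media, volume_is_active) and
metadata helpers are hypothetical:

def recover_cd_drivespec(vm, spec):
    if spec.get("removing") != "yes":
        return  # nothing was in flight for this drive

    attached = vm.cdrom_has_media()   # media still in the domain XML?
    active = volume_is_active(spec)   # LV still active on this host?

    if attached:
        # Case 1: killed before the media was ejected - roll back the mark.
        vm.metadata.unmark_removing(spec)
    elif active:
        # Case 2: killed after eject, before deactivation.
        vm.cif.teardownVolumePath(spec)
        vm.metadata.remove_drivespec(spec)
    else:
        # Case 3: killed after deactivation - only the metadata is left.
        vm.metadata.remove_drivespec(spec)
    # Case 4 (killed after metadata removal) leaves nothing to recover.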
>>> Teardown should not fail in the storage layer if the volume is still
>>> used. In the current code we cannot avoid the lvm failures, but we
>>> don't have to pass the failure up to the caller, or we can pass a
>>> specific error like "volume is used", which can be ignored silently by
>>> the caller, since this error means someone else is using this volume
>>> and is responsible for deactivating it.
>>>
>>> In case of simple errors that vdsm can report, engine may fail the
>>> change CD, or maybe continue to insert the new one if the old CD was
>>> detached. For example, if the old CD failed to deactivate, this is a
>>> storage issue that should not fail attaching a new CD. The storage
>>> layer should make sure the volume is deactivated if nobody uses it,
>>> but the user should not care about that.
>>>
>>> If vdsm was killed, engine will fail the operation, but it must wait
>>> until the VM is recovered. At this point engine gets the VM XML and can
>>> update the real state of the CD: either still attached, or detached.
>>> Virt code should not care about deactivation failures.
>>>
>>> The storage layer should be concerned with deactivation of active
>>> volumes, but I don't think we handle such issues in engine yet. On the
>>> vdsm side, unused logical volumes are deactivated during startup, see:
>>> https://github.com/oVirt/vdsm/blob/f6de2ca915faa719381e22086feb8b65aa4de942/lib/vdsm/storage/lvm.py#L1023
>>>
>>> We probably need to add a periodic job in storage to deactivate unused
>>
>> sounds good enough indeed
>>
>>> volumes so we don't wait until the next vdsm restart. This can also be
>>> handled in engine, but I don't think it is the right place to handle
>>> host state.
>>>
>>>> Proposed changes
>>>> ===========
>>>>
>>>> Engine
>>>> ----------
>>>>
>>>> - change ChangeDiskCommand to use PDIV (pool, domain, image, volume
>>>>   ID) instead of the ISO path
>>>> - modify the UI not to provide an option to change CD for VMs without
>>>>   a CD
>>>>
>>>> vdsm
>>>> --------
>>>>
>>>> - add a volume/ISO counter which would count the number of VMs using
>>>>   the given ISO. When a CDROM is inserted, the counter is increased;
>>>>   when ejected, the counter is decreased. The ISO volume is
>>>>   deactivated only when no VM uses it.
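One way to get this count without new stored state is to derive it from
the drive specs of running VMs, along the lines of Michal's "we do have
the list of active VMs" remark earlier. A sketch; iso_key() and the
cif/drive attribute shapes are assumptions, not confirmed vdsm API:

def iso_key(spec):
    # Identity of an ISO volume - the PDIV fields from the proposal.
    return (spec["domainID"], spec["imageID"], spec["volumeID"])

def iso_users(cif, drivespec):
    # How many running VMs currently have this ISO loaded.
    key = iso_key(drivespec)
    return sum(
        1
        for vm in cif.getVMs().values()
        for drive in vm.getDiskDevices()
        if drive.device == "cdrom" and iso_key(drive.spec) == key)

def maybe_deactivate(cif, drivespec):
    # Deactivate only when no VM uses the ISO. The check and the teardown
    # must run under a lock excluding concurrent changeCD calls, per the
    # locking concern raised above.
    if iso_users(cif, drivespec) == 0:
        cif.teardownVolumePath(drivespec)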
>>> This is a hack for supporting ISO, while the issue of shared block
>>> volumes is not limited to ISO. We have this issue in many flows, with
>>> many partial and broken solutions, like avoiding activation in engine
>>> (image transfer), or avoiding activation in vdsm (snapshot).
>>>
>>> This should be solved separately, in a proper way, in vdsm storage.
>>> This is a large and complex change that we may have in a future
>>> version, not something for the next zstream. Unfortunately this area
>>> in vdsm was neglected for years, and we have huge technical debt.
>>>
>>> So for now we will have to live with a warning when an active ISO
>>> image is deactivated, or improve error handling so the caller can
>>> detect and properly handle the case of a volume in use. I think better
>>> error handling will be easy to do and good enough for now.
>>>
>>>> We already have a sqlite DB for managed storage, so we can add a DB
>>>> for ISO quite easily.
>>>> - modify Vm._changeBlockDev to reflect the flows suggested above
>>>
>>> The solution requires persistence, since vdsm can be killed, and we
>>> would most likely use a database for this, but this should not be part
>>> of the managedvolume database, since sqlite is not good with
>>> concurrent usage.
>>>
>>> Nir
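If the counter does need to survive vdsm restarts, a tiny dedicated store,
deliberately separate from the managedvolume database per the concurrency
concern above, might look like this. Everything here (path, schema,
locking model) is a hypothetical sketch, not an agreed design:

import sqlite3

DB_PATH = "/var/lib/vdsm/iso-refcount.db"  # hypothetical location

def _connect():
    con = sqlite3.connect(DB_PATH)
    con.execute("""CREATE TABLE IF NOT EXISTS iso_users (
                       vol_id TEXT NOT NULL,
                       vm_id  TEXT NOT NULL,
                       PRIMARY KEY (vol_id, vm_id))""")
    return con

def add_user(vol_id, vm_id):
    # Record that vm_id has this ISO loaded; idempotent on retry.
    with _connect() as con:
        con.execute("INSERT OR IGNORE INTO iso_users VALUES (?, ?)",
                    (vol_id, vm_id))

def remove_user(vol_id, vm_id):
    # Drop the record; return True if the volume has no users left,
    # i.e. the caller may deactivate it.
    with _connect() as con:
        con.execute("DELETE FROM iso_users WHERE vol_id=? AND vm_id=?",
                    (vol_id, vm_id))
        (count,) = con.execute(
            "SELECT COUNT(*) FROM iso_users WHERE vol_id=?",
            (vol_id,)).fetchone()
    return count == 0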
>>>>> Hi Fedor,
>>>>>
>>>>>> On 27 Jun 2020, at 14:46, Fedor Gavrilov <fgavrilo@redhat.com> wrote:
>>>>>>
>>>>>> So from what I was able to see in the hotplugDisk example, for CDs
>>>>>> I would have to do similar:
>>>>>>
>>>>>> 1. parse disk params from stored xml metadata
>>>>>> 2. call prepareVolumePath with these params
>>>>>>    (skip normalizeVdsmImg)
>>>>>> 3. call updateDriveIndex
>>>>>
>>>>> Why?
>>>>>
>>>>>>    (skip validating whether volume has leases)
>>>>>> 4. drive.getXML() and then hooks.before_disk_hotplug with it
>>>>>
>>>>> If only the storage stuff would ever finish moving to xml....
>>>>> But it's not right to call this hook anyway; iiuc you're not doing a
>>>>> hotplug.
>>>>>
>>>>>> 5. call attachDevice with drive XML and handle possible errors
>>>>>>
>>>>>> I am not sure, however, whether we need what "else" does in
>>>>>> hotplugDisk (vm.py line 3468)...
>>>>>>
>>>>>> Looking forward to hearing your opinions,
>>>>>> Fedor
>>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: "Fedor Gavrilov" <fgavrilo@redhat.com>
>>>>>> To: "devel" <devel@ovirt.org>
>>>>>> Sent: Monday, June 22, 2020 4:37:59 PM
>>>>>> Subject: [ovirt-devel] implementing hotplugCd/hotunplugCd in vdsm
>>>>>>
>>>>>> Hey,
>>>>>>
>>>>>> So in an attempt to fix the change CD functionality we discovered a
>>>>>> few other potential
>>>>>
>>>>> CD has a long history of issues caused by historical simplifications.
>>>>> What is the intended fix, stop using string paths?
>>>>>
>>>>>> issues, and what Nir suggested was to implement two [somewhat] new
>>>>>> functions in VDSM: hotplug and hotunplug for CDs, similar to how it
>>>>>> works for normal disks now.
>>>>>
>>>>> This is conceptually different: a CD is removable media;
>>>>> hotplug/unplug is for the device itself. We never supported hotplug
>>>>> of the device, though.
>>>>>
>>>>>> The existing changeCD function will be left as is for backwards
>>>>>> compatibility.
>>>>>
>>>>> Who would use the new function?
>>>>>
>>>>>> As I found out, engine already calculates iface and index before
>>>>>> invoking VDSM functions, so we will just pass these along with PDIV
>>>>>> to VDSM.
>>>>>>
>>>>>> Suggested flow is, let me quote:
>>>>>>
>>>>>>> So the complete change CD flow should be:
>>>>>>>
>>>>>>> 1. get the previous drivespec from vm metadata
>>>>>>> 2. prepare new drivespec
>>>>>>> 3. add new drivespec to vm metadata
>>>>>>> 4. attach a new device to vm
>>>>>
>>>>> Don't call it a new device when you just change the media.
>>>>>
>>>>>>> 5. teardown the previous drivespec
>>>>>>> 6. remove previous drivespec from vm metadata
>>>>>>>
>>>>>>> When the vm is stopped, it must do:
>>>>>>>
>>>>>>> 1. get drivespec from vm metadata
>>>>>>> 2. teardown drivespec
>>>>>>>
>>>>>>> During attach, there are interesting races:
>>>>>>> - what happens if vdsm crashes after step 2? who will teardown the
>>>>>>>   volume? maybe you need to add the new drivespec to the metadata
>>>>>>>   first, before preparing it.
>>>>>>> - what happens if attach failed? who will remove the new drive
>>>>>>>   from the metadata?
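The stop-time part of this quoted flow, sketched defensively: teardown has
to tolerate a volume that is already inactive (vdsm died mid-flow) or
still used by another VM, per the flows earlier in the thread. Helper
names are again hypothetical, not actual vdsm API:

def teardown_cds_on_stop(vm):
    # Read each CD drivespec from VM metadata and tear it down when the
    # VM stops, as in the quoted flow above.
    for spec in vm.metadata.cd_drivespecs():
        try:
            vm.cif.teardownVolumePath(spec)
        except Exception:
            # An already-inactive volume, or one still used by another
            # VM, should not fail VM shutdown; log and continue.
            vm.log.exception("Failed to tear down CD %s", spec)
        vm.metadata.remove_drivespec(spec)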
>>>>>> Now, what makes hotplugDisk/hotunplugDisk different? From what I
>>>>>> understand, the flow is the same there, so what difference is there
>>>>>> as far as VDSM is concerned? If none, this means that if I more or
>>>>>> less copy that code, changing minor details and data accordingly
>>>>>> for CDs, this should work, shouldn't it?
>>>>>
>>>>> Hotplug/unplug is similar, but I would like to see the proposed
>>>>> change in the context of engine as well, and I expect it to be a bit
>>>>> more complex there. Without that, this is not worth it.
>>>>>
>>>>> Thanks,
>>>>> michal
>>>>>
>>>>>> Thanks!
>>>>>> Fedor

_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/BJ2J73ODFQWNMHRIOXTKI2YGQD4YV76S/