
Hello,
would you help me understand whether the dhcp client in an oVirt guest should refresh its dhcp configuration after the guest is resumed? If so, how should this be triggered?
The reason I ask is that if a VM is suspended on one host and resumed on another, libvirt's nwfilter loses the IP address of the guest. If the clean-traffic filter with CTRL_IP_LEARNING=dhcp is used, this means the guest is not reachable until it refreshes its dhcp configuration.
This scenario might happen in OST basic-suite-master and basic-suite-4.3 in verify_suspend_resume_vm0.
Thanks
Dominik
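
The failure mode described in this message can be illustrated with a toy model. This is only a sketch, not libvirt code: the `HostFilter` class, the MAC and IP values, and the host names are invented for illustration, and it assumes (as the message suggests) that each host learns the guest address independently from the dhcp traffic it observes.

```python
# Toy model only, not libvirt code: each host keeps its own table of
# addresses learned from dhcp traffic it has seen for a given MAC.
class HostFilter:
    def __init__(self, name):
        self.name = name
        self.learned = {}  # MAC -> IP learned from an observed dhcp exchange

    def observe_dhcp_ack(self, mac, ip):
        """Record the address handed to this MAC in a dhcp exchange."""
        self.learned[mac] = ip

    def allows(self, mac, src_ip):
        """Pass traffic only if the source IP matches the learned one."""
        return self.learned.get(mac) == src_ip


MAC, IP = "52:54:00:12:34:56", "192.168.200.10"  # hypothetical values
host_a, host_b = HostFilter("host-a"), HostFilter("host-b")

host_a.observe_dhcp_ack(MAC, IP)       # guest boots on host-a and gets a lease
before_suspend = host_a.allows(MAC, IP)

# The guest is suspended on host-a and resumed on host-b. It keeps using its
# address, but host-b never saw the dhcp exchange.
after_resume = host_b.allows(MAC, IP)

host_b.observe_dhcp_ack(MAC, IP)       # guest refreshes its lease on host-b
after_renewal = host_b.allows(MAC, IP)

print(before_suspend, after_resume, after_renewal)
```

In this model the guest is reachable before the suspend, unreachable right after resuming on the second host, and reachable again once a dhcp exchange has been seen there.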

On Wed, Apr 3, 2019 at 5:55 PM Dominik Holler <dholler@redhat.com> wrote:
Hello, would you help me understand whether the dhcp client in an oVirt guest should refresh its dhcp configuration after the guest is resumed? If so, how should this be triggered?
The reason I ask is that if a VM is suspended on one host and resumed on another, libvirt's nwfilter loses the IP address of the guest. If the clean-traffic filter with CTRL_IP_LEARNING=dhcp is used, this means the guest is not reachable until it refreshes its dhcp configuration.
Do you know if libvirt's nwfilter transfers the IP address during live migration? I'd suspect that you'd have the same problem there. I believe that libvirt should handle this in BOTH cases (live migration and suspend/resume). It should store the learned IP in the suspended data and revive it on the target host.
Calling in +Laine Stump for his opinion.
Regardless of this, I think that a guest should request a dhcp renewal upon resume. After all, it may have been suspended for a few years.
This can be done if we add a "resumed" notification to the guest agent, and have libvirt/Vdsm trigger it after resume.
This scenario might happen in OST basic-suite-master and basic-suite-4.3 in verify_suspend_resume_vm0.
I suspect that we can work around this bug in OST by requesting Engine to resume vm0 on the same host it was suspended from.
Thanks Dominik

On Thu, Apr 4, 2019 at 9:02 AM Dan Kenigsberg <danken@redhat.com> wrote:
On Wed, Apr 3, 2019 at 5:55 PM Dominik Holler <dholler@redhat.com> wrote:
Hello, would you help me understand whether the dhcp client in an oVirt guest should refresh its dhcp configuration after the guest is resumed? If so, how should this be triggered?
The reason I ask is that if a VM is suspended on one host and resumed on another, libvirt's nwfilter loses the IP address of the guest. If the clean-traffic filter with CTRL_IP_LEARNING=dhcp is used, this means the guest is not reachable until it refreshes its dhcp configuration.
Do you know if libvirt's nwfilter transfers the IP address during live migration? I'd suspect that you'd have the same problem there. I believe that libvirt should handle this in BOTH cases (live migration and suspend/resume). It should store the learned IP in the suspended data and revive it on the target host.
Not my expertise, but regardless of the below :-), I agree with you.
Calling in +Laine Stump for his opinion.
Regardless of this, I think that a guest should request a dhcp renewal upon resume. After all, it may have been suspended for a few years.
I think most, if not all, dhcp clients, would renew, once they realized that time passed by (which is a non-trivial, but different, issue), if their lease expired. I think Dominik's question was about the case that the lease was _not_ expired.
This can be done if we add a "resumed" notification to the guest agent, and have libvirt/Vdsm trigger it after resume.
That's also just fine for me. I very briefly looked at all relevant agents (ovirt, spice, qemu) and didn't find 'dhcp' in any of them. So I agree it makes sense to file an RFE on one of them for this. However, we can't always rely on this - the agent might not be available, etc.
This scenario might happen in OST basic-suite-master and basic-suite-4.3 in verify_suspend_resume_vm0.
I suspect that we can work around this bug in OST by requesting Engine to resume vm0 on the same host it was suspended from.
And also, for this very specific use case, to simply have much shorter leases. It seems like we currently use dnsmasq's default, which is one hour. I guess setting it to even 2 minutes won't do much damage, but of course this needs testing.
Best regards,
--
Didi
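
To put rough numbers on the shorter-lease idea, here is a small sketch of the default dhcp client timers. It assumes the client uses the RFC 2131 defaults (renew at T1 = 0.5 of the lease, rebind at T2 = 0.875 of the lease) and that a renewal exchange is enough for the filter to relearn the address, as the thread suggests; the function name is made up for illustration.

```python
def renewal_timers(lease_seconds):
    """Return (T1, T2) in seconds using the RFC 2131 default fractions."""
    t1 = 0.5 * lease_seconds    # client starts renewing with its server
    t2 = 0.875 * lease_seconds  # client falls back to rebinding
    return t1, t2


for label, lease in (("1 hour", 3600), ("2 minutes", 120)):
    t1, t2 = renewal_timers(lease)
    print(f"{label}: T1={t1:.0f}s, T2={t2:.0f}s")
```

Under these assumptions, a guest that resumes shortly after its last renewal could wait up to about 30 minutes for the next scheduled renewal with a one-hour lease, versus about one minute with a two-minute lease.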

On 4 Apr 2019, at 09:45, Yedidyah Bar David <didi@redhat.com> wrote:
On Thu, Apr 4, 2019 at 9:02 AM Dan Kenigsberg <danken@redhat.com> wrote:
On Wed, Apr 3, 2019 at 5:55 PM Dominik Holler <dholler@redhat.com> wrote:
Hello, would you help me understand whether the dhcp client in an oVirt guest should refresh its dhcp configuration after the guest is resumed? If so, how should this be triggered?
The reason I ask is that if a VM is suspended on one host and resumed on another, libvirt's nwfilter loses the IP address of the guest. If the clean-traffic filter with CTRL_IP_LEARNING=dhcp is used, this means the guest is not reachable until it refreshes its dhcp configuration.
Do you know if libvirt's nwfilter transfers the IP address during live migration? I'd suspect that you'd have the same problem there.
AFAIK we don’t have that problem
I believe that libvirt should handle this in BOTH cases (live migration and suspend/resume). It should store the learned IP in the suspended data and revive it on the target host.
What about potential conflicts?
Not my expertise, but regardless of the below :-), I agree with you.
Calling in +Laine Stump for his opinion.
Regardless of this, I think that a guest should request a dhcp renewal upon resume. After all, it may have been suspended for a few years.
I think most, if not all, dhcp clients, would renew, once they realized that time passed by (which is a non-trivial, but different, issue),
We resync time on resume. Libvirt was supposed to do that but we were not able to convince them, so it’s in vdsm now
if their lease expired. I think Dominik's question was about the case that the lease was _not_ expired.
This can be done if we add a "resumed" notification to the guest agent, and have libvirt/Vdsm trigger it after resume.
We have guest side hooks. Only in ovirt-ga
That's also just fine for me. I very briefly looked at all relevant agents (ovirt, spice, qemu) and didn't find 'dhcp' in any of them. So I agree it makes sense to file an RFE on one of them for this.
qemu-ga most definitely. We do not have ovirt-ga in el8 and we do not plan to add any new features to it.
Thanks, michal
However, we can't always rely on this - the agent might not be available, etc.
This scenario might happen in OST basic-suite-master and basic-suite-4.3 in verify_suspend_resume_vm0.
I suspect that we can work around this bug in OST by requesting Engine to resume vm0 on the same host it was suspended from.
And also, for this very specific use case, to simply have much shorter leases. It seems like we currently use dnsmasq's default, which is one hour. I guess setting it to even 2 minutes won't do much damage, but of course this needs testing.
Best regards, -- Didi _______________________________________________ Devel mailing list -- devel@ovirt.org To unsubscribe send an email to devel-leave@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/J5EV5YCMV67XZD...
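
No agent in this thread offers a dhcp-specific command, so the following is only a sketch of a conceivable stopgap, not the proposed RFE. It builds QEMU guest agent `guest-exec` requests that a host-side tool could send after resume (for example through `virsh qemu-agent-command`). The assumptions are loud ones: that qemu-ga in the guest has `guest-exec` enabled, that the guest uses ISC `dhclient` at `/usr/sbin/dhclient`, and that the interface is called `eth0`. The helper name is hypothetical.

```python
import json


def guest_exec(path, args):
    """Build a qemu-ga guest-exec request that runs `path` with `args`."""
    return {
        "execute": "guest-exec",
        "arguments": {
            "path": path,
            "arg": list(args),
            "capture-output": True,
        },
    }


# Assumed guest setup: ISC dhclient and an interface named eth0.
DHCLIENT = "/usr/sbin/dhclient"
IFACE = "eth0"

steps = [
    guest_exec(DHCLIENT, ["-r", IFACE]),  # release the lease, stop the client
    guest_exec(DHCLIENT, [IFACE]),        # start the client and request a lease
]

for step in steps:
    print(json.dumps(step))
```

Releasing and re-requesting the lease is heavier than a plain renewal and briefly drops the address, so this would need testing before being used anywhere. As noted in the thread, it also cannot help when the agent is not available in the guest.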

On Thu, 4 Apr 2019, 20:54 Michal Skrivanek, <michal.skrivanek@redhat.com> wrote:
On 4 Apr 2019, at 09:45, Yedidyah Bar David <didi@redhat.com> wrote:
On Thu, Apr 4, 2019 at 9:02 AM Dan Kenigsberg <danken@redhat.com> wrote:
On Wed, Apr 3, 2019 at 5:55 PM Dominik Holler <dholler@redhat.com> wrote:
Hello, would you help me understand whether the dhcp client in an oVirt guest should refresh its dhcp configuration after the guest is resumed? If so, how should this be triggered?
The reason I ask is that if a VM is suspended on one host and resumed on another, libvirt's nwfilter loses the IP address of the guest. If the clean-traffic filter with CTRL_IP_LEARNING=dhcp is used, this means the guest is not reachable until it refreshes its dhcp configuration.
Do you know if libvirt's nwfilter transfers the IP address during live migration? I'd suspect that you'd have the same problem there.
AFAIK we don’t have that problem
I believe that libvirt should handle this in BOTH cases (live migration and suspend/resume). It should store the learned IP in the suspended data and revive it on the target host.
What about potential conflicts?
I'm not sure which conflicts you refer to. We assume that all cluster hosts have access to the same L2 network and hence a single dhcp server.
Not my expertise, but regardless of the below :-), I agree with you.
Calling in +Laine Stump for his opinion.
Regardless of this, I think that a guest should request a dhcp renewal upon resume. After all, it may have been suspended for a few years.
I think most, if not all, dhcp clients, would renew, once they realized that time passed by (which is a non-trivial, but different, issue),
We resync time on resume. Libvirt was supposed to do that but we were not able to convince them, so it’s in vdsm now
if their lease expired. I think Dominik's question was about the case that the lease was _not_ expired.
This can be done if we add a "resumed" notification to the guest agent, and have libvirt/Vdsm trigger it after resume.
We have guest side hooks. Only in ovirt-ga
Does it already have one for post-resume?
That's also just fine for me. I very briefly looked at all relevant agents (ovirt, spice, qemu) and didn't find 'dhcp' in any of them. So I agree it makes sense to file an RFE on one of them for this.
qemu-ga most definitely. We do not have ovirt-ga in el8 and we do not plan to add any new features to it.
Thanks, michal
However, we can't always rely on this - the agent might not be available, etc.
This scenario might happen in OST basic-suite-master and basic-suite-4.3 in verify_suspend_resume_vm0.
I suspect that we can work around this bug in OST by requesting Engine to resume vm0 on the same host it was suspended from.
And also, for this very specific use case, to simply have much shorter leases. It seems like we currently use dnsmasq's default, which is one hour. I guess setting it to even 2 minutes won't do much damage, but of course this needs testing.
Best regards, -- Didi
participants (4)
- Dan Kenigsberg
- Dominik Holler
- Michal Skrivanek
- Yedidyah Bar David