Hi,
On Wed, 2019-09-25 at 15:42 +0300, Amit Bawer wrote:
> According to the resolution of [1], it's a multipathd/udev
> configuration issue. It could be worth tracking this one.
>
> [1] https://tracker.ceph.com/issues/12763
Thanks, that certainly looks like a smoking gun to me. From the logs:
Sep 25 12:27:45 mario multipathd: rbd29: add path (uevent)
Sep 25 12:27:45 mario multipathd: rbd29: spurious uevent, path already in pathvec
Sep 25 12:27:45 mario multipathd: rbd29: HDIO_GETGEO failed with 25
Sep 25 12:27:45 mario multipathd: rbd29: failed to get path uid
Sep 25 12:27:45 mario multipathd: uevent trigger error
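
Next time it happens, it may be worth confirming that multipathd is
what's holding the device when the unmap fails. A quick check on the
affected host (rbd29 here just stands in for whichever device fails to
unmap):

  rbd showmapped                # which rbd images are still mapped
  ls /sys/block/rbd29/holders/  # non-empty if device-mapper has claimed it
  multipath -ll                 # ideally lists no rbd devices at all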
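
Going by the tracker resolution, the usual workaround seems to be
blacklisting rbd devices in multipath.conf so multipathd never claims
them in the first place. A minimal sketch (untested here):

  blacklist {
      devnode "^rbd[0-9]*"
  }

followed by a reload (multipathd reconfigure, or systemctl reload
multipathd). One caveat: on oVirt hosts vdsm manages
/etc/multipath.conf and will overwrite local edits unless the file is
marked "# VDSM PRIVATE", so a drop-in under /etc/multipath/conf.d/ may
be the safer place for the blacklist.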
Dan
>
> On Wed, Sep 25, 2019 at 3:18 PM Dan Poltawski <dan.poltawski@tnp.net.uk> wrote:
> > On ovirt 4.3.5 we are seeing various problems related to the rbd
> > device staying mapped after a guest has been live migrated. This
> > causes problems migrating the guest back, as well as rebooting the
> > guest when it starts back up on the original host. The error
> > returned is ‘rbd: unmap failed: (16) Device or resource busy’.
> > I’ve pasted the full vdsm log below.
> >
> > As far as I can tell this isn’t happening 100% of the time, and
> > seems to be more prevalent on busy guests.
> >
> > (Not sure if I should create a bug for this, so thought I’d start
> > here first)
> >
> > Thanks,
> >
> > Dan
> >
> >
> > Sep 24 19:26:18 mario vdsm[5485]: ERROR FINISH detach_volume
> > error=Managed Volume Helper failed.: Error executing helper:
> > Command ['/usr/libexec/vdsm/managedvolume-helper', 'detach']
> > failed with rc=1 out='' err='
> > oslo.privsep.daemon: Running privsep helper: ['sudo', 'privsep-helper', '--privsep_context', 'os_brick.privileged.default', '--privsep_sock_path', '/tmp/tmptQzb10/privsep.sock']
> > oslo.privsep.daemon: Spawned new privsep daemon via rootwrap
> > oslo.privsep.daemon: privsep daemon starting
> > oslo.privsep.daemon: privsep process running with uid/gid: 0/0
> > oslo.privsep.daemon: privsep process running with capabilities (eff/prm/inh): CAP_SYS_ADMIN/CAP_SYS_ADMIN/none
> > oslo.privsep.daemon: privsep daemon running as pid 76076
> > Traceback (most recent call last):
> >   File "/usr/libexec/vdsm/managedvolume-helper", line 154, in <module>
> >     sys.exit(main(sys.argv[1:]))
> >   File "/usr/libexec/vdsm/managedvolume-helper", line 77, in main
> >     args.command(args)
> >   File "/usr/libexec/vdsm/managedvolume-helper", line 149, in detach
> >     ignore_errors=False)
> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/nos_brick.py", line 121, in disconnect_volume
> >     run_as_root=True)
> >   File "/usr/lib/python2.7/site-packages/os_brick/executor.py", line 52, in _execute
> >     result = self.__execute(*args, **kwargs)
> >   File "/usr/lib/python2.7/site-packages/os_brick/privileged/rootwrap.py", line 169, in execute
> >     return execute_root(*cmd, **kwargs)
> >   File "/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 241, in _wrap
> >     return self.channel.remote_call(name, args, kwargs)
> >   File "/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 203, in remote_call
> >     raise exc_type(*result[2])
> > oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
> > Command: rbd unmap /dev/rbd/rbd/volume-0e8c1056-45d6-4740-934d-eb07a9f73160 --conf /tmp/brickrbd_LCKezP --id ovirt --mon_host 172.16.10.13:3300 --mon_host 172.16.10.14:3300 --mon_host 172.16.10.12:6789
> > Exit code: 16
> > Stdout: u''
> > Stderr: u'rbd: sysfs write failed\nrbd: unmap failed: (16) Device or resource busy\n'
> > '
> >
> > Traceback (most recent call last):
> >   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 124, in method
> >     ret = func(*args, **kwargs)
> >   File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1766, in detach_volume
> >     return managedvolume.detach_volume(vol_id)
> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/managedvolume.py", line 67, in wrapper
> >     return func(*args, **kwargs)
> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/managedvolume.py", line 135, in detach_volume
> >     run_helper("detach", vol_info)
> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/managedvolume.py", line 179, in run_helper
> >     sub_cmd, cmd_input=cmd_input)
> >   File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 56, in __call__
> >     return callMethod()
> >   File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 54, in <lambda>
> >     **kwargs)
> >   File "<string>", line 2, in managedvolume_run_helper
> >   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
> >     raise convert_to_error(kind, result)
> > ManagedVolumeHelperFailed: Managed Volume Helper failed.: ('Error
> > executing helper: [same helper output and rbd unmap failure as
> > above]',)
> >
> >
________________________________
The Networking People (TNP) Limited. Registered office: Network House, Caton Rd, Lancaster, LA1 3PE. Registered in England & Wales with company number: 07667393
This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited.
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/Q3LW7JSPVE6D6XPON7FZ7UDOCEVGERZP/