Trying to use Managed Block Storage in 4.3.2 with Ceph / Authentication Keys

Hi,

I upgraded my test environment to 4.3.2 and now I'm trying to set up a "Managed Block Storage" domain with our Ceph 12.2 cluster. I think I have all prerequisites in place, but when saving the configuration for the domain with volume_driver "cinder.volume.drivers.rbd.RBDDriver" (and a couple of other options) I get "VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: Error connecting to ceph cluster" in the engine log (full error below). Unfortunately this is a rather generic error message and I don't really know where to look next. Accessing the rbd pool from the engine host with the rbd CLI and the configured "rbd_user" works flawlessly...

Although I don't think this is directly connected, there is one other question that comes up for me: how are libvirt "Authentication Keys" handled with Ceph "Managed Block Storage" domains? With "standalone Cinder" setups like the one we are using now, you have to configure a "provider" of type "OpenStack Block Storage" where you can configure these keys, which are referenced in cinder.conf as "rbd_secret_uuid". How is this supposed to work now?

Thanks for any advice; we are using oVirt with Ceph heavily and are very interested in a tight integration of oVirt and Ceph.

Matthias

2019-04-01 11:14:55,128+02 ERROR [org.ovirt.engine.core.common.utils.cinderlib.CinderlibExecutor] (default task-22) [b6665621-6b85-438e-8c68-266f33e55d79] cinderlib execution failed:
Traceback (most recent call last):
  File "./cinderlib-client.py", line 187, in main
    args.command(args)
  File "./cinderlib-client.py", line 275, in storage_stats
    backend = load_backend(args)
  File "./cinderlib-client.py", line 217, in load_backend
    return cl.Backend(**json.loads(args.driver))
  File "/usr/lib/python2.7/site-packages/cinderlib/cinderlib.py", line 87, in __init__
    self.driver.check_for_setup_error()
  File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py", line 288, in check_for_setup_error
    with RADOSClient(self):
  File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py", line 170, in __init__
    self.cluster, self.ioctx = driver._connect_to_rados(pool)
  File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py", line 346, in _connect_to_rados
    return _do_conn(pool, remote, timeout)
  File "/usr/lib/python2.7/site-packages/cinder/utils.py", line 799, in _wrapper
    return r.call(f, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/retrying.py", line 229, in call
    raise attempt.get()
  File "/usr/lib/python2.7/site-packages/retrying.py", line 261, in get
    six.reraise(self.value[0], self.value[1], self.value[2])
  File "/usr/lib/python2.7/site-packages/retrying.py", line 217, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py", line 344, in _do_conn
    raise exception.VolumeBackendAPIException(data=msg)
VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: Error connecting to ceph cluster.
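For reference, the kind of manual rbd CLI check mentioned above (accessing the pool with the configured "rbd_user") could look like the following; the client name "ovirt" and the keyring path are placeholders rather than values from this setup, and the pool name "ovirt-test" is the one that appears later in this thread:

  rbd --id ovirt --keyring /etc/ceph/ceph.client.ovirt.keyring -p ovirt-test ls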

Hi,

Thanks for trying this out! We added a separate log file for cinderlib in 4.3.2; it should be available under /var/log/ovirt-engine/cinderlib/cinderlib.log. The logs are not perfect yet, and more improvements are coming, but they might provide some insight into the issue.

> how are libvirt "Authentication Keys" handled with Ceph "Managed Block Storage" domains? [...] How is this supposed to work now?

Now you are supposed to pass the secret in the driver options, something like this (using REST):

<property>
  <name>rbd_ceph_conf</name>
  <value>/etc/ceph/ceph.conf</value>
</property>
<property>
  <name>rbd_keyring_conf</name>
  <value>/etc/ceph/ceph.client.admin.keyring</value>
</property>
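For illustration, a fuller set of driver options for an RBD-backed domain might look like the following. Only rbd_ceph_conf and rbd_keyring_conf are shown in the reply above; volume_driver, rbd_pool and rbd_user come up elsewhere in this thread. The client name "ovirt" and the keyring path are placeholders, and the pool name matches the one used later in the thread:

<property>
  <name>volume_driver</name>
  <value>cinder.volume.drivers.rbd.RBDDriver</value>
</property>
<property>
  <name>rbd_pool</name>
  <value>ovirt-test</value>
</property>
<property>
  <name>rbd_user</name>
  <value>ovirt</value>
</property>
<property>
  <name>rbd_ceph_conf</name>
  <value>/etc/ceph/ceph.conf</value>
</property>
<property>
  <name>rbd_keyring_conf</name>
  <value>/etc/ceph/ceph.client.ovirt.keyring</value>
</property>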

On 01.04.19 at 12:07, Benny Zlotnik wrote:
> Thanks for trying this out! We added a separate log file for cinderlib in 4.3.2, it should be available under /var/log/ovirt-engine/cinderlib/cinderlib.log [...]
OK, /var/log/ovirt-engine/cinderlib/cinderlib.log says:

2019-04-01 11:14:54,925 - cinder.volume.drivers.rbd - ERROR - Error connecting to ceph cluster.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py", line 337, in _do_conn
    client.connect()
  File "rados.pyx", line 885, in rados.Rados.connect (/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.11/rpm/el7/BUILD/ceph-12.2.11/build/src/pybind/rados/pyrex/rados.c:9785)
OSError: [errno 95] error connecting to the cluster
2019-04-01 11:14:54,930 - root - ERROR - Failure occurred when trying to run command 'storage_stats': Bad or unexpected response from the storage volume backend API: Error connecting to ceph cluster.

I don't really know what to do with that either. BTW, the cinder version on the engine host is "pike" (openstack-cinder-11.2.0-1.el7.noarch).
> Now you are supposed to pass the secret in the driver options, something like this (using REST): [...]

Shall I pass "rbd_secret_uuid" in the driver options? But where is this UUID created? Where is the ceph secret key stored in oVirt?

Thanks
Matthias

> I don't really know what to do with that either. BTW, the cinder version on engine host is "pike" (openstack-cinder-11.2.0-1.el7.noarch)

Not sure if the version is related (I know it's been tested with pike), but you can try and install the latest rocky (that's what I use for development).
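For reference, one possible way to get the rocky packages on a CentOS 7 host is via the RDO release repository. This is an assumption about the environment rather than something stated in the thread, and the exact package names may differ:

  yum install -y centos-release-openstack-rocky
  yum upgrade -y openstack-cinder python2-os-brick
  yum install -y ceph-common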
> Shall I pass "rbd_secret_uuid" in the driver options? But where is this UUID created? Where is the ceph secret key stored in oVirt?

I don't think it's needed, as ceph-based volumes are no longer a network disk like in the cinder integration; they are attached like regular block devices. The only things that are a must now are "rbd_keyring_conf" and "rbd_ceph_conf" (you don't need the first if the path to the keyring is configured in the latter). And I think you get the error because it's missing or incorrect, since I manually removed the keyring path from the configuration and got the same error as you.
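To illustrate the second point: if the ceph.conf referenced by rbd_ceph_conf already names the keyring, rbd_keyring_conf can be left out. A minimal client-side ceph.conf along those lines might look like this; the client name "ovirt" and the bracketed values are placeholders:

  [global]
  fsid = <cluster fsid>
  mon_host = <mon1>,<mon2>,<mon3>

  [client.ovirt]
  keyring = /etc/ceph/ceph.client.ovirt.keyring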

On 01.04.19 at 13:17, Benny Zlotnik wrote:
> Not sure if the version is related (I know it's been tested with pike), but you can try and install the latest rocky (that's what I use for development)
I upgraded cinder on the engine and hypervisors to rocky and installed the missing "ceph-common" packages on the hypervisors. I set "rbd_keyring_conf" and "rbd_ceph_conf" as indicated and got as far as adding a "Managed Block Storage" domain and creating a disk (which is also visible through "rbd ls"). I used a keyring that is only authorized for the pool I specified with "rbd_pool". When I try to start the VM it fails and I see the following in supervdsm.log on the hypervisor:

ManagedVolumeHelperFailed: Managed Volume Helper failed.: Error executing helper: Command ['/usr/libexec/vdsm/managedvolume-helper', 'attach'] failed with rc=1 out='' err=
oslo.privsep.daemon: Running privsep helper: ['sudo', 'privsep-helper', '--privsep_context', 'os_brick.privileged.default', '--privsep_sock_path', '/tmp/tmp5S8zZV/privsep.sock']
oslo.privsep.daemon: Spawned new privsep daemon via rootwrap
oslo.privsep.daemon: privsep daemon starting
oslo.privsep.daemon: privsep process running with uid/gid: 0/0
oslo.privsep.daemon: privsep process running with capabilities (eff/prm/inh): CAP_SYS_ADMIN/CAP_SYS_ADMIN/none
oslo.privsep.daemon: privsep daemon running as pid 15944
Traceback (most recent call last):
  File "/usr/libexec/vdsm/managedvolume-helper", line 154, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/usr/libexec/vdsm/managedvolume-helper", line 77, in main
    args.command(args)
  File "/usr/libexec/vdsm/managedvolume-helper", line 137, in attach
    attachment = conn.connect_volume(conn_info['data'])
  File "/usr/lib/python2.7/site-packages/vdsm/storage/nos_brick.py", line 96, in connect_volume
    run_as_root=True)
  File "/usr/lib/python2.7/site-packages/os_brick/executor.py", line 52, in _execute
    result = self.__execute(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/os_brick/privileged/rootwrap.py", line 169, in execute
    return execute_root(*cmd, **kwargs)
  File "/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 207, in _wrap
    return self.channel.remote_call(name, args, kwargs)
  File "/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 202, in remote_call
    raise exc_type(*result[2])
oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
Command: rbd map volume-36f5eb75-329e-4bd2-88d0-6f0bfe5d1040 --pool ovirt-test --conf /tmp/brickrbd_RmBvxA --id None --mon_host xxx.xxx.216.45:6789 --mon_host xxx.xxx.216.54:6789 --mon_host xxx.xxx.216.55:6789
Exit code: 22
Stdout: u'In some cases useful info is found in syslog - try "dmesg | tail".\n'
Stderr: u"2019-04-01 15:27:30.743196 7fe0b4632d40 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.None.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
rbd: sysfs write failed
2019-04-01 15:27:30.746987 7fe0b4632d40 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.None.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2019-04-01 15:27:30.747896 7fe0b4632d40 -1 monclient: authenticate NOTE: no keyring found; disabled cephx authentication
2019-04-01 15:27:30.747903 7fe0b4632d40 0 librados: client.None authentication error (95) Operation not supported
rbd: couldn't connect to the cluster!
rbd: map failed: (22) Invalid argument"

I tried to provide a /etc/ceph directory with ceph.conf and client keyring on the hypervisors (as configured in the driver options). This didn't solve it and doesn't seem to be the right way, as the mentioned /tmp/brickrbd_RmBvxA contains the needed keyring data. Please give me some advice about what's wrong.

Thanks
Matthias

Did you pass the rbd_user when creating the storage domain?
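The "--id None" in the rbd map command from the log above points in the same direction: with rbd_user set to a real client name ("ovirt" is just a placeholder here), the helper's mapping step would correspond to something like:

  rbd map ovirt-test/volume-36f5eb75-329e-4bd2-88d0-6f0bfe5d1040 --id ovirt --conf /etc/ceph/ceph.conf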

I added an example for ceph [1]

[1] https://github.com/oVirt/ovirt-site/blob/468c79a05358e20289e7403d9dd24732ab4...

On Mon, Apr 1, 2019 at 5:24 PM Benny Zlotnik <bzlotnik@redhat.com> wrote:
> Did you pass the rbd_user when creating the storage domain?

No, I didn't... I wasn't used to using both "rbd_user" and "rbd_keyring_conf" (I don't use "rbd_keyring_conf" in standalone Cinder). Never mind: after fixing that and dealing with the rbd feature issues I could proudly start my first VM with a cinderlib-provisioned disk :-)

Thanks for the help! I'll keep posting my experiences concerning cinderlib to this list.

Matthias
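The "rbd feature issues" are not spelled out in this thread. A common cause when images are mapped with the kernel RBD client on an EL7 kernel is image features the kernel does not support, which can be disabled per image, for example:

  rbd feature disable ovirt-test/volume-36f5eb75-329e-4bd2-88d0-6f0bfe5d1040 object-map fast-diff deep-flatten

Alternatively, "rbd default features" can be lowered in ceph.conf so that new volumes are created without those features; whether that is appropriate depends on the cluster.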

Glad to hear it!
Participants (2):
- Benny Zlotnik
- Matthias Leopold