
On 10/15/2020 10:01 AM, Michael Thomas wrote:
Getting closer...
I recreated the storage domain and added rbd_default_features=3 to ceph.conf. Now I see the new disk being created with (what I think is) the correct set of features:
# rbd info rbd.ovirt.data/volume-f4ac68c6-e5f7-4b01-aed0-36a55b901fbf
rbd image 'volume-f4ac68c6-e5f7-4b01-aed0-36a55b901fbf':
        size 100 GiB in 25600 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 70aab541cb331
        block_name_prefix: rbd_data.70aab541cb331
        format: 2
        features: layering
        op_features:
        flags:
        create_timestamp: Thu Oct 15 06:53:23 2020
        access_timestamp: Thu Oct 15 06:53:23 2020
        modify_timestamp: Thu Oct 15 06:53:23 2020
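For anyone following along, the change itself is just one extra line in ceph.conf on the client side; a minimal sketch (whether it lives under [global] or [client] shouldn't matter for a client-side default, and 3 is the bitmask for layering + striping):

    [client]
    rbd_default_features = 3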
However, I'm still unable to attach the disk to a VM. This time it's a permissions issue on the ovirt node where the VM is running. It looks like it can't read the temporary ceph config file that is sent over from the engine:
Are you using octopus? If so, the config file that's generated is missing the "[global]" at the top and octopus doesn't like that. It's been patched upstream.
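In other words, the temporary file should begin with a section header before any options, along these lines (the values below are only placeholders, not what the engine actually writes):

    # cat /tmp/brickrbd_nwc3kywk
    [global]
    mon host = <your mon addresses>
    <remaining generated client options>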
The file '/tmp/brickrbd_nwc3kywk' on the ovirt node is only accessible by root:
[root@ovirt4 ~]# ls -l /tmp/brickrbd_nwc3kywk
-rw-------. 1 root root 146 Oct 15 07:25 /tmp/brickrbd_nwc3kywk
...and I'm guessing that it's being accessed by the vdsm user?
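Given the 0600 root:root mode above, a quick check as the vdsm user should confirm that guess, e.g.:

    [root@ovirt4 ~]# sudo -u vdsm cat /tmp/brickrbd_nwc3kywk
    cat: /tmp/brickrbd_nwc3kywk: Permission denied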
--Mike
On 10/14/20 10:59 AM, Michael Thomas wrote:
Hi Benny,
You are correct, I tried attaching to a running VM (which failed), then tried booting a new VM using this disk (which also failed). I'll use the workaround in the bug report going forward.
I'll just recreate the storage domain, since at this point I have nothing in it to lose.
Regards,
--Mike
On 10/14/20 9:32 AM, Benny Zlotnik wrote:
Did you attempt to start a VM with this disk and it failed, or did you not try at all? If it's the latter, then the error is strange... If it's the former, there is a known issue with multipath at the moment; see [1] for a workaround. Without it you may also have trouble detaching volumes later, because multipath grabs the rbd devices, which makes `rbd unmap` fail. It will be fixed soon by automatically blacklisting rbd in the multipath configuration.
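Roughly, the workaround boils down to blacklisting the rbd device nodes in multipath on the hosts; a sketch (the exact snippet is in the bug comment, and the drop-in file name here is arbitrary):

    # cat /etc/multipath/conf.d/rbd.conf
    blacklist {
            devnode "^rbd[0-9]*"
    }
    # systemctl reload multipathd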
Regarding editing the driver options: you can submit an RFE for this, but it is currently not possible. The options are indeed to either recreate the storage domain or edit the database table.
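If you go the database route, it would look something like this on the engine host, using the volume id from the delete_volume error (a sketch only; the database name depends on what you created for the cinderlib setup, and take a backup first):

    # su - postgres -c 'psql ovirt_cinderlib'
    ovirt_cinderlib=# DELETE FROM volume_attachment WHERE volume_id = '5419640e-445f-4b3f-a29d-b316ad031b7a';
    ovirt_cinderlib=# DELETE FROM volumes WHERE id = '5419640e-445f-4b3f-a29d-b316ad031b7a';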
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1881832#c8
On Wed, Oct 14, 2020 at 3:40 PM Michael Thomas <wart@caltech.edu> wrote:
On 10/14/20 3:30 AM, Benny Zlotnik wrote:
Jeff is right, it's a limitation of kernel rbd, the recommendation is to add `rbd default features = 3` to the configuration. I think there are plans to support rbd-nbd in cinderlib which would allow using additional features, but I'm not aware of anything concrete.
Additionally, the path for the cinderlib log is /var/log/ovirt-engine/cinderlib/cinderlib.log. The error in this case would appear in the vdsm.log on the relevant host and would look something like "RBD image feature set mismatch. You can disable features unsupported by the kernel with 'rbd feature disable'".
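For an image that was already created with the extra features, they can also be dropped after the fact with something like (the image spec here is just an example):

    # rbd feature disable rbd.ovirt.data/volume-<uuid> object-map fast-diff deep-flatten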
Thanks for the pointer! Indeed, /var/log/ovirt-engine/cinderlib/cinderlib.log has the errors that I was looking for. In this case, it was a user error entering the RBDDriver options:
2020-10-13 15:15:25,640 - cinderlib.cinderlib - WARNING - Unknown config option use_multipath_for_xfer
...it should have been 'use_multipath_for_image_xfer'.
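For reference, the driver options I meant to enter for this Managed Block Storage domain look roughly like this (the rbd_user and keyring path are placeholders for my local values; the rest follows the standard RBD driver setup):

    volume_driver=cinder.volume.drivers.rbd.RBDDriver
    rbd_ceph_conf=/etc/ceph/ceph.conf
    rbd_pool=rbd.ovirt.data
    rbd_user=<cephx user>
    rbd_keyring_conf=/etc/ceph/ceph.client.<user>.keyring
    use_multipath_for_image_xfer=true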
Now my attempts to fix it are failing... If I go to 'Storage -> Storage Domains -> Manage Domain', all driver options are uneditable except for 'Name'.
Then I thought that maybe I can't edit the driver options while a disk still exists, so I tried removing the one disk in this domain. But even after multiple attempts, it still fails with:
2020-10-14 07:26:31,340 - cinder.volume.drivers.rbd - INFO - volume volume-5419640e-445f-4b3f-a29d-b316ad031b7a no longer exists in backend
2020-10-14 07:26:31,353 - cinderlib-client - ERROR - Failure occurred when trying to run command 'delete_volume': (psycopg2.IntegrityError) update or delete on table "volumes" violates foreign key constraint "volume_attachment_volume_id_fkey" on table "volume_attachment"
DETAIL:  Key (id)=(5419640e-445f-4b3f-a29d-b316ad031b7a) is still referenced from table "volume_attachment".
See https://pastebin.com/KwN1Vzsp for the full log entries related to this removal.
It's not lying: the volume no longer exists in the rbd pool, but the cinder database still thinks it's attached, even though I was never able to get it to attach to a VM.
What are my options for cleaning up this stale disk in the cinder database?
How can I update the driver options in my storage domain (deleting and recreating the domain is acceptable, if possible)?
--Mike