Getting closer...
I recreated the storage domain and added rbd_default_features=3 to
ceph.conf. Now I see the new disk being created with (what I think is)
the correct set of features:
# rbd info rbd.ovirt.data/volume-f4ac68c6-e5f7-4b01-aed0-36a55b901fbf
rbd image 'volume-f4ac68c6-e5f7-4b01-aed0-36a55b901fbf':
        size 100 GiB in 25600 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 70aab541cb331
        block_name_prefix: rbd_data.70aab541cb331
        format: 2
        features: layering
        op_features:
        flags:
        create_timestamp: Thu Oct 15 06:53:23 2020
        access_timestamp: Thu Oct 15 06:53:23 2020
        modify_timestamp: Thu Oct 15 06:53:23 2020
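For reference, the ceph.conf change is just the single line below; I'm
showing it under [global] here, though the exact section may vary by
setup (the value 3 is the feature bitmask for layering + striping):
[global]
rbd_default_features = 3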
However, I'm still unable to attach the disk to a VM. This time it's a
permissions issue on the ovirt node where the VM is running. It looks
like it can't read the temporary ceph config file that is sent over from
the engine.
The file '/tmp/brickrbd_nwc3kywk' on the ovirt node is only accessible
by root:
[root@ovirt4 ~]# ls -l /tmp/brickrbd_nwc3kywk
-rw-------. 1 root root 146 Oct 15 07:25 /tmp/brickrbd_nwc3kywk
...and I'm guessing that it's being accessed by the vdsm user?
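A quick way to confirm that guess (hypothetical check, run on the
ovirt node while the temporary file still exists):
[root@ovirt4 ~]# sudo -u vdsm cat /tmp/brickrbd_nwc3kywk
If that fails with "Permission denied", the root-only 0600 mode is
indeed the problem for the vdsm user.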
--Mike
On 10/14/20 10:59 AM, Michael Thomas wrote:
Hi Benny,
You are correct, I tried attaching to a running VM (which failed), then
tried booting a new VM using this disk (which also failed). I'll use
the workaround in the bug report going forward.
I'll just recreate the storage domain, since at this point I have
nothing in it to lose.
Regards,
--Mike
On 10/14/20 9:32 AM, Benny Zlotnik wrote:
> Did you attempt to start a VM with this disk and it failed, or you
> didn't try at all? If it's the latter then the error is strange...
> If it's the former, there is a known issue with multipath at the
> moment; see [1] for a workaround. Otherwise you might have issues
> detaching volumes later, because multipath grabs the rbd devices and
> makes `rbd unmap` fail. It will be fixed soon by automatically
> blacklisting rbd in the multipath configuration.
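>
> For reference, the workaround amounts to blacklisting the rbd
> devnodes in the multipath configuration, roughly along these lines
> (sketch only; the exact file and syntax are in [1]):
>
> blacklist {
>     devnode "^rbd[0-9]*"
> }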
>
> Regarding editing, you can submit an RFE for this, but it is currently
> not possible. The options are indeed to either recreate the storage
> domain or edit the database table.
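>
> If you do edit it directly, the foreign key in the error below is on
> the volume_attachment table, so the cleanup would be roughly this
> (untested sketch; ovirt_cinderlib is the database name I'd expect by
> default, adjust to your setup, and take a backup first):
>
> su - postgres -c "psql -d ovirt_cinderlib"
> DELETE FROM volume_attachment
>     WHERE volume_id = '5419640e-445f-4b3f-a29d-b316ad031b7a';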
>
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1881832#c8
>
>
>
>
> On Wed, Oct 14, 2020 at 3:40 PM Michael Thomas <wart(a)caltech.edu> wrote:
>>
>> On 10/14/20 3:30 AM, Benny Zlotnik wrote:
>>> Jeff is right; it's a limitation of kernel rbd. The recommendation is
>>> to add `rbd default features = 3` to the configuration. I think there
>>> are plans to support rbd-nbd in cinderlib, which would allow using
>>> additional features, but I'm not aware of anything concrete.
>>>
>>> Additionally, the path for the cinderlib log is
>>> /var/log/ovirt-engine/cinderlib/cinderlib.log. The error in this case
>>> would appear in the vdsm.log on the relevant host and would look
>>> something like "RBD image feature set mismatch. You can disable
>>> features unsupported by the kernel with 'rbd feature disable'"
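>>>
>>> For an image that already exists, that would be something like this
>>> (example only; the pool/volume names and the feature list should
>>> match what `rbd info` reports for the image):
>>>
>>> rbd feature disable <pool>/<volume> object-map fast-diff deep-flatten exclusive-lock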
>>
>> Thanks for the pointer! Indeed,
>> /var/log/ovirt-engine/cinderlib/cinderlib.log has the errors that I was
>> looking for. In this case, it was a user error entering the RBDDriver
>> options:
>>
>>
>> 2020-10-13 15:15:25,640 - cinderlib.cinderlib - WARNING - Unknown config
>> option use_multipath_for_xfer
>>
>> ...it should have been 'use_multipath_for_image_xfer'.
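>>
>> For anyone else hitting this, the option names are the cinder RBD
>> driver's config options; a typical set for a domain like this looks
>> roughly as follows (illustrative values only):
>>
>> volume_driver=cinder.volume.drivers.rbd.RBDDriver
>> rbd_pool=rbd.ovirt.data
>> rbd_ceph_conf=/etc/ceph/ceph.conf
>> rbd_user=<cephx user>
>> use_multipath_for_image_xfer=true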
>>
>> Now my attempts to fix it are failing... If I go to 'Storage -> Storage
>> Domains -> Manage Domain', all driver options are uneditable except for
>> 'Name'.
>>
>> Then I thought that maybe I can't edit the driver options while a disk
>> still exists, so I tried removing the one disk in this domain. But even
>> after multiple attempts, it still fails with:
>>
>> 2020-10-14 07:26:31,340 - cinder.volume.drivers.rbd - INFO - volume
>> volume-5419640e-445f-4b3f-a29d-b316ad031b7a no longer exists in backend
>> 2020-10-14 07:26:31,353 - cinderlib-client - ERROR - Failure occurred
>> when trying to run command 'delete_volume': (psycopg2.IntegrityError)
>> update or delete on table "volumes" violates foreign key constraint
>> "volume_attachment_volume_id_fkey" on table
"volume_attachment"
>> DETAIL: Key (id)=(5419640e-445f-4b3f-a29d-b316ad031b7a) is still
>> referenced from table "volume_attachment".
>>
>> See https://pastebin.com/KwN1Vzsp for the full log entries related to
>> this removal.
>>
>> It's not lying, the volume no longer exists in the rbd pool, but the
>> cinder database still thinks it's attached, even though I was never able
>> to get it to attach to a VM.
>>
>> What are my options for cleaning up this stale disk in the cinder database?
>>
>> How can I update the driver options in my storage domain (deleting and
>> recreating the domain is acceptable, if possible)?
>>
>> --Mike
>>
>