[Users] SD Disk's Logical Volume not visible/activated on some nodes
Nir Soffer
nsoffer at redhat.com
Sat Feb 22 15:19:43 EST 2014
----- Original Message -----
> From: "Boyan Tabakov" <blade at alslayer.net>
> To: "Nir Soffer" <nsoffer at redhat.com>
> Cc: users at ovirt.org
> Sent: Wednesday, February 19, 2014 7:18:36 PM
> Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes
>
> Hello,
>
> On 19.2.2014, 17:09, Nir Soffer wrote:
> > ----- Original Message -----
> >> From: "Boyan Tabakov" <blade at alslayer.net>
> >> To: users at ovirt.org
> >> Sent: Tuesday, February 18, 2014 3:34:49 PM
> >> Subject: [Users] SD Disk's Logical Volume not visible/activated on some
> >> nodes
> >>
> >> Hello,
> >>
> >> I have ovirt 3.3 installed on on FC 19 hosts with vdsm 4.13.3-2.fc19.
> >
> > Which version of ovirt 3.3 is this? (3.3.2? 3.3.3?)
>
> ovirt-engine is 3.3.2-1.fc19
>
> >> One of the hosts (host1) is engine + node + SPM and the other host2 is
> >> just a node. I have an iSCSI storage domain configured and accessible
> >> from both nodes.
> >>
> >> When creating a new disk in the SD, the underlying logical volume gets
> >> properly created (seen in vgdisplay output on host1), but doesn't seem
> >> to be automatically picked by host2.
> >
> > How do you know it is not seen on host2?
>
> It's not present in the output of vgdisplay -v nor vgs.
>
> >
> >> Consequently, when creating/booting
> >> a VM with the said disk attached, the VM fails to start on host2,
> >> because host2 can't see the LV. Similarly, if the VM is started on
> >> host1, it fails to migrate to host2. Extract from host2 log is in the
> >> end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.
> >>
> >> As far as I could track quickly the vdsm code, there is only call to lvs
> >> and not to lvscan or lvchange so the host2 LVM doesn't fully refresh.
> >> The only workaround so far has been to restart VDSM on host2, which
> >> makes it refresh all LVM data properly.
> >>
> >> When is host2 supposed to pick up any newly created LVs in the SD VG?
> >> Any suggestions where the problem might be?
> >
> > When you create a new lv on the shared storage, the new lv should be
> > visible on the other host. Lets start by verifying that you do see
> > the new lv after a disk was created.
> >
> > Try this:
> >
> > 1. Create a new disk, and check the disk uuid in the engine ui
> > 2. On another machine, run this command:
> >
> > lvs -o vg_name,lv_name,tags
> >
> > You can identify the new lv using tags, which should contain the new disk
> > uuid.
> >
> > If you don't see the new lv from the other host, please provide
> > /var/log/messages
> > and /var/log/sanlock.log.
>
> Just tried that. The disk is not visible on the non-SPM node.
This means that storage is not accessible from this host.
>
> On the SPM node (where the LV is visible in vgs output):
>
> Feb 19 19:10:43 host1 vdsm root WARNING File:
> /rhev/data-center/61f15cc0-8bba-482d-8a81-cd636a581b58/3307f6fa-dd58-43db-ab23-b1fb299006c7/images/4d15543c-4c45-4c23-bbe3-f10b9084472a/3e0ce8cb-3740-49d7-908e-d025875ac9a2
> already removed
> Feb 19 19:10:45 host1 multipathd: dm-65: remove map (uevent)
> Feb 19 19:10:45 host1 multipathd: dm-65: devmap not registered, can't remove
> Feb 19 19:10:45 host1 multipathd: dm-65: remove map (uevent)
> Feb 19 19:10:54 host1 kernel: [1652684.864746] dd: sending ioctl
> 80306d02 to a partition!
> Feb 19 19:10:54 host1 kernel: [1652684.963931] dd: sending ioctl
> 80306d02 to a partition!
>
> No recent entries in sanlock.log on the SPM node.
>
> On the non-SPM node (the one that doesn't show the LV in vgs output),
> there are no relevant entries in /var/log/messages.
Strange - sanlock errors are logged to /var/log/messages. It would be helpful if
you attach this log - we may find something in it.
> Here's the full
> sanlock.log for that host:
>
> 2014-01-30 16:28:09+0200 1324 [2335]: sanlock daemon started 2.8 host
> 18bd0a27-c280-4007-98f2-d2e7e73cd8b5.xenon.futu
> 2014-01-30 16:59:51+0200 5 [609]: sanlock daemon started 2.8 host
> 4a7627e2-296a-4e48-a7e2-f6bcecac07ab.xenon.futu
> 2014-01-31 09:51:43+0200 60717 [614]: s1 lockspace
> 3307f6fa-dd58-43db-ab23-b1fb299006c7:2:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids:0
> 2014-01-31 16:03:51+0200 83045 [613]: s1:r1 resource
> 3307f6fa-dd58-43db-ab23-b1fb299006c7:SDM:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/leases:1048576
> for 8,16,30268
> 2014-01-31 16:18:01+0200 83896 [614]: s1:r2 resource
> 3307f6fa-dd58-43db-ab23-b1fb299006c7:SDM:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/leases:1048576
> for 8,16,30268
> 2014-02-06 05:24:10+0200 563065 [31453]: 3307f6fa aio timeout 0
> 0x7fc37c0008c0:0x7fc37c0008d0:0x7fc391f5f000 ioto 10 to_count 1
> 2014-02-06 05:24:10+0200 563065 [31453]: s1 delta_renew read rv -202
> offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids
Sanlock cannot write to the ids lockspace
> 2014-02-06 05:24:10+0200 563065 [31453]: s1 renewal error -202
> delta_length 10 last_success 563034
> 2014-02-06 05:24:21+0200 563076 [31453]: 3307f6fa aio timeout 0
> 0x7fc37c000910:0x7fc37c000920:0x7fc391d5c000 ioto 10 to_count 2
> 2014-02-06 05:24:21+0200 563076 [31453]: s1 delta_renew read rv -202
> offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids
> 2014-02-06 05:24:21+0200 563076 [31453]: s1 renewal error -202
> delta_length 11 last_success 563034
> 2014-02-06 05:24:32+0200 563087 [31453]: 3307f6fa aio timeout 0
> 0x7fc37c000960:0x7fc37c000970:0x7fc391c5a000 ioto 10 to_count 3
> 2014-02-06 05:24:32+0200 563087 [31453]: s1 delta_renew read rv -202
> offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids
> 2014-02-06 05:24:32+0200 563087 [31453]: s1 renewal error -202
> delta_length 11 last_success 563034
> 2014-02-06 05:24:40+0200 563094 [609]: s1 check_our_lease warning 60
> last_success 563034
> 2014-02-06 05:24:41+0200 563095 [609]: s1 check_our_lease warning 61
> last_success 563034
> 2014-02-06 05:24:42+0200 563096 [609]: s1 check_our_lease warning 62
> last_success 563034
> 2014-02-06 05:24:43+0200 563097 [609]: s1 check_our_lease warning 63
> last_success 563034
> 2014-02-06 05:24:43+0200 563098 [31453]: 3307f6fa aio timeout 0
> 0x7fc37c0009b0:0x7fc37c0009c0:0x7fc391b58000 ioto 10 to_count 4
> 2014-02-06 05:24:43+0200 563098 [31453]: s1 delta_renew read rv -202
> offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids
> 2014-02-06 05:24:43+0200 563098 [31453]: s1 renewal error -202
> delta_length 11 last_success 563034
> 2014-02-06 05:24:44+0200 563098 [609]: s1 check_our_lease warning 64
> last_success 563034
> 2014-02-06 05:24:45+0200 563099 [609]: s1 check_our_lease warning 65
> last_success 563034
> 2014-02-06 05:24:46+0200 563100 [609]: s1 check_our_lease warning 66
> last_success 563034
> 2014-02-06 05:24:47+0200 563101 [609]: s1 check_our_lease warning 67
> last_success 563034
> 2014-02-06 05:24:48+0200 563102 [609]: s1 check_our_lease warning 68
> last_success 563034
> 2014-02-06 05:24:49+0200 563103 [609]: s1 check_our_lease warning 69
> last_success 563034
> 2014-02-06 05:24:50+0200 563104 [609]: s1 check_our_lease warning 70
> last_success 563034
> 2014-02-06 05:24:51+0200 563105 [609]: s1 check_our_lease warning 71
> last_success 563034
> 2014-02-06 05:24:52+0200 563106 [609]: s1 check_our_lease warning 72
> last_success 563034
> 2014-02-06 05:24:53+0200 563107 [609]: s1 check_our_lease warning 73
> last_success 563034
> 2014-02-06 05:24:54+0200 563108 [609]: s1 check_our_lease warning 74
> last_success 563034
> 2014-02-06 05:24:54+0200 563109 [31453]: s1 delta_renew read rv -2
> offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids
> 2014-02-06 05:24:54+0200 563109 [31453]: s1 renewal error -2
> delta_length 11 last_success 563034
> 2014-02-06 05:24:55+0200 563109 [609]: s1 check_our_lease warning 75
> last_success 563034
> 2014-02-06 05:24:56+0200 563110 [609]: s1 check_our_lease warning 76
> last_success 563034
> 2014-02-06 05:24:57+0200 563111 [609]: s1 check_our_lease warning 77
> last_success 563034
> 2014-02-06 05:24:58+0200 563112 [609]: s1 check_our_lease warning 78
> last_success 563034
> 2014-02-06 05:24:59+0200 563113 [609]: s1 check_our_lease warning 79
> last_success 563034
> 2014-02-06 05:25:00+0200 563114 [609]: s1 check_our_lease failed 80
> 2014-02-06 05:25:00+0200 563114 [609]: s1 all pids clear
> 2014-02-06 05:25:05+0200 563119 [31453]: s1 delta_renew read rv -2
> offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids
> 2014-02-06 05:25:05+0200 563119 [31453]: s1 renewal error -2
> delta_length 10 last_success 563034
> 2014-02-06 05:25:15+0200 563129 [31453]: 3307f6fa close_task_aio 0
> 0x7fc37c0008c0 busy
> 2014-02-06 05:25:15+0200 563129 [31453]: 3307f6fa close_task_aio 1
> 0x7fc37c000910 busy
> 2014-02-06 05:25:15+0200 563129 [31453]: 3307f6fa close_task_aio 2
> 0x7fc37c000960 busy
> 2014-02-06 05:25:15+0200 563129 [31453]: 3307f6fa close_task_aio 3
> 0x7fc37c0009b0 busy
> 2014-02-06 05:25:25+0200 563139 [31453]: 3307f6fa close_task_aio 0
> 0x7fc37c0008c0 busy
> 2014-02-06 05:25:25+0200 563139 [31453]: 3307f6fa close_task_aio 1
> 0x7fc37c000910 busy
> 2014-02-06 05:25:25+0200 563139 [31453]: 3307f6fa close_task_aio 2
> 0x7fc37c000960 busy
> 2014-02-06 05:25:25+0200 563139 [31453]: 3307f6fa close_task_aio 3
> 0x7fc37c0009b0 busy
> 2014-02-06 05:25:35+0200 563149 [31453]: 3307f6fa close_task_aio 0
> 0x7fc37c0008c0 busy
> 2014-02-06 05:25:35+0200 563149 [31453]: 3307f6fa close_task_aio 1
> 0x7fc37c000910 busy
> 2014-02-06 05:25:35+0200 563149 [31453]: 3307f6fa close_task_aio 2
> 0x7fc37c000960 busy
> 2014-02-06 05:25:35+0200 563149 [31453]: 3307f6fa close_task_aio 3
> 0x7fc37c0009b0 busy
> 2014-02-06 05:25:45+0200 563159 [31453]: 3307f6fa close_task_aio 0
> 0x7fc37c0008c0 busy
> 2014-02-06 05:25:45+0200 563159 [31453]: 3307f6fa close_task_aio 1
> 0x7fc37c000910 busy
> 2014-02-06 05:25:45+0200 563159 [31453]: 3307f6fa close_task_aio 2
> 0x7fc37c000960 busy
> 2014-02-06 05:25:45+0200 563159 [31453]: 3307f6fa close_task_aio 3
> 0x7fc37c0009b0 busy
> 2014-02-06 05:25:55+0200 563169 [31453]: 3307f6fa close_task_aio 0
> 0x7fc37c0008c0 busy
> 2014-02-06 05:25:55+0200 563169 [31453]: 3307f6fa close_task_aio 1
> 0x7fc37c000910 busy
> 2014-02-06 05:25:55+0200 563169 [31453]: 3307f6fa close_task_aio 2
> 0x7fc37c000960 busy
> 2014-02-06 05:25:55+0200 563169 [31453]: 3307f6fa close_task_aio 3
> 0x7fc37c0009b0 busy
> 2014-02-06 05:25:59+0200 563173 [31453]: 3307f6fa aio collect
> 0x7fc37c0008c0:0x7fc37c0008d0:0x7fc391f5f000 result 1048576:0 close free
> 2014-02-06 05:25:59+0200 563173 [31453]: 3307f6fa aio collect
> 0x7fc37c000910:0x7fc37c000920:0x7fc391d5c000 result 1048576:0 close free
> 2014-02-06 05:25:59+0200 563173 [31453]: 3307f6fa aio collect
> 0x7fc37c000960:0x7fc37c000970:0x7fc391c5a000 result 1048576:0 close free
> 2014-02-06 05:25:59+0200 563173 [31453]: 3307f6fa aio collect
> 0x7fc37c0009b0:0x7fc37c0009c0:0x7fc391b58000 result 1048576:0 close free
> 2014-02-06 05:26:09+0200 563183 [614]: s2 lockspace
> 3307f6fa-dd58-43db-ab23-b1fb299006c7:2:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids:0
Sanlock can access the storage again - did you change something around this time?
> 2014-02-18 14:22:16+0200 1632150 [614]: s3 lockspace
> 3307f6fa-dd58-43db-ab23-b1fb299006c7:2:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids:0
> 2014-02-18 14:38:16+0200 1633111 [614]: s4 lockspace
> 3307f6fa-dd58-43db-ab23-b1fb299006c7:2:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids:0
> 2014-02-18 16:13:07+0200 1638801 [613]: s5 lockspace
> 3307f6fa-dd58-43db-ab23-b1fb299006c7:2:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids:0
> 2014-02-18 19:28:09+0200 1650504 [614]: s6 lockspace
> 3307f6fa-dd58-43db-ab23-b1fb299006c7:2:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids:0
>
> Last entry is from yesterday, while I just created a new disk.
What was the status of this host in the engine from 2014-02-06 05:24:10+0200 to 2014-02-18 14:22:16?
vdsm.log and engine.log for this time frame will make it more clear.
Nir
More information about the Users
mailing list