Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain

Hi Nir,

Do you need any more info from me? I missed the 'sanlock client host_status -D' request, info below:

[root@ovirt002 ~]# sanlock client host_status -D
lockspace a52938f7-2cf4-4771-acb2-0c78d14999e5
1 timestamp 0
    last_check=176740 last_live=205 last_req=0 owner_id=1 owner_generation=5 timestamp=0 io_timeout=10
2 timestamp 176719
    last_check=176740 last_live=176740 last_req=0 owner_id=2 owner_generation=7 timestamp=176719 io_timeout=10
250 timestamp 0
    last_check=176740 last_live=205 last_req=0 owner_id=250 owner_generation=1 timestamp=0 io_timeout=10
[root@ovirt001 ~]# sanlock client host_status -D
[root@ovirt001 ~]#

Steve Dainard

On Wed, Feb 5, 2014 at 1:39 PM, Steve Dainard <sdainard@miovision.com> wrote:
On Wed, Feb 5, 2014 at 10:50 AM, Steve Dainard <sdainard@miovision.com> wrote:
On Tue, Feb 4, 2014 at 6:23 PM, Nir Soffer <nsoffer@redhat.com> wrote:
----- Original Message -----
From: "Steve Dainard" <sdainard@miovision.com> To: "Nir Soffer" <nsoffer@redhat.com> Cc: "Elad Ben Aharon" <ebenahar@redhat.com>, "users" <users@ovirt.org>, "Aharon Canan" <acanan@redhat.com> Sent: Tuesday, February 4, 2014 10:50:02 PM Subject: Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain
Happens every time I try to add a POSIX SD of type glusterfs.
Logs attached.
Hi Steve,
I'm afraid we need more history in the logs. I want to see the logs from the time the machine started until the time of the first error.
No problem. I've rebooted both hosts (the manager is installed on host ovirt001) and attached all the logs. Note that I put the wrong DNS name 'gluster-rr:/rep2' the first time I created the domain, hence the errors. The POSIX domain that was created is against 'gluster-store-vip:/rep2'.
Note only host ovirt002 is in the POSIX SD cluster.
Sorry this is wrong, it should be ovirt001 is the only host in the POSIX SD cluster.
I've also included the glusterfs log for rep2, with these errors:
[2014-02-05 15:36:28.246203] W [client-rpc-fops.c:873:client3_3_writev_cbk] 0-rep2-client-0: remote operation failed: Invalid argument
[2014-02-05 15:36:28.246418] W [client-rpc-fops.c:873:client3_3_writev_cbk] 0-rep2-client-1: remote operation failed: Invalid argument
[2014-02-05 15:36:28.246450] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 163: WRITE => -1 (Invalid argument)
We suspect that acquireHostId fails because someone else has acquired the same id. We may see evidence of this in the logs.
Also, can you send the output of this command on the host that fails, and if you have other hosts using the same storage, on some of those hosts as well.
sanlock client host_status -D
And finally, can you also attach /var/log/sanlock.log?
Thanks, Nir
Thanks, Steve

----- Original Message -----
From: "Steve Dainard" <sdainard@miovision.com> To: "Nir Soffer" <nsoffer@redhat.com>, "users" <users@ovirt.org> Sent: Friday, February 7, 2014 6:27:04 PM Subject: Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain
Hi Nir,
Do you need any more info from me? I missed the 'sanlock client host_status -D' request, info below:
Hi Steve,

It looks like your glusterfs mount is not writable by vdsm or sanlock. This can happen when the permissions or owner of the directory are not correct.

Can you try to mount the glusterfs volume manually, and share the output of ls -lh?

    sudo mkdir /tmp/gluster
    sudo mount -t glusterfs 10.0.10.2:/rep2 /tmp/gluster
    ls -lh /tmp/gluster/
    sudo umount /tmp/gluster

Thanks,
Nir
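If the listing had shown wrong ownership, the usual fix on the gluster side is the storage.owner-* volume options. A sketch, assuming the rep2 volume and the vdsm uid/gid of 36 that show up later in this thread:

    # check numeric ownership on the mounted volume (vdsm is uid 36, kvm is gid 36)
    ls -ln /tmp/gluster/
    # set ownership on the gluster side if it is wrong
    gluster volume set rep2 storage.owner-uid 36
    gluster volume set rep2 storage.owner-gid 36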

Hi Nir,

[root@ovirt001 storage]# mount -t glusterfs 10.0.10.2:/rep2 rep2-mount/
[root@ovirt001 storage]# ls -lh rep2-mount/
total 0
-rwxr-xr-x. 1 vdsm kvm  0 Feb  5 10:36 __DIRECT_IO_TEST__
drwxr-xr-x. 4 vdsm kvm 32 Feb  5 10:36 ff0e0521-a8fa-4c10-8372-7b67ac3fca31
[root@ovirt001 storage]# ls -lh
total 0
drwxr-xr-x. 4 vdsm kvm  91 Jan 30 17:34 iso-mount
drwxr-xr-x. 3 root root 23 Jan 30 17:31 lv-iso-domain
drwxr-xr-x. 3 vdsm kvm  35 Jan 29 17:43 lv-storage-domain
drwxr-xr-x. 3 vdsm kvm  17 Feb  4 15:43 lv-vm-domain
drwxr-xr-x. 4 vdsm kvm  91 Feb  5 10:36 rep2-mount

Steve Dainard

On Sat, Feb 8, 2014 at 5:28 PM, Nir Soffer <nsoffer@redhat.com> wrote:
----- Original Message -----
From: "Steve Dainard" <sdainard@miovision.com> To: "Nir Soffer" <nsoffer@redhat.com>, "users" <users@ovirt.org> Sent: Friday, February 7, 2014 6:27:04 PM Subject: Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain
Hi Nir,
Do you need any more info from me? I missed the 'sanlock client host_status -D' request, info below:
Hi Steve,
It looks like your glusterfs mount is not writable by vdsm or sanlock. This can happen when the permissions or owner of the directory are not correct.
Can you try to mount the glusterfs volume manually, and share the output of ls -lh?
    sudo mkdir /tmp/gluster
    sudo mount -t glusterfs 10.0.10.2:/rep2 /tmp/gluster
    ls -lh /tmp/gluster/
    sudo umount /tmp/gluster
Thanks, Nir

----- Original Message -----
From: "Steve Dainard" <sdainard@miovision.com> To: "Nir Soffer" <nsoffer@redhat.com> Cc: "users" <users@ovirt.org> Sent: Sunday, February 9, 2014 3:51:03 AM Subject: Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain
Hi Nir,
[root@ovirt001 storage]# mount -t glusterfs 10.0.10.2:/rep2 rep2-mount/
[root@ovirt001 storage]# ls -lh rep2-mount/
total 0
-rwxr-xr-x. 1 vdsm kvm  0 Feb  5 10:36 __DIRECT_IO_TEST__
drwxr-xr-x. 4 vdsm kvm 32 Feb  5 10:36 ff0e0521-a8fa-4c10-8372-7b67ac3fca31
[root@ovirt001 storage]# ls -lh
total 0
drwxr-xr-x. 4 vdsm kvm  91 Jan 30 17:34 iso-mount
drwxr-xr-x. 3 root root 23 Jan 30 17:31 lv-iso-domain
drwxr-xr-x. 3 vdsm kvm  35 Jan 29 17:43 lv-storage-domain
drwxr-xr-x. 3 vdsm kvm  17 Feb  4 15:43 lv-vm-domain
drwxr-xr-x. 4 vdsm kvm  91 Feb  5 10:36 rep2-mount
Looks good. Can you write into rep2-mount?

Please try:

    sudo -u vdsm dd if=/dev/zero of=rep2-mount/__test__ bs=1M count=1 oflag=direct
    rm rep2-mount/__test__

Thanks,
Nir

vdsm can write to it:

[root@ovirt001 rep2-mount]# sudo -u vdsm dd if=/dev/zero of=__test__ bs=1M count=1 oflag=direct
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.12691 s, 8.3 MB/s
[root@ovirt001 rep2-mount]# pwd
/mnt/storage/rep2-mount
[root@ovirt001 rep2-mount]# mount
/dev/md125p2 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/md125p1 on /boot type ext4 (rw)
/dev/mapper/gluster-storage--domain on /mnt/storage/lv-storage-domain type xfs (rw)
/dev/mapper/gluster-iso--domain on /mnt/storage/lv-iso-domain type xfs (rw)
/dev/mapper/gluster-vm--domain on /mnt/storage/lv-vm-domain type xfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
10.0.10.2:/iso-store on /mnt/storage/iso-mount type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
10.0.10.2:/rep2 on /mnt/storage/rep2-mount type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
[root@ovirt001 rep2-mount]# gluster volume info rep2

Volume Name: rep2
Type: Replicate
Volume ID: b89a21bb-5ad1-493f-b197-8f990ab3ba77
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.0.10.2:/mnt/storage/lv-vm-domain/rep2
Brick2: 10.0.10.3:/mnt/storage/lv-vm-domain/rep2
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
server.allow-insecure: on
cluster.quorum-type: auto

Thanks,
Steve Dainard

On Sun, Feb 9, 2014 at 1:30 AM, Nir Soffer <nsoffer@redhat.com> wrote:
----- Original Message -----
From: "Steve Dainard" <sdainard@miovision.com> To: "Nir Soffer" <nsoffer@redhat.com> Cc: "users" <users@ovirt.org> Sent: Sunday, February 9, 2014 3:51:03 AM Subject: Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain
Hi Nir,
[root@ovirt001 storage]# mount -t glusterfs 10.0.10.2:/rep2 rep2-mount/
[root@ovirt001 storage]# ls -lh rep2-mount/
total 0
-rwxr-xr-x. 1 vdsm kvm  0 Feb  5 10:36 __DIRECT_IO_TEST__
drwxr-xr-x. 4 vdsm kvm 32 Feb  5 10:36 ff0e0521-a8fa-4c10-8372-7b67ac3fca31
[root@ovirt001 storage]# ls -lh
total 0
drwxr-xr-x. 4 vdsm kvm  91 Jan 30 17:34 iso-mount
drwxr-xr-x. 3 root root 23 Jan 30 17:31 lv-iso-domain
drwxr-xr-x. 3 vdsm kvm  35 Jan 29 17:43 lv-storage-domain
drwxr-xr-x. 3 vdsm kvm  17 Feb  4 15:43 lv-vm-domain
drwxr-xr-x. 4 vdsm kvm  91 Feb  5 10:36 rep2-mount
Looks good.
Can you write into rep2-mount?
Please try:
    sudo -u vdsm dd if=/dev/zero of=rep2-mount/__test__ bs=1M count=1 oflag=direct
    rm rep2-mount/__test__
Thanks, Nir

----- Original Message -----
From: "Steve Dainard" <sdainard@miovision.com> To: "Nir Soffer" <nsoffer@redhat.com> Cc: "users" <users@ovirt.org> Sent: Sunday, February 9, 2014 5:39:14 PM Subject: Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain
vdsm can write to it:
Hi Steve,

There are two issues in your logs:

sanlock:

2014-02-05 09:56:24-0500 40 [5093]: sanlock daemon started 2.8 host 5d4fe362-e06e-40a8-b33a-f72dab971226.ovirt001.m
2014-02-05 10:06:39-0500 655 [5098]: s1 lockspace 020f3d8d-55f0-4552-8596-a9452227d4db:250:/rhev/data-center/mnt/10.0.10.2:_rep2/020f3d8d-55f0-4552-8596-a9452227d4db/dom_md/ids:0
2014-02-05 10:06:39-0500 655 [7358]: 020f3d8d aio collect 1 0x7f351c0008c0:0x7f351c0008d0:0x7f351c001000 result -22:0 match res
2014-02-05 10:06:39-0500 655 [7358]: write_sectors delta_leader offset 127488 rv -22 /rhev/data-center/mnt/10.0.10.2:_rep2/020f3d8d-55f0-4552-8596-a9452227d4db/dom_md/ids
2014-02-05 10:06:40-0500 656 [5098]: s1 add_lockspace fail result -22

To diagnose this error, we would like to have sanlock logs at debug level. Please do this:

1. Edit /etc/sysconfig/sanlock and add:

    # -L 7: use debug level logging to sanlock log file
    SANLOCKOPTS="$SANLOCKOPTS -L 7"

2. Reboot the host
3. Try to add the gluster domain again
4. Send again vdsm.log, engine.log, sanlock.log and the gluster log from the host with the issue.

glusterfs:

[2014-02-05 15:36:28.230837] I [afr-self-heal-data.c:655:afr_sh_data_fix] 0-rep2-replicate-0: no active sinks for performing self-heal on file /ff0e0521-a8fa-4c10-8372-7b67ac3fca31/dom_md/ids
[2014-02-05 15:36:28.246203] W [client-rpc-fops.c:873:client3_3_writev_cbk] 0-rep2-client-0: remote operation failed: Invalid argument
[2014-02-05 15:36:28.246418] W [client-rpc-fops.c:873:client3_3_writev_cbk] 0-rep2-client-1: remote operation failed: Invalid argument
[2014-02-05 15:36:28.246450] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 163: WRITE => -1 (Invalid argument)

For this error, it would be best to consult with the glusterfs folks.

Thanks,
Nir
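One detail worth noting when comparing this with the dd test above: the 1M direct write succeeded, while sanlock's write_sectors fails with rv -22 (EINVAL). sanlock writes its delta leases as small sector-sized O_DIRECT writes, so a closer reproduction would use a small block size. A sketch, with sizes chosen for illustration rather than taken from the logs:

    # mimic sanlock's small O_DIRECT writes instead of 1M blocks
    sudo -u vdsm dd if=/dev/zero of=rep2-mount/__test512__ bs=512 count=1 oflag=direct
    rm -f rep2-mount/__test512__

If this fails with 'Invalid argument' while the 1M write works, the fuse mount is rejecting small direct I/O, which would line up with both the sanlock -22 and the gluster 'WRITE => -1 (Invalid argument)' entries.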

Enabled logging, logs attached.

Thanks,
Steve Dainard

On Tue, Feb 11, 2014 at 7:39 AM, Nir Soffer <nsoffer@redhat.com> wrote:
----- Original Message -----
From: "Steve Dainard" <sdainard@miovision.com> To: "Nir Soffer" <nsoffer@redhat.com> Cc: "users" <users@ovirt.org> Sent: Sunday, February 9, 2014 5:39:14 PM Subject: Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain
vdsm can write to it:
Hi Steve,
There are two issues in your logs:
sanlock:
2014-02-05 09:56:24-0500 40 [5093]: sanlock daemon started 2.8 host 5d4fe362-e06e-40a8-b33a-f72dab971226.ovirt001.m
2014-02-05 10:06:39-0500 655 [5098]: s1 lockspace 020f3d8d-55f0-4552-8596-a9452227d4db:250:/rhev/data-center/mnt/10.0.10.2:_rep2/020f3d8d-55f0-4552-8596-a9452227d4db/dom_md/ids:0
2014-02-05 10:06:39-0500 655 [7358]: 020f3d8d aio collect 1 0x7f351c0008c0:0x7f351c0008d0:0x7f351c001000 result -22:0 match res
2014-02-05 10:06:39-0500 655 [7358]: write_sectors delta_leader offset 127488 rv -22 /rhev/data-center/mnt/10.0.10.2:_rep2/020f3d8d-55f0-4552-8596-a9452227d4db/dom_md/ids
2014-02-05 10:06:40-0500 656 [5098]: s1 add_lockspace fail result -22
To diagnose this error, we would like to have sanlock logs at debug level.
Please do this:
1. Edit /etc/sysconfig/sanlock and add:
    # -L 7: use debug level logging to sanlock log file
    SANLOCKOPTS="$SANLOCKOPTS -L 7"
2. Reboot the host
3. Try to add the gluster domain again
4. Send again vdsm.log, engine.log, sanlock.log and the gluster log from the host with the issue.
glusterfs:
[2014-02-05 15:36:28.230837] I [afr-self-heal-data.c:655:afr_sh_data_fix] 0-rep2-replicate-0: no active sinks for performing self-heal on file /ff0e0521-a8fa-4c10-8372-7b67ac3fca31/dom_md/ids
[2014-02-05 15:36:28.246203] W [client-rpc-fops.c:873:client3_3_writev_cbk] 0-rep2-client-0: remote operation failed: Invalid argument
[2014-02-05 15:36:28.246418] W [client-rpc-fops.c:873:client3_3_writev_cbk] 0-rep2-client-1: remote operation failed: Invalid argument
[2014-02-05 15:36:28.246450] W [fuse-bridge.c:2167:fuse_writev_cbk] 0-glusterfs-fuse: 163: WRITE => -1 (Invalid argument)
For this error, it would be best to consult with the glusterfs folks.
Thanks, Nir
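For the heal side of this, gluster's CLI can report which files are pending or failed self-heal; a sketch against the rep2 volume from this thread (the heal-failed subcommand exists only in older gluster releases):

    # list files currently pending self-heal
    gluster volume heal rep2 info
    # list entries gluster could not heal (older releases only)
    gluster volume heal rep2 info heal-failed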

----- Original Message -----
From: "Steve Dainard" <sdainard@miovision.com> To: "Nir Soffer" <nsoffer@redhat.com> Cc: "users" <users@ovirt.org> Sent: Tuesday, February 11, 2014 7:42:37 PM Subject: Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain
Enabled logging, logs attached.
According to the sanlock and gluster logs:

1. On the host, sanlock is failing to write to the ids volume.
2. On the gluster side, we see a failure to heal the ids file.

This looks like a glusterfs issue, and should be handled by the glusterfs folks.

You should probably configure the sanlock log level back to the default by commenting out the configuration I suggested in the previous mail.

According to the gluster configuration in this log, this looks like 2 replicas with auto quorum. This setup is not recommended, because both machines must be up all the time; when one machine is down, your entire storage is down.

Check this post explaining the issue:
http://lists.ovirt.org/pipermail/users/2014-February/021541.html

Thanks,
Nir
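To make the quorum tradeoff concrete: client-side quorum is a per-volume setting. A sketch, using the rep2 volume from this thread (exact quorum behaviour varies by gluster version):

    # the current value shows under 'Options Reconfigured' in 'gluster volume info rep2'
    # favour consistency: with auto quorum on 2 replicas, losing a brick can make the volume unwritable
    gluster volume set rep2 cluster.quorum-type auto
    # favour availability: the volume stays writable with one brick down, at the risk of split-brain
    gluster volume set rep2 cluster.quorum-type none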

Hi Nir,

I have a thread open on the gluster side about heal-failed operations, so I'll wait for a response there.

Agreed on two-node quorum; I'm waiting for a 3rd node right now :) In the meantime, for anyone who reads this thread: if you only have 2 storage nodes, you have to weigh the risks of 2 nodes with quorum ensuring storage consistency against 2 nodes without quorum giving you an extra shot at uptime.

Steve Dainard

On Wed, Feb 19, 2014 at 4:13 AM, Nir Soffer <nsoffer@redhat.com> wrote:
----- Original Message -----
From: "Steve Dainard" <sdainard@miovision.com> To: "Nir Soffer" <nsoffer@redhat.com> Cc: "users" <users@ovirt.org> Sent: Tuesday, February 11, 2014 7:42:37 PM Subject: Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain
Enabled logging, logs attached.
According to the sanlock and gluster logs:
1. On the host, sanlock is failing to write to the ids volume.
2. On the gluster side, we see a failure to heal the ids file.
This looks like a glusterfs issue, and should be handled by the glusterfs folks.
You should probably configure the sanlock log level back to the default by commenting out the configuration I suggested in the previous mail.
According to the gluster configuration in this log, this looks like 2 replicas with auto quorum. This setup is not recommended, because both machines must be up all the time; when one machine is down, your entire storage is down.
Check this post explaining this issue: http://lists.ovirt.org/pipermail/users/2014-February/021541.html
Thanks, Nir
participants (2): Nir Soffer, Steve Dainard