Hi,
Sorry for my delayed response; I've been away on holiday.
Around the time of the frozen VMs I see a huge number of "Transport endpoint
is not connected" messages in the
'rhev-data-center-mnt-glusterSD-vmhost1.local:_vmstore1.log-20160711' log file,
for example:
[2016-07-06 08:13:08.297203] E [MSGID: 114031] [client-rpc-fops.c:972:client3_3_flush_cbk]
0-vmstore1-client-2: remote operation failed [Transport endpoint is not connected]
[2016-07-06 08:13:08.298066] W [MSGID: 114031]
[client-rpc-fops.c:845:client3_3_statfs_cbk] 0-vmstore1-client-2: remote operation failed
[Transport endpoint is not connected]
[2016-07-06 08:13:08.299478] W [MSGID: 114031]
[client-rpc-fops.c:2974:client3_3_lookup_cbk] 0-vmstore1-client-2: remote operation
failed. Path: /9056e6a8-105f-4c63-bfc1-848f674a942a/images
(96ff2aa1-6ade-48e5-933d-d18680c29913) [Transport endpoint is not connected]
[2016-07-06 08:13:08.300435] W [MSGID: 114031]
[client-rpc-fops.c:2974:client3_3_lookup_cbk] 0-vmstore1-client-2: remote operation
failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Transport endpoint is not
connected]
[2016-07-06 08:13:08.300939] E [MSGID: 114031]
[client-rpc-fops.c:1730:client3_3_entrylk_cbk] 0-vmstore1-client-2: remote operation
failed [Transport endpoint is not connected]
[2016-07-06 08:13:08.301351] W [MSGID: 114031]
[client-rpc-fops.c:2974:client3_3_lookup_cbk] 0-vmstore1-client-2: remote operation
failed. Path: <gfid:e38729d9-320e-4230-acab-f363cf48089e>
(e38729d9-320e-4230-acab-f363cf48089e) [Transport endpoint is not connected]
[2016-07-06 08:13:08.303396] W [MSGID: 114031]
[client-rpc-fops.c:2974:client3_3_lookup_cbk] 0-vmstore1-client-2: remote operation
failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Transport endpoint is not
connected]
[2016-07-06 08:13:08.303954] E [MSGID: 114031]
[client-rpc-fops.c:2886:client3_3_opendir_cbk] 0-vmstore1-client-2: remote operation
failed. Path: /9056e6a8-105f-4c63-bfc1-848f674a942a/images
(96ff2aa1-6ade-48e5-933d-d18680c29913) [Transport endpoint is not connected]
[2016-07-06 08:13:08.306293] W [MSGID: 114031]
[client-rpc-fops.c:845:client3_3_statfs_cbk] 0-vmstore1-client-2: remote operation failed
[Transport endpoint is not connected]
[2016-07-06 08:13:08.314599] W [MSGID: 114031]
[client-rpc-fops.c:845:client3_3_statfs_cbk] 0-vmstore1-client-2: remote operation failed
[Transport endpoint is not connected]
(repeated thousands of times)
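To get a feel for how many of these there really were, and for which client, the
messages can be counted per minute straight from the log file. A minimal sketch in
Python (the filename is the one mentioned above; it assumes the usual
one-entry-per-line gluster client log format):

import re
from collections import Counter

LOG = "rhev-data-center-mnt-glusterSD-vmhost1.local:_vmstore1.log-20160711"

# Gluster client log entries start with "[YYYY-MM-DD HH:MM:SS.micros]" and
# name the client translator, e.g. "0-vmstore1-client-2".
entry = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2})"    # date + hour:minute
    r":\d{2}\.\d+\].*?"
    r"(0-\S+-client-\d+).*"
    r"Transport endpoint is not connected"
)

counts = Counter()
with open(LOG, errors="replace") as f:
    for line in f:
        m = entry.search(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1   # (minute, client) -> hits

for (minute, client), n in sorted(counts.items()):
    print(minute, client, n)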
Together with the messages above, I also see:
[2016-07-06 08:16:49.233677] W [socket.c:589:__socket_rwv] 0-glusterfs: readv on
10.0.0.153:24007 failed (No data available)
[2016-07-06 08:16:50.326447] W [socket.c:589:__socket_rwv] 0-vmstore1-client-0: readv on
10.0.0.153:49152 failed (No data available)
[2016-07-06 08:16:50.326492] I [MSGID: 114018] [client.c:2030:client_rpc_notify]
0-vmstore1-client-0: disconnected from vmstore1-client-0. Client process will keep trying
to connect to glusterd until brick's port is available
[2016-07-06 08:16:51.523632] W [fuse-bridge.c:2302:fuse_writev_cbk] 0-glusterfs-fuse:
33314276: WRITE => -1 (Read-only file system)
[2016-07-06 08:16:51.523890] W [MSGID: 114061] [client-rpc-fops.c:4449:client3_3_flush]
0-vmstore1-client-2: (54da8812-7f5e-48e1-87a8-4f7f17e44918) remote_fd is -1. EBADFD [File
descriptor in bad state]
[2016-07-06 08:16:59.575848] E [socket.c:2279:socket_connect_finish] 0-glusterfs:
connection to 10.0.0.153:24007 failed (Connection refused)
[2016-07-06 08:17:00.578271] E [socket.c:2279:socket_connect_finish] 0-vmstore1-client-0:
connection to 10.0.0.153:24007 failed (Connection refused)
[2016-07-06 08:20:42.236021] I [fuse-bridge.c:4997:fuse_thread_proc] 0-fuse: unmounting
/rhev/data-center/mnt/glusterSD/vmhost1.local:_vmstore1
[2016-07-06 08:20:42.236230] W [glusterfsd.c:1251:cleanup_and_exit]
(-->/lib64/libpthread.so.0(+0x7dc5) [0x7f94b8ba1dc5]
-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7f94ba20c8b5]
-->/usr/sbin/glusterfs(cleanup_and_exit+0x69) [0x7f94ba20c739] ) 0-: received signum
(15), shutting down
[2016-07-06 08:20:42.236271] I [fuse-bridge.c:5704:fini] 0-fuse: Unmounting
'/rhev/data-center/mnt/glusterSD/vmhost1.local:_vmstore1'.
[2016-07-06 08:30:09.594553] I [MSGID: 100030] [glusterfsd.c:2338:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.12 (args:
/usr/sbin/glusterfs --volfile-server=vmhost1.local --volfile-server=10.0.0.153
--volfile-server=10.0.0.160 --volfile-server=10.0.0.198 --volfile-id=/vmstore1
/rhev/data-center/mnt/glusterSD/vmhost1.local:_vmstore1)
After the last line ("Started running /usr/sbin/glusterfs version 3.7.12"),
everything went back to normal.
(I have replaced the hostnames and IP addresses.)
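For what it's worth, the length of the outage can be read off from the same log:
it runs from the first "Transport endpoint is not connected" message to the remount
("Started running /usr/sbin/glusterfs ..."). A rough sketch, again assuming the
filename and timestamp format shown above:

from datetime import datetime

LOG = "rhev-data-center-mnt-glusterSD-vmhost1.local:_vmstore1.log-20160711"

def ts(line):
    # Entries start with "[YYYY-MM-DD HH:MM:SS.microseconds] ...".
    return datetime.strptime(line[1:line.index("]")], "%Y-%m-%d %H:%M:%S.%f")

first_error = remount = None
with open(LOG, errors="replace") as f:
    for line in f:
        if first_error is None and "Transport endpoint is not connected" in line:
            first_error = ts(line)
        if "Started running /usr/sbin/glusterfs" in line:
            remount = ts(line)   # keep the last remount seen

if first_error and remount:
    print("first error:", first_error)
    print("remount    :", remount)
    print("outage     :", remount - first_error)

With the excerpts above, that works out to roughly 17 minutes (08:13 to 08:30).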
Is this the information you need?
Thanks in advance!
Regards,
Bertjan
On Mon, Jul 11, 2016 at 01:49:19PM +0530, Sahina Bose wrote:
Did you see any errors in the gluster mount logs during the time when the VMs were
frozen (I assume the I/O was not responding during this time)? There have been bugs
fixed in 3.7.12 around concurrent I/O on gluster volumes and VMs pausing - the mount
logs can tell us if you ran into a similar issue.
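(Side note: the FUSE mount log being asked about here lives under /var/log/glusterfs/,
with the mount path encoded in the filename - leading slash dropped, remaining slashes
turned into hyphens - which is how the 'rhev-data-center-mnt-glusterSD-...' name above
is formed. The helper below just illustrates that mapping; it is a sketch, not a
gluster tool.)

import os

def expected_mount_log(mount_point, log_dir="/var/log/glusterfs"):
    # Derive the client log name from the mount point, following the naming
    # seen above: strip the leading "/", replace "/" with "-", append ".log".
    return os.path.join(log_dir, mount_point.strip("/").replace("/", "-") + ".log")

print(expected_mount_log("/rhev/data-center/mnt/glusterSD/vmhost1.local:_vmstore1"))
# -> /var/log/glusterfs/rhev-data-center-mnt-glusterSD-vmhost1.local:_vmstore1.log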
On 07/08/2016 03:58 PM, bertjan wrote:
> Hi Michal,
>
> That's right. I put it in maintenance mode, so there were no VMs.
>
> The frozen VMs were on the other hosts. That's what makes it strange and
> why it doesn't give me a good feeling. If someone could say 'I know the
> issue and it is fixed in gluster version 3.7.12', I would feel more reassured
> about it...
>
> Regards,
>
> Bertjan
>
> On Fri, Jul 08, 2016 at 12:23:21PM +0200, Michal Skrivanek wrote:
> > > On 08 Jul 2016, at 12:06, bertjan <b.j.goorkate(a)umcutrecht.nl> wrote:
> > >
> > > Hi,
> > >
> > > I have a 3-node CentOS7-based oVirt + replica-3 gluster environment with an engine on dedicated hardware.
> > >
> > > After putting the first vm-host into maintenance mode to update it from vdsm-4.17.28-1 to vdsm-4.17.32-0 and from glusterfs-3.7.11-1 to glusterfs-3.7.12-2 (among others), random VMs froze (not paused; oVirt showed them as 'up') until the update was done and the vm-host was rebooted and active again.
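(A sanity check that is often suggested before taking a replica-3 node into
maintenance - and before moving on to the next one - is to confirm that all bricks
are up and that self-heal has nothing left to do. A rough sketch of automating the
heal check, assuming the volume name 'vmstore1' from the logs above and the
plain-text output of 'gluster volume heal <vol> info':)

import subprocess

VOLUME = "vmstore1"   # volume name taken from the log excerpts above

def pending_heals(volume):
    # Sum the "Number of entries:" counters that
    # "gluster volume heal <volume> info" prints per brick.
    out = subprocess.run(
        ["gluster", "volume", "heal", volume, "info"],
        capture_output=True, text=True, check=True,
    ).stdout
    total = 0
    for line in out.splitlines():
        if line.startswith("Number of entries:"):
            value = line.split(":", 1)[1].strip()
            if value.isdigit():          # a down brick may not report a number
                total += int(value)
    return total

if __name__ == "__main__":
    print(pending_heals(VOLUME), "entries still pending heal on", VOLUME)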
> > I suppose the host you were updating at that time had no running VMs, right?
> > If so, then indeed perhaps a gluster issue
> >
> > > After all the vm-hosts were upgraded, I never experienced the problem again.
> > > Could this be a bug that was fixed by the upgrade to glusterfs-3.7.12-2?
> > >
> > > Has anyone experienced the same problem?
> > >
> > > Thanks in advance! (Next week I will not be able to check my e-mail, so my response may be delayed.)
> > >
> > > Regards,
> > >
> > > Bertjan
> > >
> > >
_______________________________________________
Users mailing list
Users(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/users