Re: [ovirt-users] Problems with some vms

One brick was down for a while pending replacement. It has been replaced and all volumes are up.

Status of volume: data
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ovirt0:/gluster/brick3/data           49152     0          Y       22467
Brick ovirt2:/gluster/brick3/data           49152     0          Y       20736
Brick ovirt3:/gluster/brick3/data           49152     0          Y       23148
Brick ovirt0:/gluster/brick4/data           49153     0          Y       22497
Brick ovirt2:/gluster/brick4/data           49153     0          Y       20742
Brick ovirt3:/gluster/brick4/data           49153     0          Y       23158
Brick ovirt0:/gluster/brick5/data           49154     0          Y       22473
Brick ovirt2:/gluster/brick5/data           49154     0          Y       20748
Brick ovirt3:/gluster/brick5/data           49154     0          Y       23156
Brick ovirt0:/gluster/brick6/data           49155     0          Y       22479
Brick ovirt2:/gluster/brick6_1/data         49161     0          Y       21203
Brick ovirt3:/gluster/brick6/data           49155     0          Y       23157
Brick ovirt0:/gluster/brick7/data           49156     0          Y       22485
Brick ovirt2:/gluster/brick7/data           49156     0          Y       20763
Brick ovirt3:/gluster/brick7/data           49156     0          Y       23155
Brick ovirt0:/gluster/brick8/data           49157     0          Y       22491
Brick ovirt2:/gluster/brick8/data           49157     0          Y       20771
Brick ovirt3:/gluster/brick8/data           49157     0          Y       23154
Self-heal Daemon on localhost               N/A       N/A        Y       23238
Bitrot Daemon on localhost                  N/A       N/A        Y       24870
Scrubber Daemon on localhost                N/A       N/A        Y       24889
Self-heal Daemon on ovirt2                  N/A       N/A        Y       24271
Bitrot Daemon on ovirt2                     N/A       N/A        Y       24856
Scrubber Daemon on ovirt2                   N/A       N/A        Y       24866
Self-heal Daemon on ovirt0                  N/A       N/A        Y       29409
Bitrot Daemon on ovirt0                     N/A       N/A        Y       5457
Scrubber Daemon on ovirt0                   N/A       N/A        Y       5468

Task Status of Volume data
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: engine
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ovirt0:/gluster/brick1/engine         49158     0          Y       22511
Brick ovirt2:/gluster/brick1/engine         49158     0          Y       20780
Brick ovirt3:/gluster/brick1/engine         49158     0          Y       23199
Self-heal Daemon on localhost               N/A       N/A        Y       23238
Self-heal Daemon on ovirt0                  N/A       N/A        Y       29409
Self-heal Daemon on ovirt2                  N/A       N/A        Y       24271

Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: iso
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ovirt0:/gluster/brick2/iso            49159     0          Y       22520
Brick ovirt2:/gluster/brick2/iso            49159     0          Y       20789
Brick ovirt3:/gluster/brick2/iso            49159     0          Y       23208
NFS Server on localhost                     N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       23238
NFS Server on ovirt2                        N/A       N/A        N       N/A
Self-heal Daemon on ovirt2                  N/A       N/A        Y       24271
NFS Server on ovirt0                        N/A       N/A        N       N/A
Self-heal Daemon on ovirt0                  N/A       N/A        Y       29409

Task Status of Volume iso
------------------------------------------------------------------------------
There are no active volume tasks
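To confirm the replaced brick has fully caught up, the pending-heal counters can be checked per volume. A minimal sketch, using the volume names from the output above:

    # list any entries still pending heal on each volume
    gluster volume heal data info
    gluster volume heal engine info
    gluster volume heal iso info

    # or just the per-brick heal counts for the data volume
    gluster volume heal data statistics heal-count

All counts should be zero once the resync is complete.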
2018-01-17 8:13 GMT+01:00 Gobinda Das <godas@redhat.com>:

Hi,
I can see some errors in the log:

[2018-01-14 11:19:49.886571] E [socket.c:2309:socket_connect_finish] 0-engine-client-0: connection to 10.2.0.120:24007 failed (Connection timed out)
[2018-01-14 11:20:05.630669] E [socket.c:2309:socket_connect_finish] 0-engine-client-0: connection to 10.2.0.120:24007 failed (Connection timed out)
[2018-01-14 12:01:09.089925] E [MSGID: 114058] [client-handshake.c:1527:client_query_portmap_cbk] 0-engine-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2018-01-14 12:01:09.090048] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 0-engine-client-0: disconnected from engine-client-0. Client process will keep trying to connect to glusterd until brick's port is available
Can you please check gluster volume status and see if all bricks are up?
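The "Connection timed out" errors to 10.2.0.120:24007 point at glusterd itself being unreachable from the client, not just a single brick. A minimal check from both sides, assuming the IP from the log above (nc is provided by the nmap-ncat package on CentOS 7):

    # from the client: can we reach glusterd's management port?
    nc -zv 10.2.0.120 24007

    # on the host that owns 10.2.0.120: is glusterd up and listening?
    systemctl status glusterd
    ss -tlnp | grep 24007

If the port is reachable but clients still fail, 'gluster volume status' shows which brick processes are down.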
On Wed, Jan 17, 2018 at 12:24 PM, Endre Karlson <endre.karlson@gmail.com> wrote:
It's there now for each of the hosts. ovirt1 is not in service yet.
2018-01-17 5:52 GMT+01:00 Gobinda Das <godas@redhat.com>:
At the above URL only the data and iso mount logs are present; there are no engine or vmstore mount logs.
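A quick way to see which mount logs actually exist on a host (one log per mounted storage domain, using the path format from the earlier message below):

    # list all gluster mount logs present on this host
    ls -l /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*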
On Wed, Jan 17, 2018 at 1:26 AM, Endre Karlson <endre.karlson@gmail.com> wrote:
Hi, all logs for the mounts are located here: https://www.dropbox.com/sh/3qzmwe76rkt09fk/AABzM9rJKbH5SBPWc31Npxhma?dl=0
Additionally, we replaced a broken disk that has now resynced.
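For the archive: judging by the brick6_1 path in the volume status, the swap was presumably done with replace-brick. A sketch of that sequence; the brick paths here are inferred from the names above, not the exact commands that were run:

    # point the volume at the new brick; 'commit force' is the only
    # supported mode in recent gluster releases
    gluster volume replace-brick data \
        ovirt2:/gluster/brick6/data ovirt2:/gluster/brick6_1/data \
        commit force

    # then watch self-heal bring the new brick up to date
    gluster volume heal data info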
2018-01-15 11:17 GMT+01:00 Gobinda Das <godas@redhat.com>:
Hi Endre,
Mount logs will be in the format below, inside /var/log/glusterfs:
/var/log/glusterfs/rhev-data-center-mnt-glusterSD-*\:_engine.log
/var/log/glusterfs/rhev-data-center-mnt-glusterSD-*\:_data.log
/var/log/glusterfs/rhev-data-center-mnt-glusterSD-*\:_vmstore.log
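Once located, the error-level lines can be pulled out of a mount log directly; the glob below covers the hostname part of the file name, which varies per mount:

    # last 20 error ("E") lines from the engine mount log
    grep ' E ' /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*:_engine.log | tail -n 20

    # rough per-file error counts across all mount logs
    grep -c ' E ' /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*.log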
On Mon, Jan 15, 2018 at 11:57 AM, Endre Karlson < endre.karlson@gmail.com> wrote:
Hi.
What are the gluster mount logs?
I have these gluster logs:

cli.log
etc-glusterfs-glusterd.vol.log
glfsheal-engine.log
glusterd.log
nfs.log
rhev-data-center-mnt-glusterSD-ovirt0:_engine.log
rhev-data-center-mnt-glusterSD-ovirt3:_iso.log
cmd_history.log
glfsheal-data.log
glfsheal-iso.log
glustershd.log
rhev-data-center-mnt-glusterSD-ovirt0:_data.log
rhev-data-center-mnt-glusterSD-ovirt0:_iso.log
statedump.log
I am running these versions:

glusterfs-server-3.12.4-1.el7.x86_64
glusterfs-geo-replication-3.12.4-1.el7.x86_64
libvirt-daemon-driver-storage-gluster-3.2.0-14.el7_4.7.x86_64
glusterfs-libs-3.12.4-1.el7.x86_64
glusterfs-api-3.12.4-1.el7.x86_64
python2-gluster-3.12.4-1.el7.x86_64
glusterfs-client-xlators-3.12.4-1.el7.x86_64
glusterfs-cli-3.12.4-1.el7.x86_64
glusterfs-events-3.12.4-1.el7.x86_64
glusterfs-rdma-3.12.4-1.el7.x86_64
vdsm-gluster-4.20.9.3-1.el7.centos.noarch
glusterfs-3.12.4-1.el7.x86_64
glusterfs-fuse-3.12.4-1.el7.x86_64
// Endre
2018-01-15 6:11 GMT+01:00 Gobinda Das <godas@redhat.com>:
> Hi Endre,
> Can you please provide glusterfs mount logs?
>
> On Mon, Jan 15, 2018 at 6:16 AM, Darrell Budic <budic@onholyground.com> wrote:
>
>> What version of gluster are you running? I’ve seen a few of these
>> since moving my storage cluster to 12.3, but still haven’t been able
>> to determine what’s causing it. It seems to be happening most often
>> on VMs that haven’t been switched over to libgfapi mounts yet, but
>> even one of those has paused once so far. They generally restart fine
>> from the GUI, and nothing seems to need healing.
>>
>> From: Endre Karlson <endre.karlson@gmail.com>
>> Subject: [ovirt-users] Problems with some vms
>> Date: January 14, 2018 at 12:55:45 PM CST
>> To: users
>>
>> Hi, we are getting some errors with some of our vms in a 3 node
>> server setup.
>>
>> 2018-01-14 15:01:44,015+0100 INFO (libvirt/events) [virt.vm]
>> (vmId='2c34f52d-140b-4dbe-a4bd-d2cb467b0b7c') abnormal vm stop
>> device virtio-disk0 error eother (vm:4880)
>>
>> We are running glusterfs for shared storage.
>>
>> I have tried setting global maintenance on the first server and
>> then issuing a 'hosted-engine --vm-start', but that leads to nowhere.
--
Thanks,
Gobinda
+91-9019047912
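On the 'hosted-engine --vm-start' dead end from the original post: a minimal sequence for starting the engine VM under global maintenance, plus where the HA agent explains a failed start. These are standard oVirt commands and log paths, not taken from this cluster:

    # put the cluster in global maintenance so the HA agents stand back
    hosted-engine --set-maintenance --mode=global

    # check engine VM state and host scores
    hosted-engine --vm-status

    # try to start the engine VM on this host
    hosted-engine --vm-start

    # if it goes nowhere, the agent/broker logs usually say why
    tail -f /var/log/ovirt-hosted-engine-ha/agent.log \
            /var/log/ovirt-hosted-engine-ha/broker.log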

Does anyone have any ideas on this?