I think this is the profile information for one of the volumes that lives
on the SSDs and is fully operational with no down/problem disks:
[root@ovirt2 yum.repos.d]# gluster volume profile data info
Brick: ovirt2.nwfiber.com:/gluster/brick2/data
----------------------------------------------
Cumulative Stats:
   Block Size:          256b+           512b+          1024b+
 No. of Reads:            983            2696            1059
No. of Writes:              0            1113             302

   Block Size:         2048b+          4096b+          8192b+
 No. of Reads:            852           88608           53526
No. of Writes:            522          812340           76257

   Block Size:        16384b+         32768b+         65536b+
 No. of Reads:          54351          241901           15024
No. of Writes:          21636            8656            8976

   Block Size:       131072b+
 No. of Reads:         524156
No. of Writes:         296071
 %-latency   Avg-latency     Min-Latency    Max-Latency        No. of calls   Fop
 ---------   -----------     -----------    -----------        ------------   ---------
 0.00        0.00 us         0.00 us        0.00 us            4189           RELEASE
 0.00        0.00 us         0.00 us        0.00 us            1257           RELEASEDIR
 0.00        46.19 us        12.00 us       187.00 us          69             FLUSH
 0.00        147.00 us       78.00 us       367.00 us          86             REMOVEXATTR
 0.00        223.46 us       24.00 us       1166.00 us         149            READDIR
 0.00        565.34 us       76.00 us       3639.00 us         88             FTRUNCATE
 0.00        263.28 us       20.00 us       28385.00 us        228            LK
 0.00        98.84 us        2.00 us        880.00 us          1198           OPENDIR
 0.00        91.59 us        26.00 us       10371.00 us        3853           STATFS
 0.00        494.14 us       17.00 us       193439.00 us       1171           GETXATTR
 0.00        299.42 us       35.00 us       9799.00 us         2044           READDIRP
 0.00        1965.31 us      110.00 us      382258.00 us       321            XATTROP
 0.01        113.40 us       24.00 us       61061.00 us        8134           STAT
 0.01        755.38 us       57.00 us       607603.00 us       3196           DISCARD
 0.05        2690.09 us      58.00 us       2704761.00 us      3206           OPEN
 0.10        119978.25 us    97.00 us       9406684.00 us      154            SETATTR
 0.18        101.73 us       28.00 us       700477.00 us       313379         FSTAT
 0.23        1059.84 us      25.00 us       2716124.00 us      38255          LOOKUP
 0.47        1024.11 us      54.00 us       6197164.00 us      81455          FXATTROP
 1.72        2984.00 us      15.00 us       37098954.00 us     103020         FINODELK
 5.92        44315.32 us     51.00 us       24731536.00 us     23957          FSYNC
 13.27       2399.78 us      25.00 us       22089540.00 us     991005         READ
 37.00       5980.43 us      52.00 us       22099889.00 us     1108976        WRITE
 41.04       5452.75 us      13.00 us       22102452.00 us     1349053        INODELK
Duration: 10026 seconds
Data Read: 80046027759 bytes
Data Written: 44496632320 bytes
Interval 1 Stats:
   Block Size:          256b+           512b+          1024b+
 No. of Reads:            983            2696            1059
No. of Writes:              0             838             185

   Block Size:         2048b+          4096b+          8192b+
 No. of Reads:            852           85856           51575
No. of Writes:            382          705802           57812

   Block Size:        16384b+         32768b+         65536b+
 No. of Reads:          52673          232093           14984
No. of Writes:          13499            4908            4242

   Block Size:       131072b+
 No. of Reads:         460040
No. of Writes:           6411
 %-latency   Avg-latency     Min-Latency    Max-Latency        No. of calls   Fop
 ---------   -----------     -----------    -----------        ------------   ---------
 0.00        0.00 us         0.00 us        0.00 us            2093           RELEASE
 0.00        0.00 us         0.00 us        0.00 us            1093           RELEASEDIR
 0.00        53.38 us        26.00 us       111.00 us          16             FLUSH
 0.00        145.14 us       78.00 us       367.00 us          71             REMOVEXATTR
 0.00        190.96 us       114.00 us      298.00 us          71             SETATTR
 0.00        213.38 us       24.00 us       1145.00 us         90             READDIR
 0.00        263.28 us       20.00 us       28385.00 us        228            LK
 0.00        101.76 us       2.00 us        880.00 us          1093           OPENDIR
 0.01        93.60 us        27.00 us       10371.00 us        3090           STATFS
 0.02        537.47 us       17.00 us       193439.00 us       1038           GETXATTR
 0.03        297.44 us       35.00 us       9799.00 us         1990           READDIRP
 0.03        2357.28 us      110.00 us      382258.00 us       253            XATTROP
 0.04        385.93 us       58.00 us       47593.00 us        2091           OPEN
 0.04        114.86 us       24.00 us       61061.00 us        7715           STAT
 0.06        444.59 us       57.00 us       333240.00 us       3053           DISCARD
 0.42        316.24 us       25.00 us       290728.00 us       29823          LOOKUP
 0.73        257.92 us       54.00 us       344812.00 us       63296          FXATTROP
 1.37        98.30 us        28.00 us       67621.00 us        313172         FSTAT
 1.58        2124.69 us      51.00 us       849200.00 us       16717          FSYNC
 5.73        162.46 us       52.00 us       748492.00 us       794079         WRITE
 7.19        2065.17 us      16.00 us       37098954.00 us     78381          FINODELK
 36.44       886.32 us       25.00 us       2216436.00 us      925421         READ
 46.30       1178.04 us      13.00 us       1700704.00 us      884635         INODELK
Duration: 7485 seconds
Data Read: 71250527215 bytes
Data Written: 5119903744 bytes
Brick: ovirt3.nwfiber.com:/gluster/brick2/data
----------------------------------------------
Cumulative Stats:
Block Size: 1b+
No. of Reads: 0
No. of Writes: 3264419
 %-latency   Avg-latency     Min-Latency    Max-Latency        No. of calls   Fop
 ---------   -----------     -----------    -----------        ------------   ---------
 0.00        0.00 us         0.00 us        0.00 us            90             FORGET
 0.00        0.00 us         0.00 us        0.00 us            9462           RELEASE
 0.00        0.00 us         0.00 us        0.00 us            4254           RELEASEDIR
 0.00        50.52 us        13.00 us       190.00 us          71             FLUSH
 0.00        186.97 us       87.00 us       713.00 us          86             REMOVEXATTR
 0.00        79.32 us        33.00 us       189.00 us          228            LK
 0.00        220.98 us       129.00 us      513.00 us          86             SETATTR
 0.01        259.30 us       26.00 us       2632.00 us         137            READDIR
 0.02        322.76 us       145.00 us      2125.00 us         321            XATTROP
 0.03        109.55 us       2.00 us        1258.00 us         1193           OPENDIR
 0.05        70.21 us        21.00 us       431.00 us          3196           DISCARD
 0.05        169.26 us       21.00 us       2315.00 us         1545           GETXATTR
 0.12        176.85 us       63.00 us       2844.00 us         3206           OPEN
 0.61        303.49 us       90.00 us       3085.00 us         9633           FSTAT
 2.44        305.66 us       28.00 us       3716.00 us         38230          LOOKUP
 4.52        266.22 us       55.00 us       53424.00 us        81455          FXATTROP
 6.96        1397.99 us      51.00 us       64822.00 us        23889          FSYNC
 16.48       84.74 us        25.00 us       6917.00 us         932592         WRITE
 30.16       106.90 us       13.00 us       3920189.00 us      1353046        INODELK
 38.55       1794.52 us      14.00 us       16210553.00 us     103039         FINODELK
Duration: 66562 seconds
Data Read: 0 bytes
Data Written: 3264419 bytes
Interval 1 Stats:
Block Size: 1b+
No. of Reads: 0
No. of Writes: 794080
 %-latency   Avg-latency     Min-Latency    Max-Latency        No. of calls   Fop
 ---------   -----------     -----------    -----------        ------------   ---------
 0.00        0.00 us         0.00 us        0.00 us            2093           RELEASE
 0.00        0.00 us         0.00 us        0.00 us            1093           RELEASEDIR
 0.00        70.31 us        26.00 us       125.00 us          16             FLUSH
 0.00        193.10 us       103.00 us      713.00 us          71             REMOVEXATTR
 0.01        227.32 us       133.00 us      513.00 us          71             SETATTR
 0.01        79.32 us        33.00 us       189.00 us          228            LK
 0.01        259.83 us       35.00 us       1138.00 us         89             READDIR
 0.03        318.26 us       145.00 us      2047.00 us         253            XATTROP
 0.04        112.67 us       3.00 us        1258.00 us         1093           OPENDIR
 0.06        167.98 us       23.00 us       1951.00 us         1014           GETXATTR
 0.08        70.97 us        22.00 us       431.00 us          3053           DISCARD
 0.13        183.78 us       66.00 us       2844.00 us         2091           OPEN
 1.01        303.82 us       90.00 us       3085.00 us         9610           FSTAT
 3.27        316.59 us       30.00 us       3716.00 us         29820          LOOKUP
 5.83        265.79 us       59.00 us       53424.00 us        63296          FXATTROP
 7.95        1373.89 us      51.00 us       64822.00 us        16717          FSYNC
 23.17       851.99 us       14.00 us       16210553.00 us     78555          FINODELK
 24.04       87.44 us        27.00 us       6917.00 us         794081         WRITE
 34.36       111.91 us       14.00 us       984871.00 us       886790         INODELK
Duration: 7485 seconds
Data Read: 0 bytes
Data Written: 794080 bytes
-----------------------
Here is the data from the volume that is backed by the SSHDs and has one
failed disk:
[root@ovirt2 yum.repos.d]# gluster volume profile data-hdd info
Brick: 172.172.1.12:/gluster/brick3/data-hdd
--------------------------------------------
Cumulative Stats:
   Block Size:          256b+           512b+          1024b+
 No. of Reads:           1702              86              16
No. of Writes:              0             767              71

   Block Size:         2048b+          4096b+          8192b+
 No. of Reads:             19           51841            2049
No. of Writes:             76           60668           35727

   Block Size:        16384b+         32768b+         65536b+
 No. of Reads:           1744             639            1088
No. of Writes:           8524            2410            1285

   Block Size:       131072b+
 No. of Reads:         771999
No. of Writes:          29584
 %-latency   Avg-latency     Min-Latency    Max-Latency        No. of calls   Fop
 ---------   -----------     -----------    -----------        ------------   ---------
 0.00        0.00 us         0.00 us        0.00 us            2902           RELEASE
 0.00        0.00 us         0.00 us        0.00 us            1517           RELEASEDIR
 0.00        197.00 us       197.00 us      197.00 us          1              FTRUNCATE
 0.00        70.24 us        16.00 us       758.00 us          51             FLUSH
 0.00        143.93 us       82.00 us       305.00 us          57             REMOVEXATTR
 0.00        178.63 us       105.00 us      712.00 us          60             SETATTR
 0.00        67.30 us        19.00 us       572.00 us          555            LK
 0.00        322.80 us       23.00 us       4673.00 us         138            READDIR
 0.00        336.56 us       106.00 us      11994.00 us        237            XATTROP
 0.00        84.70 us        28.00 us       1071.00 us         3469           STATFS
 0.01        387.75 us       2.00 us        146017.00 us       1467           OPENDIR
 0.01        148.59 us       21.00 us       64374.00 us        4454           STAT
 0.02        783.02 us       16.00 us       93502.00 us        1902           GETXATTR
 0.03        1516.10 us      17.00 us       210690.00 us       1364           ENTRYLK
 0.03        2555.47 us      300.00 us      674454.00 us       1064           READDIRP
 0.07        85.74 us        19.00 us       68340.00 us        62849          FSTAT
 0.07        1978.12 us      59.00 us       202596.00 us       2729           OPEN
 0.22        708.57 us       15.00 us       394799.00 us       25447          LOOKUP
 5.94        2331.74 us      15.00 us       1099530.00 us      207534         FINODELK
 7.31        8311.75 us      58.00 us       1800216.00 us      71668          FXATTROP
 12.49       7735.19 us      51.00 us       3595513.00 us      131642         WRITE
 17.70       957.08 us       16.00 us       13700466.00 us     1508160        INODELK
 24.55       2546.43 us      26.00 us       5077347.00 us      786060         READ
 31.56       49699.15 us     47.00 us       3746331.00 us      51777          FSYNC
Duration: 10101 seconds
Data Read: 101562897361 bytes
Data Written: 4834450432 bytes
Interval 0 Stats:
   Block Size:          256b+           512b+          1024b+
 No. of Reads:           1702              86              16
No. of Writes:              0             767              71

   Block Size:         2048b+          4096b+          8192b+
 No. of Reads:             19           51841            2049
No. of Writes:             76           60668           35727

   Block Size:        16384b+         32768b+         65536b+
 No. of Reads:           1744             639            1088
No. of Writes:           8524            2410            1285

   Block Size:       131072b+
 No. of Reads:         771999
No. of Writes:          29584
 %-latency   Avg-latency     Min-Latency    Max-Latency        No. of calls   Fop
 ---------   -----------     -----------    -----------        ------------   ---------
 0.00        0.00 us         0.00 us        0.00 us            2902           RELEASE
 0.00        0.00 us         0.00 us        0.00 us            1517           RELEASEDIR
 0.00        197.00 us       197.00 us      197.00 us          1              FTRUNCATE
 0.00        70.24 us        16.00 us       758.00 us          51             FLUSH
 0.00        143.93 us       82.00 us       305.00 us          57             REMOVEXATTR
 0.00        178.63 us       105.00 us      712.00 us          60             SETATTR
 0.00        67.30 us        19.00 us       572.00 us          555            LK
 0.00        322.80 us       23.00 us       4673.00 us         138            READDIR
 0.00        336.56 us       106.00 us      11994.00 us        237            XATTROP
 0.00        84.70 us        28.00 us       1071.00 us         3469           STATFS
 0.01        387.75 us       2.00 us        146017.00 us       1467           OPENDIR
 0.01        148.59 us       21.00 us       64374.00 us        4454           STAT
 0.02        783.02 us       16.00 us       93502.00 us        1902           GETXATTR
 0.03        1516.10 us      17.00 us       210690.00 us       1364           ENTRYLK
 0.03        2555.47 us      300.00 us      674454.00 us       1064           READDIRP
 0.07        85.73 us        19.00 us       68340.00 us        62849          FSTAT
 0.07        1978.12 us      59.00 us       202596.00 us       2729           OPEN
 0.22        708.57 us       15.00 us       394799.00 us       25447          LOOKUP
 5.94        2334.57 us      15.00 us       1099530.00 us      207534         FINODELK
 7.31        8311.49 us      58.00 us       1800216.00 us      71668          FXATTROP
 12.49       7735.32 us      51.00 us       3595513.00 us      131642         WRITE
 17.71       957.08 us       16.00 us       13700466.00 us     1508160        INODELK
 24.56       2546.42 us      26.00 us       5077347.00 us      786060         READ
 31.54       49651.63 us     47.00 us       3746331.00 us      51777          FSYNC
Duration: 10101 seconds
Data Read: 101562897361 bytes
Data Written: 4834450432 bytes
On Tue, May 29, 2018 at 2:55 PM, Jim Kusznir <jim(a)palousetech.com> wrote:
Thank you for your response.
I have 4 gluster volumes. 3 are replica 2 + arbiter; the replica bricks
are on ovirt1 and ovirt2, with the arbiter brick on ovirt3. The 4th volume is
replica 3, with a brick on all three ovirt machines.
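(For reference, a replica 2 + arbiter volume like these would have been created
with roughly the following; this is only a sketch pieced together from the brick
paths in the profile output above, and the ovirt1 brick path is my assumption,
not my exact command history:)

# replica 3 with 1 arbiter: two full data bricks plus a metadata-only arbiter brick
gluster volume create data replica 3 arbiter 1 \
    ovirt1.nwfiber.com:/gluster/brick2/data \
    ovirt2.nwfiber.com:/gluster/brick2/data \
    ovirt3.nwfiber.com:/gluster/brick2/data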
The first 3 volumes are on an SSD; the 4th is on a Seagate SSHD (the same
in all three machines). On ovirt3, the SSHD has reported hard IO failures,
and that brick is offline. However, the other two replicas are fully
operational (although they still show entries in the heal info output that
won't go away, but that may be expected until I replace the failed disk).
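(For anyone following along, the per-brick state can be confirmed with something
like the following; a sketch, with data-hdd being the SSHD-backed volume:)

# per-brick status for the volume; the Online column should read Y for healthy bricks
gluster volume status data-hdd
# the same, plus free space and inode counts per brick
gluster volume status data-hdd detail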
What is bothering me is that ALL 4 gluster volumes are showing horrible
performance issues. At this point, as the bad disk has been completely
offlined, I would expect gluster to perform at normal speed, but that is
definitely not the case.
I've also noticed that the performance hits seem to come in waves: things
work acceptably (but slowly) for a while, then suddenly it's as if all disk
IO on all volumes (including the hosts' non-gluster local OS disk volumes)
pauses for about 30 seconds, and then IO resumes again. During those times,
I start getting "VM not responding" and "host not responding" notices, and
the applications have major issues.
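(If it helps with diagnosis, a simple way to catch these stalls in the act might
be something like the following on each host; this assumes the sysstat package
is installed:)

# extended per-device stats every second; watch for await/%util spiking during a stall
iostat -x 1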
I've shut down most of my VMs and am down to just my essential core VMs
(I shed about 75% of them). I'm still experiencing the same issues.
Am I correct in believing that once the failed disk was brought offline,
performance should return to normal?
On Tue, May 29, 2018 at 1:27 PM, Alex K <rightkicktech(a)gmail.com> wrote:
> I would check the disks' status and the accessibility of the mount points
> where your gluster volumes reside.
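> A minimal sketch of what I mean (the device name is just a placeholder,
> adjust for your hosts):
>
> # SMART health summary and error counters for the suspect disk
> smartctl -H -A /dev/sdb
> # confirm the brick filesystems are still mounted and not read-only
> df -h /gluster/brick2 /gluster/brick3
> mount | grep gluster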
>
> On Tue, May 29, 2018, 22:28 Jim Kusznir <jim(a)palousetech.com> wrote:
>
>> On one ovirt server, I'm now seeing these messages:
>> [56474.239725] blk_update_request: 63 callbacks suppressed
>> [56474.239732] blk_update_request: I/O error, dev dm-2, sector 0
>> [56474.240602] blk_update_request: I/O error, dev dm-2, sector 3905945472
>> [56474.241346] blk_update_request: I/O error, dev dm-2, sector 3905945584
>> [56474.242236] blk_update_request: I/O error, dev dm-2, sector 2048
>> [56474.243072] blk_update_request: I/O error, dev dm-2, sector 3905943424
>> [56474.243997] blk_update_request: I/O error, dev dm-2, sector 3905943536
>> [56474.247347] blk_update_request: I/O error, dev dm-2, sector 0
>> [56474.248315] blk_update_request: I/O error, dev dm-2, sector 3905945472
>> [56474.249231] blk_update_request: I/O error, dev dm-2, sector 3905945584
>> [56474.250221] blk_update_request: I/O error, dev dm-2, sector 2048
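>> (To map dm-2 back to a physical device, something like this should work;
>> just a sketch:)
>>
>> # list device-mapper targets with their (major, minor) numbers; find minor 2
>> dmsetup ls
>> # or follow the /dev/mapper symlinks and the block-device tree back to the disk
>> ls -l /dev/mapper/
>> lsblk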
>>
>>
>>
>>
>> On Tue, May 29, 2018 at 11:59 AM, Jim Kusznir <jim(a)palousetech.com> wrote:
>>
>>> I see this in /var/log/messages on ovirt3 (my 3rd machine, the one upgraded to 4.2):
>>>
>>> May 29 11:54:41 ovirt3 ovs-vsctl: ovs|00001|db_ctl_base|ERR|unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)
>>> May 29 11:54:51 ovirt3 ovs-vsctl: ovs|00001|db_ctl_base|ERR|unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)
>>> May 29 11:55:01 ovirt3 ovs-vsctl: ovs|00001|db_ctl_base|ERR|unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)
>>> (appears a lot).
>>>
>>> I also found, in the ssh session on that machine, some sysv warnings about
>>> the backing disk for one of the gluster volumes (the straight replica 3
>>> one). The glusterfs brick process for that disk on that machine went
>>> offline. It's my understanding that the volume should continue to work with
>>> the other two machines while I attempt to replace that disk, right?
>>> Attempted writes (touching an empty file) can take 15 seconds, though
>>> repeating the same write later is much faster.
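>>> (The brick log on that machine is probably where the reason for the brick
>>> going offline will show up; a sketch, assuming gluster's usual log naming,
>>> which is derived from the brick path:)
>>>
>>> # brick logs live under /var/log/glusterfs/bricks/, named after the brick path
>>> less /var/log/glusterfs/bricks/gluster-brick3-data-hdd.log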
>>>
>>> Gluster generates a bunch of different log files; I don't know which ones
>>> you want, or from which machine(s).
>>>
>>> How do I do "volume profiling"?
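>>> Is it something along these lines? (Just guessing from the CLI help, so
>>> treat this as a sketch:)
>>>
>>> # start collecting per-brick FOP latency/throughput counters on a volume
>>> gluster volume profile data-hdd start
>>> # dump the cumulative and latest-interval counters
>>> gluster volume profile data-hdd info
>>> # stop collecting when done
>>> gluster volume profile data-hdd stop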
>>>
>>> Thanks!
>>>
>>> On Tue, May 29, 2018 at 11:53 AM, Sahina Bose <sabose(a)redhat.com> wrote:
>>>
>>>> Do you see errors reported in the mount logs for the volume? If so,
>>>> could you attach the logs?
>>>> Any issues with your underlying disks? Can you also attach the output of
>>>> volume profiling?
>>>>
>>>> On Wed, May 30, 2018 at 12:13 AM, Jim Kusznir <jim(a)palousetech.com> wrote:
>>>>
>>>>> Ok, things have gotten MUCH worse this morning. I'm getting random
>>>>> errors from VMs; right now, about a third of my VMs have been paused due
>>>>> to storage issues, and most of the remaining VMs are not performing well.
>>>>>
>>>>> At this point, I am in full EMERGENCY mode, as my production services
>>>>> are now impacted, and I'm getting calls coming in about problems...
>>>>>
>>>>> I'd greatly appreciate help... VMs are running VERY slowly (when they
>>>>> run), and they are steadily getting worse. I don't know why. I was seeing
>>>>> CPU peaks (to 100%) on several VMs, in perfect sync, for a few minutes at
>>>>> a time (while the VM became unresponsive, and any Linux VMs I was logged
>>>>> into were giving me the CPU-stuck messages from my original post). Is
>>>>> all this storage related?
>>>>>
>>>>> I also have two different gluster volumes for VM storage, and only one
>>>>> had the issues, but now VMs on both are being affected at the same time
>>>>> and in the same way.
>>>>>
>>>>> --Jim
>>>>>
>>>>> On Mon, May 28, 2018 at 10:50 PM, Sahina Bose <sabose(a)redhat.com> wrote:
>>>>>
>>>>>> [Adding gluster-users to look at the heal issue]
>>>>>>
>>>>>> On Tue, May 29, 2018 at 9:17 AM, Jim Kusznir <jim(a)palousetech.com> wrote:
>>>>>>
>>>>>>> Hello:
>>>>>>>
>>>>>>> I've been having some cluster and gluster performance issues lately. I
>>>>>>> also found that my cluster was out of date and was trying to apply
>>>>>>> updates (hoping to fix some of these), then discovered the oVirt 4.1
>>>>>>> repos had been taken completely offline. So I was forced to begin an
>>>>>>> upgrade to 4.2. According to the docs I found/read, I needed only to add
>>>>>>> the new repo, do a yum update, and reboot to be good on my hosts (I did
>>>>>>> the yum update on the hosts and ran engine-setup on my hosted engine).
>>>>>>> Things seemed to work relatively well, except for a gluster sync issue
>>>>>>> that showed up.
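>>>>>>> (Roughly what I ran, going from the upgrade docs; the release-rpm URL
>>>>>>> is from memory, so double-check it:)
>>>>>>>
>>>>>>> # on each host: add the oVirt 4.2 repo, then update packages
>>>>>>> yum install https://resources.ovirt.org/pub/yum-repo/ovirt-release42.rpm
>>>>>>> yum update
>>>>>>> # on the hosted engine only, after updating its packages
>>>>>>> engine-setup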
>>>>>>>
>>>>>>> My cluster is a 3-node hyperconverged cluster. I upgraded the hosted
>>>>>>> engine first, then node 3 (ovirt3). When node 3 came back up, for some
>>>>>>> reason one of my gluster volumes would not sync. Here's sample output:
>>>>>>>
>>>>>>> [root@ovirt3 ~]# gluster volume heal data-hdd info
>>>>>>> Brick 172.172.1.11:/gluster/brick3/data-hdd
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/48d7ecb8-7ac5-4725-bca5-b3519681cf2f/0d6080b0-7018-4fa3-bb82-1dd9ef07d9b9
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/647be733-f153-4cdc-85bd-ba72544c2631/b453a300-0602-4be1-8310-8bd5abe00971
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/6da854d1-b6be-446b-9bf0-90a0dbbea830/3c93bd1f-b7fa-4aa2-b445-6904e31839ba
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/7f647567-d18c-44f1-a58e-9b8865833acb/f9364470-9770-4bb1-a6b9-a54861849625
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/f3c8e7aa-6ef2-42a7-93d4-e0a4df6dd2fa/2eb0b1ad-2606-44ef-9cd3-ae59610a504b
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/b1ea3f62-0f05-4ded-8c82-9c91c90e0b61/d5d6bf5a-499f-431d-9013-5453db93ed32
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/8c8b5147-e9d6-4810-b45b-185e3ed65727/16f08231-93b0-489d-a2fd-687b6bf88eaa
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/12924435-b9c2-4aab-ba19-1c1bc31310ef/07b3db69-440e-491e-854c-bbfa18a7cff2
>>>>>>> Status: Connected
>>>>>>> Number of entries: 8
>>>>>>>
>>>>>>> Brick 172.172.1.12:/gluster/brick3/data-hdd
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/48d7ecb8-7ac5-4725-bca5-b3519681cf2f/0d6080b0-7018-4fa3-bb82-1dd9ef07d9b9
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/647be733-f153-4cdc-85bd-ba72544c2631/b453a300-0602-4be1-8310-8bd5abe00971
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/b1ea3f62-0f05-4ded-8c82-9c91c90e0b61/d5d6bf5a-499f-431d-9013-5453db93ed32
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/6da854d1-b6be-446b-9bf0-90a0dbbea830/3c93bd1f-b7fa-4aa2-b445-6904e31839ba
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/7f647567-d18c-44f1-a58e-9b8865833acb/f9364470-9770-4bb1-a6b9-a54861849625
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/8c8b5147-e9d6-4810-b45b-185e3ed65727/16f08231-93b0-489d-a2fd-687b6bf88eaa
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/12924435-b9c2-4aab-ba19-1c1bc31310ef/07b3db69-440e-491e-854c-bbfa18a7cff2
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/f3c8e7aa-6ef2-42a7-93d4-e0a4df6dd2fa/2eb0b1ad-2606-44ef-9cd3-ae59610a504b
>>>>>>> Status: Connected
>>>>>>> Number of entries: 8
>>>>>>>
>>>>>>> Brick 172.172.1.13:/gluster/brick3/data-hdd
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/b1ea3f62-0f05-4ded-8c82-9c91c90e0b61/d5d6bf5a-499f-431d-9013-5453db93ed32
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/8c8b5147-e9d6-4810-b45b-185e3ed65727/16f08231-93b0-489d-a2fd-687b6bf88eaa
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/12924435-b9c2-4aab-ba19-1c1bc31310ef/07b3db69-440e-491e-854c-bbfa18a7cff2
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/f3c8e7aa-6ef2-42a7-93d4-e0a4df6dd2fa/2eb0b1ad-2606-44ef-9cd3-ae59610a504b
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/647be733-f153-4cdc-85bd-ba72544c2631/b453a300-0602-4be1-8310-8bd5abe00971
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/48d7ecb8-7ac5-4725-bca5-b3519681cf2f/0d6080b0-7018-4fa3-bb82-1dd9ef07d9b9
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/6da854d1-b6be-446b-9bf0-90a0dbbea830/3c93bd1f-b7fa-4aa2-b445-6904e31839ba
>>>>>>> /cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/7f647567-d18c-44f1-a58e-9b8865833acb/f9364470-9770-4bb1-a6b9-a54861849625
>>>>>>> Status: Connected
>>>>>>> Number of entries: 8
>>>>>>>
>>>>>>> ---------
>>>>>>> It's been in this state for a couple of days now, and bandwidth
>>>>>>> monitoring shows no appreciable data moving. I've repeatedly tried
>>>>>>> commanding a full heal from all three nodes in the cluster. It's always
>>>>>>> the same files that need healing.
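>>>>>>> (For the record, the heal commands I've been running are roughly these:)
>>>>>>>
>>>>>>> # queue a full self-heal crawl of the volume
>>>>>>> gluster volume heal data-hdd full
>>>>>>> # then check what is still pending on each brick
>>>>>>> gluster volume heal data-hdd info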
>>>>>>>
>>>>>>> When running gluster volume heal data-hdd statistics, I sometimes see
>>>>>>> different information, but always some number of "heal failed" entries.
>>>>>>> It shows 0 for split brain.
>>>>>>>
>>>>>>> I'm not quite sure what to do. I suspect it may be due to nodes 1 and 2
>>>>>>> still being on the older ovirt/gluster release, but I'm afraid to
>>>>>>> upgrade and reboot them until I have a good gluster sync (I don't want
>>>>>>> to create a split-brain issue). How do I proceed with this?
>>>>>>>
>>>>>>> Second issue: I've been experiencing VERY POOR performance on most of
>>>>>>> my VMs, to the point that logging into a Windows 10 VM via remote
>>>>>>> desktop can take 5 minutes, and launching QuickBooks inside said VM can
>>>>>>> easily take 10 minutes. On some Linux VMs, I get random messages like
>>>>>>> this:
>>>>>>> Message from syslogd@unifi at May 28 20:39:23 ...
>>>>>>> kernel:[6171996.308904] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [mongod:14766]
>>>>>>>
>>>>>>> (the process and PID are often different)
>>>>>>>
>>>>>>> I'm not quite sure what to do about this either. My initial thought was
>>>>>>> to upgrade everything to current and see if it's still there, but I
>>>>>>> cannot move forward with that until my gluster is healed...
>>>>>>>
>>>>>>> Thanks!
>>>>>>> --Jim
>>>>>>>