On Thu, Apr 9, 2020 at 1:11 PM Gianluca Cecchi <gianluca.cecchi(a)gmail.com>
wrote:
> This ^^, right here is the reason the VM paused. Are you using a plain
> distribute volume here?
> Can you share some of the log messages that occur right above these
> errors?
> Also, can you check if the file
> $VMSTORE_BRICKPATH/.glusterfs/d2/25/d22530cf-2e50-4059-8924-0aafe38497b1
> exists on the brick?
>
> -Krutika
>
>
Thanks for answering, Krutika.
To verify that sharding was in some way "involved" in the problem, I
re-deployed the 9 OpenShift OCP servers on a volume without sharding, and
indeed did not receive any error.
With sharding enabled, I received at least 3-4 errors on every deployment
run.
In particular, I deleted the disks of the previous VMs in order to put them
on a volume without sharding.
As a result, the directory is now empty:
[root@ovirt ~]# ll -a /gluster_bricks/vmstore/vmstore/.glusterfs/d2/25/
total 8
drwx------. 2 root root 6 Apr 8 16:59 .
drwx------. 105 root root 8192 Apr 9 00:50 ..
[root@ovirt ~]#
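
For reference, the check requested above can be sketched in a few lines. This is only a minimal sketch, assuming the usual brick layout in which a GFID is stored under .glusterfs/<first two hex characters>/<next two>/<gfid>, which matches the path quoted above. The helper name gfid_backend_path is hypothetical; the brick path and GFID are the ones from this thread.

```python
import os
from pathlib import Path


def gfid_backend_path(brick_root: str, gfid: str) -> Path:
    """Build the expected .glusterfs path of a GFID on a brick.

    Assumed layout: <brick>/.glusterfs/<gfid[0:2]>/<gfid[2:4]>/<gfid>
    """
    gfid = gfid.strip().lower()
    return Path(brick_root) / ".glusterfs" / gfid[0:2] / gfid[2:4] / gfid


# Values taken from this thread.
brick = "/gluster_bricks/vmstore/vmstore"
gfid = "d22530cf-2e50-4059-8924-0aafe38497b1"

path = gfid_backend_path(brick, gfid)
# lexists() also reports dangling symlinks, which a plain exists() would hide.
print(path, "exists" if os.path.lexists(path) else "missing")
```

On a brick where the entry is gone, as in the listing above, this should report the path as missing.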
Here you can find the entire log of the vmstore volume (in gzip format),
from [2020-04-05 01:20:02.978429] to [2020-04-09 10:45:36.734079]:
https://drive.google.com/file/d/1Dqr7KJMqKdMFg-jvhsDAzvr1xgWtvtnQ/view?us...
You will find the same error at least at the timestamps below, which
correspond to the engine webadmin events "unknown storage error". Note that
the times inside the log file are in UTC, so you have to subtract 2 hours
(03:27:28 PM in the engine webadmin event corresponds to 13:27:28 in the
log file):
Apr 7, 2020, 3:27:28 PM
Apr 7, 2020, 4:38:55 PM
Apr 7, 2020, 5:31:02 PM
Apr 8, 2020, 8:52:49 AM
Apr 8, 2020, 12:05:17 PM
Apr 8, 2020, 3:11:10 PM
Apr 8, 2020, 3:20:30 PM
Apr 8, 2020, 3:26:54 PM
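
To make the lookup in the log easier, the conversion described above can be sketched as follows. This is a minimal sketch, assuming a fixed 2-hour offset between the engine webadmin time and the UTC time in the gluster log, and an English locale for month and AM/PM parsing. The names ENGINE_UTC_OFFSET and engine_to_log_time are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption from this thread: engine webadmin time is UTC + 2 hours.
ENGINE_UTC_OFFSET = timedelta(hours=2)

# Event times exactly as listed above.
ENGINE_EVENTS = [
    "Apr 7, 2020, 3:27:28 PM",
    "Apr 7, 2020, 4:38:55 PM",
    "Apr 7, 2020, 5:31:02 PM",
    "Apr 8, 2020, 8:52:49 AM",
    "Apr 8, 2020, 12:05:17 PM",
    "Apr 8, 2020, 3:11:10 PM",
    "Apr 8, 2020, 3:20:30 PM",
    "Apr 8, 2020, 3:26:54 PM",
]


def engine_to_log_time(event: str) -> str:
    """Convert an engine event time to the UTC prefix used in the log."""
    local = datetime.strptime(event, "%b %d, %Y, %I:%M:%S %p")
    return (local - ENGINE_UTC_OFFSET).strftime("%Y-%m-%d %H:%M:%S")


for event in ENGINE_EVENTS:
    print(f"{event} -> [{engine_to_log_time(event)}")
```

Each printed value is the start of the bracketed timestamp to search for in the log file.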
Thanks again. I'm available to retry on a sharding-enabled volume after
changing any settings, if needed.
Gianluca
Hi Krutika,
did you have the opportunity to check the log content I sent, and to better
understand the reason for the sharding errors and a possible solution?
Thanks,
Gianluca