VMs sometimes paused due to unknown storage error on Gluster

Hello, having deployed oVirt 4.3.9 single host HCI with Gluster, I sometimes see a VM going into paused state with the error above, needing to be manually resumed (sometimes the resume operation fails). So far it has only happened with an empty thin provisioned disk and sudden high I/O during the initial phase of the OS install; it didn't happen during normal operation (even with 600 MB/s of throughput). I suspect something related to metadata extension not being able to keep pace with the growth of the physical disk, similar to what happens for block based storage domains, where the LVM layer has to extend the logical volume representing the virtual disk.

My real-world reproduction of the error is during the install of an OCP 4.3.8 master node, when Red Hat CoreOS boots from the network, wipes the disk and then, I think, transfers an image, so doing high immediate I/O. The VM used as the master node was created with a 120 GB thin provisioned disk (virtio-scsi type) and starts with the disk just initialized and empty, going through a PXE install. I get this line inside the events for the VM:

Mar 27, 2020, 12:35:23 AM VM master01 has been paused due to unknown storage error.

Here are logs around the time frame above:
- engine.log
https://drive.google.com/file/d/1zpNo5IgFVTAlKXHiAMTL-uvaoXSNMVRO/view?usp=s...
- vdsm.log
https://drive.google.com/file/d/1v8kR0N6PdHBJ5hYzEYKl4-m7v1Lb_cYX/view?usp=s...

Any suggestions?

The disk of the VM is on the vmstore storage domain and its gluster volume settings are:

[root@ovirt tmp]# gluster volume info vmstore

Volume Name: vmstore
Type: Distribute
Volume ID: a6203d77-3b9d-49f9-94c5-9e30562959c4
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: ovirtst.mydomain.storage:/gluster_bricks/vmstore/vmstore
Options Reconfigured:
performance.low-prio-threads: 32
storage.owner-gid: 36
performance.read-ahead: off
user.cifs: off
storage.owner-uid: 36
performance.io-cache: off
performance.quick-read: off
network.ping-timeout: 30
features.shard: on
network.remote-dio: off
cluster.eager-lock: enable
performance.strict-o-direct: on
transport.address-family: inet
nfs.disable: on
[root@ovirt tmp]#

What about the config above: are there optimizations worth making, given the single-host setup? And comparing with the virt group of options:

[root@ovirt tmp]# cat /var/lib/glusterd/groups/virt
performance.quick-read=off
performance.read-ahead=off
performance.io-cache=off
performance.low-prio-threads=32
network.remote-dio=enable
cluster.eager-lock=enable
cluster.quorum-type=auto
cluster.server-quorum-type=server
cluster.data-self-heal-algorithm=full
cluster.locking-scheme=granular
cluster.shd-max-threads=8
cluster.shd-wait-qlength=10000
features.shard=on
user.cifs=off
cluster.choose-local=off
client.event-threads=4
server.event-threads=4
performance.client-io-threads=on
[root@ovirt tmp]#
?

Thanks,
Gianluca
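(Editor's note, not part of the original message: gluster ships those virt settings as an option group, so the volume can be aligned with them in one step if that is the goal. A hedged sketch, assuming the volume name shown above:

# apply the whole "virt" option group (the file under /var/lib/glusterd/groups/virt)
# to the volume, then re-check which options changed
gluster volume set vmstore group virt
gluster volume info vmstore

Whether the single-host case should deviate from that profile is a separate question for the thread below.)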

On Sat, Mar 28, 2020 at 2:21 AM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
Hello, having deployed oVirt 4.3.9 single host HCI with Gluster, I sometimes see a VM going into paused state with the error above, needing to be manually resumed (sometimes the resume operation fails). So far it has only happened with an empty thin provisioned disk and sudden high I/O during the initial phase of the OS install; it didn't happen during normal operation (even with 600 MB/s of throughput). I suspect something related to metadata extension not being able to keep pace with the growth of the physical disk, similar to what happens for block based storage domains, where the LVM layer has to extend the logical volume representing the virtual disk.
My real-world reproduction of the error is during the install of an OCP 4.3.8 master node, when Red Hat CoreOS boots from the network, wipes the disk and then, I think, transfers an image, so doing high immediate I/O. The VM used as the master node was created with a 120 GB thin provisioned disk (virtio-scsi type) and starts with the disk just initialized and empty, going through a PXE install. I get this line inside the events for the VM:
Mar 27, 2020, 12:35:23 AM VM master01 has been paused due to unknown storage error.
Here logs around the time frame above:
- engine.log
https://drive.google.com/file/d/1zpNo5IgFVTAlKXHiAMTL-uvaoXSNMVRO/view?usp=s...
- vdsm.log
https://drive.google.com/file/d/1v8kR0N6PdHBJ5hYzEYKl4-m7v1Lb_cYX/view?usp=s...
Any suggestions?
The disk of the VM is on vmstore storage domain and its gluster volume settings are:
[root@ovirt tmp]# gluster volume info vmstore
Volume Name: vmstore
Type: Distribute
Volume ID: a6203d77-3b9d-49f9-94c5-9e30562959c4
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: ovirtst.mydomain.storage:/gluster_bricks/vmstore/vmstore
Options Reconfigured:
performance.low-prio-threads: 32
storage.owner-gid: 36
performance.read-ahead: off
user.cifs: off
storage.owner-uid: 36
performance.io-cache: off
performance.quick-read: off
network.ping-timeout: 30
features.shard: on
network.remote-dio: off
cluster.eager-lock: enable
performance.strict-o-direct: on
transport.address-family: inet
nfs.disable: on
[root@ovirt tmp]#
What about the config above: are there optimizations worth making, given the single-host setup? And comparing with the virt group of options:
[root@ovirt tmp]# cat /var/lib/glusterd/groups/virt
performance.quick-read=off
performance.read-ahead=off
performance.io-cache=off
performance.low-prio-threads=32
network.remote-dio=enable
cluster.eager-lock=enable
cluster.quorum-type=auto
cluster.server-quorum-type=server
cluster.data-self-heal-algorithm=full
cluster.locking-scheme=granular
cluster.shd-max-threads=8
cluster.shd-wait-qlength=10000
features.shard=on
user.cifs=off
cluster.choose-local=off
client.event-threads=4
server.event-threads=4
performance.client-io-threads=on
[root@ovirt tmp]#
?
Thanks Gianluca
Further information. This is what I see around the time frame in the gluster brick log file gluster_bricks-vmstore-vmstore.log (the timestamp is 1 hour behind in the log file):

[2020-03-27 23:30:38.575808] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-vmstore-server: Shutting down connection CTX_ID:6e8f70b8-1946-4505-860f-be90e5807cb3-GRAPH_ID:0-PID:223418-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0
[2020-03-27 23:35:15.281449] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]
[2020-03-27 23:35:15.281545] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-vmstore-server: 34139378: WRITEV 10 (00d9fe81-8a31-498e-8401-7b9d1477378e), client: CTX_ID:d04437ba-ef98-43df-864f-5e9d3738620a-GRAPH_ID:0-PID:27687-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument]
[2020-03-27 23:40:15.415794] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]

My gluster components' versions:

gluster-ansible-cluster-1.0.0-1.el7.noarch
gluster-ansible-features-1.0.5-3.el7.noarch
gluster-ansible-infra-1.0.4-3.el7.noarch
gluster-ansible-maintenance-1.0.1-1.el7.noarch
gluster-ansible-repositories-1.0.1-1.el7.noarch
gluster-ansible-roles-1.0.5-7.el7.noarch
glusterfs-6.8-1.el7.x86_64
glusterfs-api-6.8-1.el7.x86_64
glusterfs-cli-6.8-1.el7.x86_64
glusterfs-client-xlators-6.8-1.el7.x86_64
glusterfs-events-6.8-1.el7.x86_64
glusterfs-fuse-6.8-1.el7.x86_64
glusterfs-geo-replication-6.8-1.el7.x86_64
glusterfs-libs-6.8-1.el7.x86_64
glusterfs-rdma-6.8-1.el7.x86_64
glusterfs-server-6.8-1.el7.x86_64
libvirt-daemon-driver-storage-gluster-4.5.0-23.el7_7.6.x86_64
python2-gluster-6.8-1.el7.x86_64
vdsm-gluster-4.30.43-1.el7.x86_64

And for completeness, the whole set of parameters for the volume, to ask for possible further performance-related suggestions considering the single-node environment:

[root@ovirt tmp]# gluster volume get vmstore all
Option Value ------ ----- cluster.lookup-unhashed on cluster.lookup-optimize on cluster.min-free-disk 10% cluster.min-free-inodes 5% cluster.rebalance-stats off cluster.subvols-per-directory (null) cluster.readdir-optimize off cluster.rsync-hash-regex (null) cluster.extra-hash-regex (null) cluster.dht-xattr-name trusted.glusterfs.dht cluster.randomize-hash-range-by-gfid off cluster.rebal-throttle normal cluster.lock-migration off cluster.force-migration off cluster.local-volume-name (null) cluster.weighted-rebalance on cluster.switch-pattern (null) cluster.entry-change-log on cluster.read-subvolume (null) cluster.read-subvolume-index -1 cluster.read-hash-mode 1 cluster.background-self-heal-count 8 cluster.metadata-self-heal off cluster.data-self-heal off cluster.entry-self-heal off cluster.self-heal-daemon on cluster.heal-timeout 600 cluster.self-heal-window-size 1 cluster.data-change-log on cluster.metadata-change-log on cluster.data-self-heal-algorithm (null) cluster.eager-lock enable disperse.eager-lock on disperse.other-eager-lock on disperse.eager-lock-timeout 1 disperse.other-eager-lock-timeout 1 cluster.quorum-type none cluster.quorum-count (null) cluster.choose-local true cluster.self-heal-readdir-size 1KB cluster.post-op-delay-secs 1 cluster.ensure-durability on cluster.consistent-metadata no cluster.heal-wait-queue-length 128 cluster.favorite-child-policy none cluster.full-lock yes diagnostics.latency-measurement off diagnostics.dump-fd-stats off diagnostics.count-fop-hits off
diagnostics.brick-log-level INFO diagnostics.client-log-level INFO diagnostics.brick-sys-log-level CRITICAL diagnostics.client-sys-log-level CRITICAL diagnostics.brick-logger (null) diagnostics.client-logger (null) diagnostics.brick-log-format (null) diagnostics.client-log-format (null) diagnostics.brick-log-buf-size 5 diagnostics.client-log-buf-size 5 diagnostics.brick-log-flush-timeout 120 diagnostics.client-log-flush-timeout 120 diagnostics.stats-dump-interval 0 diagnostics.fop-sample-interval 0 diagnostics.stats-dump-format json diagnostics.fop-sample-buf-size 65535 diagnostics.stats-dnscache-ttl-sec 86400 performance.cache-max-file-size 0 performance.cache-min-file-size 0 performance.cache-refresh-timeout 1 performance.cache-priority performance.cache-size 32MB performance.io-thread-count 16 performance.high-prio-threads 16 performance.normal-prio-threads 16 performance.low-prio-threads 32 performance.least-prio-threads 1 performance.enable-least-priority on performance.iot-watchdog-secs (null) performance.iot-cleanup-disconnected-reqsoff performance.iot-pass-through false performance.io-cache-pass-through false performance.cache-size 128MB performance.qr-cache-timeout 1 performance.cache-invalidation false performance.ctime-invalidation false performance.flush-behind on performance.nfs.flush-behind on performance.write-behind-window-size 1MB performance.resync-failed-syncs-after-fsyncoff performance.nfs.write-behind-window-size1MB performance.strict-o-direct on performance.nfs.strict-o-direct off performance.strict-write-ordering off performance.nfs.strict-write-ordering off performance.write-behind-trickling-writeson performance.aggregate-size 128KB performance.nfs.write-behind-trickling-writeson performance.lazy-open yes performance.read-after-open yes performance.open-behind-pass-through false performance.read-ahead-page-count 4 performance.read-ahead-pass-through false performance.readdir-ahead-pass-through false performance.md-cache-pass-through false performance.md-cache-timeout 1 performance.cache-swift-metadata true performance.cache-samba-metadata false performance.cache-capability-xattrs true performance.cache-ima-xattrs true performance.md-cache-statfs off performance.xattr-cache-list performance.nl-cache-pass-through false features.encryption off network.frame-timeout 1800 network.ping-timeout 30 network.tcp-window-size (null) client.ssl off network.remote-dio off client.event-threads 2 client.tcp-user-timeout 0 client.keepalive-time 20 client.keepalive-interval 2 client.keepalive-count 9 network.tcp-window-size (null) network.inode-lru-limit 16384 auth.allow * auth.reject (null) transport.keepalive 1 server.allow-insecure on server.root-squash off server.all-squash off server.anonuid 65534 server.anongid 65534 server.statedump-path /var/run/gluster server.outstanding-rpc-limit 64 server.ssl off auth.ssl-allow * server.manage-gids off server.dynamic-auth on client.send-gids on server.gid-timeout 300 server.own-thread (null) server.event-threads 2 server.tcp-user-timeout 42 server.keepalive-time 20 server.keepalive-interval 2 server.keepalive-count 9 transport.listen-backlog 1024 transport.address-family inet performance.write-behind on performance.read-ahead off performance.readdir-ahead on performance.io-cache off performance.open-behind on performance.quick-read off performance.nl-cache off performance.stat-prefetch on performance.client-io-threads on performance.nfs.write-behind on performance.nfs.read-ahead off performance.nfs.io-cache off performance.nfs.quick-read 
off performance.nfs.stat-prefetch off performance.nfs.io-threads off performance.force-readdirp true performance.cache-invalidation false performance.global-cache-invalidation true features.uss off features.snapshot-directory .snaps features.show-snapshot-directory off features.tag-namespaces off network.compression off network.compression.window-size -15 network.compression.mem-level 8 network.compression.min-size 0 network.compression.compression-level -1 network.compression.debug false features.default-soft-limit 80% features.soft-timeout 60 features.hard-timeout 5 features.alert-time 86400 features.quota-deem-statfs off geo-replication.indexing off geo-replication.indexing off geo-replication.ignore-pid-check off geo-replication.ignore-pid-check off features.quota off features.inode-quota off features.bitrot disable debug.trace off debug.log-history no debug.log-file no debug.exclude-ops (null) debug.include-ops (null) debug.error-gen off debug.error-failure (null) debug.error-number (null) debug.random-failure off debug.error-fops (null) nfs.disable on features.read-only off features.worm off features.worm-file-level off features.worm-files-deletable on features.default-retention-period 120 features.retention-mode relax features.auto-commit-period 180 storage.linux-aio off storage.batch-fsync-mode reverse-fsync storage.batch-fsync-delay-usec 0 storage.owner-uid 36 storage.owner-gid 36 storage.node-uuid-pathinfo off storage.health-check-interval 30 storage.build-pgfid off storage.gfid2path on storage.gfid2path-separator : storage.reserve 1 storage.health-check-timeout 10 storage.fips-mode-rchecksum off storage.force-create-mode 0000 storage.force-directory-mode 0000 storage.create-mask 0777 storage.create-directory-mask 0777 storage.max-hardlinks 100 features.ctime on config.gfproxyd off cluster.server-quorum-type off cluster.server-quorum-ratio 0 changelog.changelog off changelog.changelog-dir {{ brick.path }}/.glusterfs/changelogs changelog.encoding ascii changelog.rollover-time 15 changelog.fsync-interval 5 changelog.changelog-barrier-timeout 120 changelog.capture-del-path off features.barrier disable features.barrier-timeout 120 features.trash off features.trash-dir .trashcan features.trash-eliminate-path (null) features.trash-max-filesize 5MB features.trash-internal-op off cluster.enable-shared-storage disable locks.trace off locks.mandatory-locking off cluster.disperse-self-heal-daemon enable cluster.quorum-reads no client.bind-insecure (null) features.shard on features.shard-block-size 64MB features.shard-lru-limit 16384 features.shard-deletion-rate 100 features.scrub-throttle lazy features.scrub-freq biweekly features.scrub false features.expiry-time 120 features.cache-invalidation off features.cache-invalidation-timeout 60 features.leases off features.lease-lock-recall-timeout 60 disperse.background-heals 8 disperse.heal-wait-qlength 128 cluster.heal-timeout 600 dht.force-readdirp on disperse.read-policy gfid-hash cluster.shd-max-threads 1 cluster.shd-wait-qlength 1024 cluster.locking-scheme full cluster.granular-entry-heal no features.locks-revocation-secs 0 features.locks-revocation-clear-all false features.locks-revocation-max-blocked 0 features.locks-monkey-unlocking false features.locks-notify-contention no features.locks-notify-contention-delay 5 disperse.shd-max-threads 1 disperse.shd-wait-qlength 1024 disperse.cpu-extensions auto disperse.self-heal-window-size 1 cluster.use-compound-fops off performance.parallel-readdir off performance.rda-request-size 131072 
performance.rda-low-wmark 4096 performance.rda-high-wmark 128KB performance.rda-cache-limit 10MB performance.nl-cache-positive-entry false performance.nl-cache-limit 10MB performance.nl-cache-timeout 60 cluster.brick-multiplex off cluster.max-bricks-per-process 250 disperse.optimistic-change-log on disperse.stripe-cache 4 cluster.halo-enabled False cluster.halo-shd-max-latency 99999 cluster.halo-nfsd-max-latency 5 cluster.halo-max-latency 5 cluster.halo-max-replicas 99999 cluster.halo-min-replicas 2 features.selinux on cluster.daemon-log-level INFO debug.delay-gen off delay-gen.delay-percentage 10% delay-gen.delay-duration 100000 delay-gen.enable disperse.parallel-writes on features.sdfs off features.cloudsync off features.ctime on ctime.noatime on feature.cloudsync-storetype (null) features.enforce-mandatory-lock off [root@ovirt tmp]# Gianluca

On Sat, Mar 28, 2020 at 5:00 AM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote: ...
Further information. This is what I see around the time frame in the gluster brick log file gluster_bricks-vmstore-vmstore.log (the timestamp is 1 hour behind in the log file):
[2020-03-27 23:30:38.575808] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-vmstore-server: Shutting down connection CTX_ID:6e8f70b8-1946-4505-860f-be90e5807cb3-GRAPH_ID:0-PID:223418-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0
[2020-03-27 23:35:15.281449] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]
[2020-03-27 23:35:15.281545] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-vmstore-server: 34139378: WRITEV 10 (00d9fe81-8a31-498e-8401-7b9d1477378e), client: CTX_ID:d04437ba-ef98-43df-864f-5e9d3738620a-GRAPH_ID:0-PID:27687-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument]
[2020-03-27 23:40:15.415794] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]
Invalid arguments are expected when activating a storage domain, and every 5 minutes when storage domains are refreshed. The writes are performed to a temporary file at /rhev/data-center/mnt/server:_path/.prob-random-uuid

These logs do not show the path, so we don't know if the writes are related to block size probing.

But in vdsm log we see:

2020-03-27 00:40:08,979+0100 INFO (monitor/665ff83) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)

This call happens when vdsm is refreshing the storage domain. Right before this log, vdsm tries to detect the underlying storage block size.

So it looks like the gluster logs are related to block size probing and are not related to the I/O error that caused the VM to pause.

Looking at both "abnormal vm stop" and storage refresh events:

$ egrep 'Removing remnants of deleted images|abnormal vm stop' vdsm.log.18
2020-03-27 00:20:08,555+0100 INFO (monitor/665ff83) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:24:38,254+0100 INFO (monitor/0cb6ade) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:24:47,799+0100 INFO (monitor/81b9724) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:25:08,660+0100 INFO (monitor/665ff83) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:29:38,344+0100 INFO (monitor/0cb6ade) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:29:47,901+0100 INFO (monitor/81b9724) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:30:08,761+0100 INFO (monitor/665ff83) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:34:38,436+0100 INFO (monitor/0cb6ade) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:34:48,004+0100 INFO (monitor/81b9724) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:35:08,877+0100 INFO (monitor/665ff83) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:35:23,817+0100 INFO (libvirt/events) [virt.vm] (vmId='1abeafb6-72b2-4893-9cc5-41846b737670') abnormal vm stop device ua-3b753210-09f5-4c40-90fb-ded93b00d19f error eother (vm:5079)

VM disk is on: /rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/3b753210-09f5-4c40-90fb-ded93b00d19f/5a4aac90-a455-41b4-80dd-cf8c1ed81893

2020-03-27 00:39:38,536+0100 INFO (monitor/0cb6ade) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:39:48,089+0100 INFO (monitor/81b9724) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:40:08,979+0100 INFO (monitor/665ff83) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:44:38,623+0100 INFO (monitor/0cb6ade) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:44:48,166+0100 INFO (monitor/81b9724) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:45:09,107+0100 INFO (monitor/665ff83) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:49:38,719+0100 INFO (monitor/0cb6ade) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:49:48,246+0100 INFO (monitor/81b9724) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:50:09,234+0100 INFO (monitor/665ff83) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:53:07,405+0100 INFO (libvirt/events) [virt.vm] (vmId='6a779081-abd1-4bb1-9e9c-97f334656131') abnormal vm stop device ua-0a91c346-23a5-4432-8af7-ae0a28f9c208 error eother (vm:5079)

VM disk is on: /rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/0a91c346-23a5-4432-8af7-ae0a28f9c208/2741af0b-27fe-4f7b-a8bc-8b34b9e31cb6

2020-03-27 00:54:15,546+0100 INFO (libvirt/events) [virt.vm] (vmId='0df701f4-2ae2-4b39-9445-d011673ddf28') abnormal vm stop device ua-73776048-27ba-412d-a01c-9a46d267401c error eother (vm:5079)

VM disk is on: /rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/73776048-27ba-412d-a01c-9a46d267401c/b3bc914f-8882-43e1-9dd0-dd7e43d60084

2020-03-27 00:54:38,830+0100 INFO (monitor/0cb6ade) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:54:48,344+0100 INFO (monitor/81b9724) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:55:09,361+0100 INFO (monitor/665ff83) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:59:38,933+0100 INFO (monitor/0cb6ade) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 00:59:48,431+0100 INFO (monitor/81b9724) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
2020-03-27 01:00:09,474+0100 INFO (monitor/665ff83) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)

I don't see any relation between the refreshes and the abnormal vm stop events.

I think the key to understanding this is to enable more verbose logs in gluster, to understand what the failure was that caused the VM to stop.

It would also help if we had detailed error logs in the qemu log in /var/log/libvirt/qemu/vm-name.log

Did you enable libvirt logs? We may have more information about the error there.

You can enable logs by modifying these lines in /etc/libvirt/libvirtd.conf:

log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"

And restart libvirt.

Note that the libvirt log may be huge, so you need to watch it and change the log level or filter after you collect what you need.

To log only warnings and errors use:

log_outputs="3:file:/var/log/libvirt/libvirtd.log"

Someone from gluster should help debugging this.

Nir

On Sat, Mar 28, 2020 at 7:34 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Sat, Mar 28, 2020 at 5:00 AM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote: ...
Further information. This is what I see around the time frame in the gluster brick log file gluster_bricks-vmstore-vmstore.log (the timestamp is 1 hour behind in the log file):
[2020-03-27 23:30:38.575808] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-vmstore-server: Shutting down connection CTX_ID:6e8f70b8-1946-4505-860f-be90e5807cb3-GRAPH_ID:0-PID:223418-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0
[2020-03-27 23:35:15.281449] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]
[2020-03-27 23:35:15.281545] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-vmstore-server: 34139378: WRITEV 10 (00d9fe81-8a31-498e-8401-7b9d1477378e), client: CTX_ID:d04437ba-ef98-43df-864f-5e9d3738620a-GRAPH_ID:0-PID:27687-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument]
[2020-03-27 23:40:15.415794] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]
Invalid arguments are expected when activating a storage domain, and every 5 minutes when storage domains are refreshed. The writes are performed to a temporary file at /rhev/data-center/mnt/server:_path/.prob-random-uuid
These logs do not show the path, so we don't know if the writes are related to block size probing.
But in vdsm log we see:
2020-03-27 00:40:08,979+0100 INFO (monitor/665ff83) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
This call happens when vdsm is refreshing the storage domain. Right before this log, vdsm tries to detect the underlying storage block size.
So it looks like the gluster logs are related to block size probing and are not related to the I/O error that caused the VM to pause.
[snip]
Looking at both "abnormal vm stop" and storage refresh events:
[snip]
I don't see any relation between refreshes and the abnormal vm stop events.
I think the key to understanding this is to enable more verbose logs in gluster, to understand what the failure was that caused the VM to stop.
Ah, ok. Thanks. It seems the default gluster log level is INFO, and I could make the logs more verbose for a limited amount of time to see if more information is provided. Can I do that with the VMs running, just doing some sort of reload of the service, or do I have to stop everything first?
It would also help if we had detailed error logs in qemu log in /var/log/libvirt/qemu/vm-name.log
I will find them. The system is not available to check right now
Did you enable libvirt logs? We may have more information about the error there.
You can enable logs by modifying these lines in /etc/libvirt/libvirtd.conf:
log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"
And restart libvirt.
Note that libvirt log may be huge, so you need to watch it and change the log level or filter after you collect what you need.
To log only warning and errors use:
log_outputs="3:file:/var/log/libvirt/libvirtd.log"
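(A hedged sketch of applying this, assuming a stock /etc/libvirt/libvirtd.conf where these keys are still commented out; otherwise edit the existing lines in place instead of appending:

# enable the suggested filters and output, then restart libvirtd to pick them up
echo 'log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util"' >> /etc/libvirt/libvirtd.conf
echo 'log_outputs="1:file:/var/log/libvirt/libvirtd.log"' >> /etc/libvirt/libvirtd.conf
systemctl restart libvirtd
# once the failure has been captured, switch back to warnings/errors only:
# log_outputs="3:file:/var/log/libvirt/libvirtd.log"
)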
Someone from gluster should help debugging this.
Nir
OK, I could also try this approach if enabling more verbose gluster logs is not sufficient. Thanks for all the detailed explanations in the meantime. Gianluca

On Sun, Mar 29, 2020 at 2:42 AM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Sat, Mar 28, 2020 at 7:34 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Sat, Mar 28, 2020 at 5:00 AM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote: ...
Further information. This is what I see around the time frame in the gluster brick log file gluster_bricks-vmstore-vmstore.log (the timestamp is 1 hour behind in the log file):
[2020-03-27 23:30:38.575808] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-vmstore-server: Shutting down connection CTX_ID:6e8f70b8-1946-4505-860f-be90e5807cb3-GRAPH_ID:0-PID:223418-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0
[2020-03-27 23:35:15.281449] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]
[2020-03-27 23:35:15.281545] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-vmstore-server: 34139378: WRITEV 10 (00d9fe81-8a31-498e-8401-7b9d1477378e), client: CTX_ID:d04437ba-ef98-43df-864f-5e9d3738620a-GRAPH_ID:0-PID:27687-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument]
[2020-03-27 23:40:15.415794] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]
Invalid arguments are expected when activating a storage domain, and every 5 minutes when storage domains are refreshed. The writes are performed to a temporary file at /rhev/data-center/mnt/server:_path/.prob-random-uuid
These logs do not show the path, so we don't know if the writes are related to block size probing.
But in vdsm log we see:
2020-03-27 00:40:08,979+0100 INFO (monitor/665ff83) [storage.StorageDomain] Removing remnants of deleted images [] (fileSD:726)
This call happens when vdsm is refreshing the storage domain. Right before this log, vdsm tries to detect the underlying storage block size.
So it looks like the gluster logs are related to block size probing and are not related to the I/O error that caused the VM to pause.
[snip]
Looking at both "abnormal vm stop" and storage refresh events:
[snip]
I don't see any relation between refreshes and the abnormal vm stop events.
I think the key to understanding this is to enable more verbose logs in gluster, to understand what the failure was that caused the VM to stop.
Ah, ok. Thanks. It seems the default gluster log level is INFO, and I could make the logs more verbose for a limited amount of time to see if more information is provided. Can I do that with the VMs running, just doing some sort of reload of the service, or do I have to stop everything first?
I don't know about gluster logs, you obviously cannot stop the server or the mount helper on the client side to change log level, and I don't know if they support reloading configuration while running. Asking on gluster mailing list will help.
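(For what it's worth, the log levels show up as ordinary volume options in the "gluster volume get vmstore all" output earlier in the thread, so a hedged sketch would be to raise them online and revert afterwards; whether already-running clients pick up the change without a remount is exactly the kind of detail the gluster list can confirm:

gluster volume set vmstore diagnostics.brick-log-level DEBUG
gluster volume set vmstore diagnostics.client-log-level DEBUG
# ... reproduce the VM pause and collect the logs, then restore the defaults:
gluster volume set vmstore diagnostics.brick-log-level INFO
gluster volume set vmstore diagnostics.client-log-level INFO
)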
It would also help if we had detailed error logs in qemu log in /var/log/libvirt/qemu/vm-name.log
I will find them. The system is not available to check right now
Did you enable libvirt logs? We may have more information about the error there.
You can enable logs by modifying these lines in /etc/libvirt/libvirtd.conf:
log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"
And restart libvirt.
Note that libvirt log may be huge, so you need to watch it and change the log level or filter after you collect what you need.
To log only warning and errors use:
log_outputs="3:file:/var/log/libvirt/libvirtd.log"
Someone from gluster should help debugging this.
Nir
Ok, I could also try this way if enabling more verbose gluster logs is not sufficient.
Enabling warnings in libvirt logs is probably wanted anyway. The warnings and errors can add more info about this failure. See this for changing libvirt log level without restarting libvirt: https://wiki.libvirt.org/page/DebugLogs#Runtime_setting Nir
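(A hedged sketch of the runtime approach described on that page, assuming virt-admin is available with this libvirt version; no libvirtd restart should be needed:

virt-admin daemon-log-filters "1:qemu 1:libvirt 4:object 4:json 4:event 1:util"
virt-admin daemon-log-outputs "1:file:/var/log/libvirt/libvirtd.log"
# show the currently active filters to confirm the change took effect
virt-admin daemon-log-filters
)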

On March 28, 2020 3:21:45 AM GMT+02:00, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
Hello, having deployed oVirt 4.3.9 single host HCI with Gluster, I see some times VM going into paused state for the error above and needing to manually run it (sometimes this resumal operation fails). Actually it only happened with empty disk (thin provisioned) and sudden high I/O during the initial phase of install of the OS; it didn't happened then during normal operaton (even with 600MB/s of throughput). I suspect something related to metadata extension not able to be in pair with the speed of the physical disk growing.... similar to what happens for block based storage domains where the LVM layer has to extend the logical volume representing the virtual disk
My real world reproduction of the error is during install of OCP 4.3.8 master node, when Red Hat Cores OS boots from network and wipes the disk and I think then transfer an image, so doing high immediate I/O. The VM used as master node has been created with a 120Gb thin provisioned disk (virtio-scsi type) and starts with disk just initialized and empty, going through PXE install. I get this line inside events for the VM
Mar 27, 2020, 12:35:23 AM VM master01 has been paused due to unknown storage error.
Here logs around the time frame above:
- engine.log https://drive.google.com/file/d/1zpNo5IgFVTAlKXHiAMTL-uvaoXSNMVRO/view?usp=s...
- vdsm.log https://drive.google.com/file/d/1v8kR0N6PdHBJ5hYzEYKl4-m7v1Lb_cYX/view?usp=s...
Any suggestions?
The disk of the VM is on vmstore storage domain and its gluster volume settings are:
[root@ovirt tmp]# gluster volume info vmstore
Volume Name: vmstore Type: Distribute Volume ID: a6203d77-3b9d-49f9-94c5-9e30562959c4 Status: Started Snapshot Count: 0 Number of Bricks: 1 Transport-type: tcp Bricks: Brick1: ovirtst.mydomain.storage:/gluster_bricks/vmstore/vmstore Options Reconfigured: performance.low-prio-threads: 32 storage.owner-gid: 36 performance.read-ahead: off user.cifs: off storage.owner-uid: 36 performance.io-cache: off performance.quick-read: off network.ping-timeout: 30 features.shard: on network.remote-dio: off cluster.eager-lock: enable performance.strict-o-direct: on transport.address-family: inet nfs.disable: on [root@ovirt tmp]#
What about config above, related to eventual optimizations to be done based on having single host? And comparing with the virt group of options:
[root@ovirt tmp]# cat /var/lib/glusterd/groups/virt performance.quick-read=off performance.read-ahead=off performance.io-cache=off performance.low-prio-threads=32 network.remote-dio=enable cluster.eager-lock=enable cluster.quorum-type=auto cluster.server-quorum-type=server cluster.data-self-heal-algorithm=full cluster.locking-scheme=granular cluster.shd-max-threads=8 cluster.shd-wait-qlength=10000 features.shard=on user.cifs=off cluster.choose-local=off client.event-threads=4 server.event-threads=4 performance.client-io-threads=on [root@ovirt tmp]#
?
Thanks Gianluca
Hi Gianluca,

Is it happening to machines with preallocated disks or on machines with thin disks?

Best Regards,
Strahil Nikolov

On Sat, Mar 28, 2020 at 8:39 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
On March 28, 2020 3:21:45 AM GMT+02:00, Gianluca Cecchi < gianluca.cecchi@gmail.com> wrote:
[snip]
Actually it only happened with empty disk (thin provisioned) and sudden
high I/O during the initial phase of install of the OS; it didn't happened then during normal operaton (even with 600MB/s of throughput).
[snip]
Hi Gianluca,
Is it happening to machines with preallocated disks or on machines with thin disks ?
Best Regards, Strahil Nikolov
Thin provisioned. But as I have to create many VMs with 120 GB disks, of which probably only a part will be allocated over time, it would be unfeasible to make them all preallocated. I learned that thin is not good for block based storage domains and heavy I/O, but I would hope that the same does not apply to file based storage domains...

Thanks,
Gianluca

On March 28, 2020 11:03:54 AM GMT+02:00, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Sat, Mar 28, 2020 at 8:39 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
On March 28, 2020 3:21:45 AM GMT+02:00, Gianluca Cecchi < gianluca.cecchi@gmail.com> wrote:
[snip]
Actually it only happened with empty disk (thin provisioned) and sudden
high I/O during the initial phase of install of the OS; it didn't happened then during normal operaton (even with 600MB/s of throughput).
[snip]
Hi Gianluca,
Is it happening to machines with preallocated disks or on machines with thin disks ?
Best Regards, Strahil Nikolov
thin provisioned. But as I have tro create many VMs with 120Gb of disk size of which probably only a part during time will be allocated, it would be unfeasible to make them all preallocated. I learned that thin is not good for block based storage domains and heavy I/O, but I would hope that it is not the same with file based storage domains... Thanks, Gianluca
This is normal - gluster cannot allocate the needed shards fast enough (due to high I/O), so qemu pauses the VM until storage is available again.

You can think about VDO (with deduplication) as a PV for the thin LVM, and this way you can preallocate your VMs while saving space (deduplication, zero-block elimination and even compression). Of course, VDO will reduce performance (unless you have a battery-backed write cache and compression is disabled), but the benefits will be a lot more.

Another approach is to increase the shard size - gluster will create fewer shards, but allocation on disk will be higher.

Best Regards,
Strahil Nikolov
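(A hedged sketch of the shard-size approach mentioned above, assuming the vmstore volume and the 64MB default shown earlier in the volume dump; as far as I know the new size only applies to files created after the change, existing images keep their original shard size:

# raise the shard size so far fewer shards need to be created while an image fills up
gluster volume set vmstore features.shard-block-size 512MB
gluster volume get vmstore features.shard-block-size
)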

On Sat, Mar 28, 2020 at 1:59 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
On March 28, 2020 11:03:54 AM GMT+02:00, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Sat, Mar 28, 2020 at 8:39 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
On March 28, 2020 3:21:45 AM GMT+02:00, Gianluca Cecchi < gianluca.cecchi@gmail.com> wrote:
[snip]
Actually it only happened with empty disk (thin provisioned) and sudden
high I/O during the initial phase of install of the OS; it didn't happened then during normal operaton (even with 600MB/s of throughput).
[snip]
Hi Gianluca,
Is it happening to machines with preallocated disks or on machines with thin disks ?
Best Regards, Strahil Nikolov
thin provisioned. But as I have tro create many VMs with 120Gb of disk size of which probably only a part during time will be allocated, it would be unfeasible to make them all preallocated. I learned that thin is not good for block based storage domains and heavy I/O, but I would hope that it is not the same with file based storage domains... Thanks, Gianluca
This is normal - gluster cannot allocate fast enough the needed shards (due to high IO), so the qemu pauses the VM until storage is available again .
I don't know glusterfs internals, but I think this is very unlikely.

For block storage thin provisioning in vdsm, vdsm is responsible for allocating more space, but vdsm is not in the datapath; it monitors the allocation and allocates more space when free space reaches a limit. It has no way to block I/O before more space is available. Gluster is in the datapath and can block I/O until it can process it.

Can you explain what the source for this theory is?
You can think about VDO (with deduplication ) as a PV for the Thin LVM and this way you can preallocate your VMs , while saving space (deduplication, zero-block elimination and even compression). Of course, VDO will reduce performance (unless you have battery-backed write cache and compression is disabled), but tbe benefits will be alot more.
Another approach is to increase the shard size - so gluster will create fewer shards, but allocation on disk will be higher.
Best Regards, Strahil Nikolov

Nic, I didn’t see what version of gluster you were running? There was a leak that caused similar behavior for me in early 6.x versions, but it was fixed in 6.6 (I think, you’d have to find it in the bugzilla to be sure) and I haven’t seen this in a while. Not sure it’s exactly your symptoms (mine would pause after a while running, not immediately), but might be worth checking on. -Darrell
On Mar 28, 2020, at 12:26 PM, Nir Soffer <nsoffer@redhat.com> wrote:
On Sat, Mar 28, 2020 at 1:59 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
On March 28, 2020 11:03:54 AM GMT+02:00, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Sat, Mar 28, 2020 at 8:39 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
On March 28, 2020 3:21:45 AM GMT+02:00, Gianluca Cecchi < gianluca.cecchi@gmail.com> wrote:
[snip]
Actually it only happened with empty disk (thin provisioned) and sudden
high I/O during the initial phase of install of the OS; it didn't happened then during normal operaton (even with 600MB/s of throughput).
[snip]
Hi Gianluca,
Is it happening to machines with preallocated disks or on machines with thin disks ?
Best Regards, Strahil Nikolov
thin provisioned. But as I have tro create many VMs with 120Gb of disk size of which probably only a part during time will be allocated, it would be unfeasible to make them all preallocated. I learned that thin is not good for block based storage domains and heavy I/O, but I would hope that it is not the same with file based storage domains... Thanks, Gianluca
This is normal - gluster cannot allocate fast enough the needed shards (due to high IO), so the qemu pauses the VM until storage is available again .
I don't know glusterfs internals, but I think this is very unlikely.
For block storage thin provisioning in vdsm, vdsm is responsible for allocating more space, but vdsm is not in the datapath, it is monitoring the allocation and allocate more data when free space reaches a limit. It has no way to block I/O before more space is available. Gluster is in the datapath and can block I/O until it can process it.
Can you explain what is the source for this theory?
You can think about VDO (with deduplication ) as a PV for the Thin LVM and this way you can preallocate your VMs , while saving space (deduplication, zero-block elimination and even compression). Of course, VDO will reduce performance (unless you have battery-backed write cache and compression is disabled), but tbe benefits will be alot more.
Another approach is to increase the shard size - so gluster will create fewer shards, but allocation on disk will be higher.
Best Regards, Strahil Nikolov

On Sat, Mar 28, 2020 at 7:02 PM Darrell Budic <budic@onholyground.com> wrote:
Nic,
I didn’t see what version of gluster you were running? There was a leak that caused similar behavior for me in early 6.x versions, but it was fixed in 6.6 (I think, you’d have to find it in the bugzilla to be sure) and I havn’t seen this in a while. Not sure it’s exactly your symptoms (mine would pause after a while running, not immediately), but might be worth checking on.
-Darrell
My gluster components' versions are those provided by oVirt 4.3.9 ovirt-node-ng:

gluster-ansible-cluster-1.0.0-1.el7.noarch
gluster-ansible-features-1.0.5-3.el7.noarch
gluster-ansible-infra-1.0.4-3.el7.noarch
gluster-ansible-maintenance-1.0.1-1.el7.noarch
gluster-ansible-repositories-1.0.1-1.el7.noarch
gluster-ansible-roles-1.0.5-7.el7.noarch
glusterfs-6.8-1.el7.x86_64
glusterfs-api-6.8-1.el7.x86_64
glusterfs-cli-6.8-1.el7.x86_64
glusterfs-client-xlators-6.8-1.el7.x86_64
glusterfs-events-6.8-1.el7.x86_64
glusterfs-fuse-6.8-1.el7.x86_64
glusterfs-geo-replication-6.8-1.el7.x86_64
glusterfs-libs-6.8-1.el7.x86_64
glusterfs-rdma-6.8-1.el7.x86_64
glusterfs-server-6.8-1.el7.x86_64
libvirt-daemon-driver-storage-gluster-4.5.0-23.el7_7.6.x86_64
python2-gluster-6.8-1.el7.x86_64
vdsm-gluster-4.30.43-1.el7.x86_64

Do you have details on the bugzilla number?

Gianluca

On March 28, 2020 7:26:33 PM GMT+02:00, Nir Soffer <nsoffer@redhat.com> wrote:
On Sat, Mar 28, 2020 at 1:59 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
On March 28, 2020 11:03:54 AM GMT+02:00, Gianluca Cecchi
<gianluca.cecchi@gmail.com> wrote:
On Sat, Mar 28, 2020 at 8:39 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
On March 28, 2020 3:21:45 AM GMT+02:00, Gianluca Cecchi < gianluca.cecchi@gmail.com> wrote:
[snip]
Actually it only happened with empty disk (thin provisioned) and sudden
high I/O during the initial phase of install of the OS; it didn't happened then during normal operaton (even with 600MB/s of throughput).
[snip]
Hi Gianluca,
Is it happening to machines with preallocated disks or on machines with thin disks ?
Best Regards, Strahil Nikolov
thin provisioned. But as I have tro create many VMs with 120Gb of disk size of which probably only a part during time will be allocated, it would be unfeasible to make them all preallocated. I learned that thin is not good for block based storage domains and heavy I/O, but I would hope that it is not the same with file based storage domains... Thanks, Gianluca
This is normal - gluster cannot allocate fast enough the needed shards (due to high IO), so the qemu pauses the VM until storage is available again .
I don't know glusterfs internals, but I think this is very unlikely.
For block storage thin provisioning in vdsm, vdsm is responsible for allocating more space, but vdsm is not in the datapath, it is monitoring the allocation and allocate more data when free space reaches a limit. It has no way to block I/O before more space is available. Gluster is in the datapath and can block I/O until it can process it.
Can you explain what is the source for this theory?
You can think about VDO (with deduplication ) as a PV for the Thin LVM and this way you can preallocate your VMs , while saving space (deduplication, zero-block elimination and even compression). Of course, VDO will reduce performance (unless you have battery-backed write cache and compression is disabled), but tbe benefits will be alot more.
Another approach is to increase the shard size - so gluster will create fewer shards, but allocation on disk will be higher.
Best Regards, Strahil Nikolov
Hey Nir,

You are right ... This is just a theory based on my knowledge and it might not be valid. We need the libvirt logs to confirm or reject the theory, but I'm convinced that is the reason.

Yet, it's quite possible. Qemu tries to write to the qcow disk on gluster. Gluster is creating shards based on the offset, as it was not done initially (preallocated disks take their full size on gluster and all shards are created immediately). This takes time and has to be done on all bricks. As the shard size is too small (default 64MB), gluster has to create the next shard almost immediately, but if it can't do it as fast as qemu is filling its qcow2 disk - qemu will get an I/O error and we know what happens there. Later gluster manages to create the shard(s), and the VM is unpaused.

That's why the oVirt team made all gluster-based disks fully preallocated.

Best Regards,
Strahil Nikolov

On Sat, Mar 28, 2020 at 9:47 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
On March 28, 2020 7:26:33 PM GMT+02:00, Nir Soffer <nsoffer@redhat.com> wrote:
On Sat, Mar 28, 2020 at 1:59 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
On March 28, 2020 11:03:54 AM GMT+02:00, Gianluca Cecchi
<gianluca.cecchi@gmail.com> wrote:
On Sat, Mar 28, 2020 at 8:39 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
On March 28, 2020 3:21:45 AM GMT+02:00, Gianluca Cecchi < gianluca.cecchi@gmail.com> wrote:
[snip]
Actually it only happened with empty disk (thin provisioned) and sudden
high I/O during the initial phase of install of the OS; it didn't happened then during normal operaton (even with 600MB/s of throughput).
[snip]
Hi Gianluca,
Is it happening to machines with preallocated disks or on machines with thin disks ?
Best Regards, Strahil Nikolov
thin provisioned. But as I have tro create many VMs with 120Gb of disk size of which probably only a part during time will be allocated, it would be unfeasible to make them all preallocated. I learned that thin is not good for block based storage domains and heavy I/O, but I would hope that it is not the same with file based storage domains... Thanks, Gianluca
This is normal - gluster cannot allocate fast enough the needed shards (due to high IO), so the qemu pauses the VM until storage is available again .
I don't know glusterfs internals, but I think this is very unlikely.
For block storage thin provisioning in vdsm, vdsm is responsible for allocating more space, but vdsm is not in the datapath, it is monitoring the allocation and allocate more data when free space reaches a limit. It has no way to block I/O before more space is available. Gluster is in the datapath and can block I/O until it can process it.
Can you explain what is the source for this theory?
You can think about VDO (with deduplication ) as a PV for the Thin LVM and this way you can preallocate your VMs , while saving space (deduplication, zero-block elimination and even compression). Of course, VDO will reduce performance (unless you have battery-backed write cache and compression is disabled), but tbe benefits will be alot more.
Another approach is to increase the shard size - so gluster will create fewer shards, but allocation on disk will be higher.
Best Regards, Strahil Nikolov
Hey Nir, You are right ... This is just a theory based on my knowledge and it might not be valid. We nees the libvirt logs to confirm or reject the theory, but I'm convinced that is the reason.
Yet, it's quite possible. Qemu tries to write to the qcow disk on gluster. Gluster is creating shards based of the ofset, as it was not done initially (preallocated disk take the full size on gluster and all shards are created immediately). This takes time and requires to be done on all bricks. As the shard size is too small (default 64MB), gluster has to create the next shard almost immediately, but if it can't do it as fast as qemu is filling it's qcow2 disk
Gluster can block the I/O until it can write the data to a new shard. There is no reason to return an error unless a real error happened.

Also, the VMs mentioned here are using raw disks, not qcow2:

<disk device="disk" snapshot="no" type="file">
  <target bus="scsi" dev="sda"/>
  <source file="/rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/0a91c346-23a5-4432-8af7-ae0a28f9c208/2741af0b-27fe-4f7b-a8bc-8b34b9e31cb6">
    <seclabel model="dac" relabel="no" type="none"/>
  </source>
  <driver cache="none" error_policy="stop" io="threads" name="qemu" type="raw"/>
  <alias name="ua-0a91c346-23a5-4432-8af7-ae0a28f9c208"/>
  <address bus="0" controller="0" target="0" type="drive" unit="0"/>
  <boot order="1"/>
  <serial>0a91c346-23a5-4432-8af7-ae0a28f9c208</serial>
</disk>

Note type="raw".
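(The on-disk format can also be double-checked directly; a hedged sketch using the image path from the XML above:

qemu-img info /rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/0a91c346-23a5-4432-8af7-ae0a28f9c208/2741af0b-27fe-4f7b-a8bc-8b34b9e31cb6
# if image locking complains because the VM is running, adding -U (force share)
# is assumed to work on this qemu version
)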
- qemu will get an I/O error and we know what happens there. Later gluster manages to create the shard(s) , and the VM is unpaused.
That's why the oVirt team made all gluster-based disks to be fully preallocated.
Gluster disks are thin (raw-sparse) by default, just like on any other file based storage.

If this theory was correct, this would fail consistently on gluster:

1. Create a raw sparse image:

truncate -s 100g /rhev/data-center/mnt/glusterSD/server:_path/test

2. Fill the image quickly with data:

dd if=/dev/zero bs=1M | tr "\0" "U" | dd of=/rhev/data-center/mnt/glusterSD/server:_path/test bs=1M count=12800 iflag=fullblock oflag=direct conv=notrunc

According to your theory, gluster would fail to allocate shards fast enough and fail the I/O.

Nir

On Sat, Mar 28, 2020 at 8:26 PM Nir Soffer <nsoffer@redhat.com> wrote: [snip]
Hey Nir, You are right ... This is just a theory based on my knowledge and it might not be valid. We nees the libvirt logs to confirm or reject the theory, but I'm convinced that is the reason.
Yet, it's quite possible. Qemu tries to write to the qcow disk on gluster. Gluster is creating shards based of the ofset, as it was not done initially (preallocated disk take the full size on gluster and all shards are created immediately). This takes time and requires to be done on all bricks. As the shard size is too small (default 64MB), gluster has to create
the next shard almost immediately, but if it can't do it as fast as qemu is filling it's qcow2 disk
Gluster can block the I/O until it can write the data to a new shard. There is no reason to return an error unless a real error happened.
Also the VMs mentioned here are using raw disks, not qcow2:
[snip]
<target bus="scsi" dev="sda"/> <source
file="/rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/0a91c346-23a5-4432-8af7-ae0a28f9c208/2741af0b-27fe-4f7b-a8bc-8b34b9e31cb6"> <seclabel model="dac" relabel="no" type="none"/> </source> <driver cache="none" error_policy="stop" io="threads" name="qemu" type="raw"/>
[snip]
Note type="raw"
- qemu will get an I/O error and we know what happens there. Later gluster manages to create the shard(s) , and the VM is unpaused.
That's why the oVirt team made all gluster-based disks to be fully preallocated.
Yes, in my disk definition I used the proposed defaults. Possibly I only chose virtio-scsi (see the sda name): I don't remember whether, in 4.3.9 with Red Hat CoreOS as the OS type, virtio would be the default or not...
Gluster disks are thin (raw-sparse) by default just like any other file based storage.
If this theory was correct, this would fail consistently on gluster:
1. create raw sparse image
truncate -s 100g /rhev/data-center/mnt/glusterSD/server:_path/test
2. Fill image quickly with data
dd if=/dev/zero bs=1M | tr "\0" "U" | dd of=/rhev/data-center/mnt/glusterSD/server:_path/test bs=1M count=12800 iflag=fullblock oflag=direct conv=notrunc
According to your theory gluster will fail to allocate shards fast enough and fail the I/O.
Nir
I can also try the commands above, just to see the behavior, and report here as soon as I can connect to the system. Gianluca

On Sat, Mar 28, 2020 at 8:26 PM Nir Soffer <nsoffer@redhat.com> wrote:
Gluster disks are thin (raw-sparse) by default just like any other file based storage.
If this theory was correct, this would fail consistently on gluster:
1. create raw sparse image
truncate -s 100g /rhev/data-center/mnt/glusterSD/server:_path/test
2. Fill image quickly with data
dd if=/dev/zero bs=1M | tr "\0" "U" | dd of=/rhev/data-center/mnt/glusterSD/server:_path/test bs=1M count=12800 iflag=fullblock oflag=direct conv=notrunc
According to your theory gluster will fail to allocate shards fast enough and fail the I/O.
Nir
Coming back here as I got some other occurrences of the same error, and I'm going to try to enable some logging and see where the origin is. In the meantime I confirmed what was written by Nir above: no error on the gluster side with the suggested command:

[root@ovirt libvirt]# time dd if=/dev/zero bs=1M | tr "\0" "U" | dd of="/rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore/test" bs=1M count=12800 iflag=fullblock oflag=direct conv=notrunc
12800+0 records in
12800+0 records out
13421772800 bytes (13 GB) copied, 37.8056 s, 355 MB/s

real    0m37.812s
user    0m10.582s
sys     0m24.072s
[root@ovirt libvirt]#
[root@ovirt libvirt]# echo $?
0
[root@ovirt libvirt]#

During the copy a "vmstat 3" indeed showed about 350 MiB/s:

[root@ovirt ~]# vmstat 3
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd       free   buff   cache   si   so    bi     bo    in     cs us sy id wa st
 8  0      0 1274344448 329220 5752988    0    0     8    142     0      0  4  2 95  0  0
12  0      0 1274341504 329220 5753008    0    0  1930  30079 76558 186828  7  3 90  0  0
 3  0      0 1274340480 329220 5753028    0    0   917  15786 68796 178179  6  3 91  0  0
 2  0      0 1274336128 329220 5753048    0    0    61  14219 69035 172374  6  3 90  0  0
17  2      0 1274334976 329220 5753088    0    0  1931  14814 74888 178985  7  3 89  0  0
 7  0      0 1274317952 329220 5753112    0    0  1556 144318 72745 257632  7  3 89  0  0
11  1      0 1274315136 329220 5753112    0    0    91 350458 80783 410853  8  4 87  1  0
12  0      0 1274310272 329220 5753148    0    0   965 378040 83638 419936  6  4 89  1  0
10  1      0 1274308352 329220 5753188    0    0  1470 379597 82909 429217  7  4 88  1  0
 7  0      0 1274318592 329220 5753192    0    0   608 368145 85389 417066  9  5 86  1  0
 9  1      0 1274323584 329220 5753208    0    0   460 370369 81002 426601  7  5 88  1  0
12  2      0 1274323456 329220 5753232    0    0  2655 369143 79131 415323  7  4 88  1  0
10  2      0 1274306560 329220 5753252    0    0   625 367324 82076 419014  8  4 87  1  0
15  0      0 1274276480 329220 5753272    0    0    99 355281 81662 409003  8  4 87  1  0
12  2      0 1274259072 329220 5753292    0    0   855 358916 82942 416828  7  4 88  1  0
 5  0      0 1274257024 329220 5753316    0    0  1566 374178 81489 415783  7  4 88  1  0
 7  2      0 1274259968 329220 5753324    0    0   527 344149 82577 418192  7  4 88  1  0
 7  0      0 1274254080 329220 5753364    0    0  1596 341075 82839 414259  6  4 89  1  0
 7  1      0 1274267776 329220 5753404    0    0  1431  97078 73923 242802  6  3 90  0  0

I tried 4 times without any error and with consistent timing of about 36-37 seconds.
Going on to the next tests.
Gianluca
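If it helps, a small loop along the same lines makes it easy to repeat the test several times and stop at the first failure (same dd invocation as above, paths assumed from this thread):

#!/bin/bash
# Repeat the direct-I/O fill test a few times; stop at the first failure.
MNT=/rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore
for i in 1 2 3 4; do
    echo "=== run $i ==="
    rm -f "$MNT/test"
    truncate -s 100g "$MNT/test"
    if ! dd if=/dev/zero bs=1M | tr "\0" "U" | \
         dd of="$MNT/test" bs=1M count=12800 iflag=fullblock oflag=direct conv=notrunc; then
        echo "run $i failed" >&2
        break
    fi
done
rm -f "$MNT/test"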

OK. So I set log at least at INFO level on all subsystems and tried a redeploy of Openshift with 3 mater nodes and 7 worker nodes. One worker got the error and VM in paused mode Apr 7, 2020, 3:27:28 PM VM worker-6 has been paused due to unknown storage error. The vm has only one 100Gb virtual disk on gluster volume named vmstore Here below all the logs around time at the different layers. Let me know if you need another log file not yet considered. From what I see, the matching error is found in - rhev-data-center-mnt-glusterSD-ovirtst.mydomain.storage:_vmstore.log [2020-04-07 13:27:28.721262] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-vmstore-shard: Lookup on shard 523 failed. Base file gfid = d22530cf-2e50-4059-8924-0aafe38497b1 [No such file or directory] [2020-04-07 13:27:28.721432] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4435189: WRITE => -1 gfid=d22530cf-2e50-4059-8924-0aafe38497b1 fd=0x7f3c4c07ab38 (No such file or directory) and - gluster_bricks-vmstore-vmstore.log [2020-04-07 13:27:28.719391] W [MSGID: 113020] [posix-helpers.c:1051:posix_gfid_set] 0-vmstore-posix: setting GFID on /gluster_bricks/vmstore /vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists] [2020-04-07 13:27:28.719978] E [MSGID: 113020] [posix-entry-ops.c:517:posix_mknod] 0-vmstore-posix: setting gfid on /gluster_bricks/vmstore/v mstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists] Here below all the files checked. Any hint? Gianluca - qemu.log of the vm 2020-04-07T12:30:29.954084Z qemu-kvm: -drive file=/rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/04ec5a8d-4ee6-4661-a832-0c094968aa5b/716fdc98-2d0e-44c3-b5fe-c0187cdad751,format=raw,if=none,id=drive-ua-04ec5a8d-4ee6-4661-a832-0c094968aa5b,serial=04ec5a8d-4ee6-4661-a832-0c094968aa5b,werror=stop,rerror=stop,cache=none,aio=threads: 'serial' is deprecated, please use the corresponding option of '-device' instead Spice-Message: 14:30:29.963: setting TLS option 'CipherString' to 'kECDHE+FIPS:kDHE+FIPS:kRSA+FIPS:!eNULL:!aNULL' from /etc/pki/tls/spice.cnf configuration file 2020-04-07T12:30:29.976216Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: CPU 8 [socket-id: 2, core-id: 0, thread-id: 0], CPU 9 [socket-id: 2, core-id: 1, thread-id: 0], CPU 10 [socket-id: 2, core-id: 2, thread-id: 0], CPU 11 [socket-id: 2, core-id: 3, thread-id: 0], CPU 12 [socket-id: 3, core-id: 0, thread-id: 0], CPU 13 [socket-id: 3, core-id: 1, thread-id: 0], CPU 14 [socket-id: 3, core-id: 2, thread-id: 0], CPU 15 [socket-id: 3, core-id: 3, thread-id: 0], CPU 16 [socket-id: 4, core-id: 0, thread-id: 0], CPU 17 [socket-id: 4, core-id: 1, thread-id: 0], CPU 18 [socket-id: 4, core-id: 2, thread-id: 0], CPU 19 [socket-id: 4, core-id: 3, thread-id: 0], CPU 20 [socket-id: 5, core-id: 0, thread-id: 0], CPU 21 [socket-id: 5, core-id: 1, thread-id: 0], CPU 22 [socket-id: 5, core-id: 2, thread-id: 0], CPU 23 [socket-id: 5, core-id: 3, thread-id: 0], CPU 24 [socket-id: 6, core-id: 0, thread-id: 0], CPU 25 [socket-id: 6, core-id: 1, thread-id: 0], CPU 26 [socket-id: 6, core-id: 2, thread-id: 0], CPU 27 [socket-id: 6, core-id: 3, thread-id: 0], CPU 28 [socket-id: 7, core-id: 0, thread-id: 0], CPU 29 [socket-id: 7, core-id: 1, thread-id: 0], CPU 30 [socket-id: 7, core-id: 2, thread-id: 0], CPU 31 [socket-id: 7, core-id: 3, thread-id: 0], CPU 32 [socket-id: 8, core-id: 0, thread-id: 0], CPU 33 [socket-id: 8, core-id: 1, thread-id: 0], CPU 34 [socket-id: 8, 
core-id: 2, thread-id: 0], CPU 35 [socket-id: 8, core-id: 3, thread-id: 0], CPU 36 [socket-id: 9, core-id: 0, thread-id: 0], CPU 37 [socket-id: 9, core-id: 1, thread-id: 0], CPU 38 [socket-id: 9, core-id: 2, thread-id: 0], CPU 39 [socket-id: 9, core-id: 3, thread-id: 0], CPU 40 [socket-id: 10, core-id: 0, thread-id: 0], CPU 41 [socket-id: 10, core-id: 1, thread-id: 0], CPU 42 [socket-id: 10, core-id: 2, thread-id: 0], CPU 43 [socket-id: 10, core-id: 3, thread-id: 0], CPU 44 [socket-id: 11, core-id: 0, thread-id: 0], CPU 45 [socket-id: 11, core-id: 1, thread-id: 0], CPU 46 [socket-id: 11, core-id: 2, thread-id: 0], CPU 47 [socket-id: 11, core-id: 3, thread-id: 0], CPU 48 [socket-id: 12, core-id: 0, thread-id: 0], CPU 49 [socket-id: 12, core-id: 1, thread-id: 0], CPU 50 [socket-id: 12, core-id: 2, thread-id: 0], CPU 51 [socket-id: 12, core-id: 3, thread-id: 0], CPU 52 [socket-id: 13, core-id: 0, thread-id: 0], CPU 53 [socket-id: 13, core-id: 1, thread-id: 0], CPU 54 [socket-id: 13, core-id: 2, thread-id: 0], CPU 55 [socket-id: 13, core-id: 3, thread-id: 0], CPU 56 [socket-id: 14, core-id: 0, thread-id: 0], CPU 57 [socket-id: 14, core-id: 1, thread-id: 0], CPU 58 [socket-id: 14, core-id: 2, thread-id: 0], CPU 59 [socket-id: 14, core-id: 3, thread-id: 0], CPU 60 [socket-id: 15, core-id: 0, thread-id: 0], CPU 61 [socket-id: 15, core-id: 1, thread-id: 0], CPU 62 [socket-id: 15, core-id: 2, thread-id: 0], CPU 63 [socket-id: 15, core-id: 3, thread-id: 0] 2020-04-07T12:30:29.976247Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future main_channel_link: add main channel client main_channel_client_handle_pong: net test: latency 52.540000 ms, bitrate 6113597 bps (5.830380 Mbps) LOW BANDWIDTH inputs_connect: inputs channel client create red_qxl_set_cursor_peer: inputs_channel_detach_tablet: red_channel_client_disconnect: rcc=0x5626c717a9c0 (channel=0x5626c616a880 type=3 id=0) red_channel_client_disconnect: rcc=0x5626c8351dd0 (channel=0x5626c7bd38d0 type=4 id=0) red_channel_client_disconnect: rcc=0x5626c834c5b0 (channel=0x5626c616a230 type=2 id=0) red_channel_client_disconnect: rcc=0x5626c719e9f0 (channel=0x5626c616a160 type=1 id=0) main_channel_client_on_disconnect: rcc=0x5626c719e9f0 red_client_destroy: destroy client 0x5626c6195570 with #channels=4 red_qxl_disconnect_cursor_peer: red_qxl_disconnect_display_peer: inputs_channel_detach_tablet: inputs_channel_detach_tablet: main_channel_link: add main channel client main_channel_client_handle_pong: net test: latency 126.308000 ms, bitrate 9060022 bps (8.640310 Mbps) LOW BANDWIDTH red_qxl_set_cursor_peer: inputs_connect: inputs channel client create red_channel_client_disconnect: rcc=0x5626c717a9f0 (channel=0x5626c616a160 type=1 id=0) main_channel_client_on_disconnect: rcc=0x5626c717a9f0 red_client_destroy: destroy client 0x5626c7ab6870 with #channels=4 red_channel_client_disconnect: rcc=0x5626c71529c0 (channel=0x5626c616a880 type=3 id=0) red_qxl_disconnect_cursor_peer: red_channel_client_disconnect: rcc=0x5626c7e59dd0 (channel=0x5626c7bd38d0 type=4 id=0) red_qxl_disconnect_display_peer: red_channel_client_disconnect: rcc=0x5626c7e5e5b0 (channel=0x5626c616a230 type=2 id=0) [root@ovirt qemu]# - /var/log/glusterfs/cli.log [2020-04-07 13:25:39.305818] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:39.306091] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 
[2020-04-07 13:25:39.306531] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:39.306758] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:39.307081] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:25:54.365843] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:25:54.448475] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:25:54.448600] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:25:54.449348] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:25:54.485288] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:25:54.566723] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:25:54.566792] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:25:54.567275] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:25:54.567305] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:25:54.603618] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:25:54.685718] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:25:54.685821] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:25:54.686537] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.686994] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.687289] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.687774] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.688061] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.688368] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:09.748782] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:09.831541] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:09.831610] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:09.832596] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:09.871269] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:09.955626] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:09.955726] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:09.956239] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:26:09.956284] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:09.993929] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:10.073634] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:10.073680] I [MSGID: 101190] 
[event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:10.074454] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.074932] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.075241] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.075677] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.075920] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.076233] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:22.933923] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:23.016382] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:23.016486] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:23.020067] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:25.149347] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:25.229667] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:25.229716] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:25.230491] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:25.266853] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:25.348774] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:25.348815] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:25.349253] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:26:25.349284] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:25.385951] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:25.468521] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:25.468571] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:25.469286] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.469703] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.469965] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.470381] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.470595] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.470877] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:40.546233] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:40.629567] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:40.629667] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:40.630529] I [input.c:31:cli_batch] 0-: 
Exiting with: 0 [2020-04-07 13:26:40.666769] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:40.746861] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:40.746934] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:40.747436] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:26:40.747474] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:40.789486] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:40.881750] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:40.881871] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:40.882615] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.883142] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.883532] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.884020] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.884283] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.884615] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:55.966643] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:56.049639] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:56.049735] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:56.050543] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:56.096892] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:56.182275] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:56.182335] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:56.182883] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:26:56.182918] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:56.221093] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:56.303302] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:56.303337] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:56.303996] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:56.304432] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:56.304706] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:56.305135] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:56.305359] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:56.305645] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 
13:27:11.378699] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:11.457630] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:11.457697] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:11.458355] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:11.493643] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:11.571358] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:11.571360] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:11.572137] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:27:11.572167] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:11.608432] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:11.689353] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:11.689437] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:11.690117] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.690569] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.690910] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.691343] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.691577] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.691853] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:23.058796] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:23.141280] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:23.141326] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:23.144418] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:26.767415] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:26.846945] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:26.847030] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:26.847818] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:26.883554] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:26.962804] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:26.962905] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:26.963380] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:27:26.963415] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:27.000197] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:27.077981] I [MSGID: 
101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:27.078029] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:27.078729] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.079174] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.079448] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.079873] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.080105] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.080429] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:42.152150] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:42.233109] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:42.233149] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:42.233818] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:42.269022] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:42.347310] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:42.347347] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:42.347771] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:27:42.347798] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:42.383260] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:42.460650] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:42.460741] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:42.461348] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.461780] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.462085] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.462493] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.462711] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.462990] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:57.534379] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:57.614068] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:57.614115] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:57.614835] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:57.650436] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:57.729556] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:57.729652] I [MSGID: 101190] 
[event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:57.730164] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:27:57.730204] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:57.768743] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:57.848062] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:57.848153] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:57.848830] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.849277] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.849554] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.849983] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.850217] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.850522] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:28:12.934596] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:28:13.015126] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:28:13.015167] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:28:13.015948] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:28:13.054894] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 - glusterd.log [2020-04-07 13:25:22.884457] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore The message "I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req" repeated 7 times between [2020-04-07 13:24:06.750320] and [2020-04-07 13:25:54.449004] The message "I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 39 times between [2020-04-07 13:24:06.989378] and [2020-04-07 13:25:54.688025] [2020-04-07 13:26:09.832058] I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req [2020-04-07 13:26:10.074287] I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req [2020-04-07 13:26:23.017801] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big [2020-04-07 13:26:23.018753] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data [2020-04-07 13:26:23.019190] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine [2020-04-07 13:26:23.019628] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore [2020-04-07 13:27:23.143376] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data [2020-04-07 13:27:23.143718] I [MSGID: 106499] 
[glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine [2020-04-07 13:27:23.142636] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big [2020-04-07 13:27:23.144070] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore The message "I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req" repeated 7 times between [2020-04-07 13:26:09.832058] and [2020-04-07 13:27:57.614518] The message "I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 39 times between [2020-04-07 13:26:10.074287] and [2020-04-07 13:27:57.850194] [2020-04-07 13:28:13.015558] I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req [2020-04-07 13:28:13.256713] I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req [2020-04-07 13:28:23.264138] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big [2020-04-07 13:28:23.264884] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data [2020-04-07 13:28:23.265316] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine [2020-04-07 13:28:23.265644] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore [2020-04-07 13:29:23.389341] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data [2020-04-07 13:29:23.389702] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine [2020-04-07 13:29:23.388761] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big [2020-04-07 13:29:23.390058] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore The message "I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req" repeated 6 times between [2020-04-07 13:28:13.015558] and [2020-04-07 13:29:45.291552] The message "I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 34 times between [2020-04-07 13:28:13.256713] and [2020-04-07 13:29:45.524677] [2020-04-07 13:30:00.664512] I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req [2020-04-07 13:30:00.899102] I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req [2020-04-07 13:30:23.505881] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big - events.log [2020-04-07 15:21:30,978] INFO [utils - 66:log_event] - {'event': 'CLIENT_CONNECT', 'message': {'client_identifier': '10.50.50.11:49135', 'server_identifier': '10.50.50.11:49153', 'client_uid': 
'CTX_ID:815a9519-d88e-402d-9473-19015cca1db8-GRAPH_ID:0-PID:131714-HOST:ovirt.mydomain.local-PC_NAME:engine-client-0-RECON_NO:-0', 'brick_path': '/gluster_bricks/engine/engine', 'subdir_mount': '(null)'}, 'nodeid': 'ea83cff6-4cc6-4399-a096-5892476721f0', 'ts': 1586265690} [2020-04-07 15:21:30,980] INFO [utils - 66:log_event] - {'event': 'CLIENT_DISCONNECT', 'message': {'client_identifier': '10.50.50.11:49135', 'server_identifier': '10.50.50.11:49153', 'client_uid': 'CTX_ID:815a9519-d88e-402d-9473-19015cca1db8-GRAPH_ID:0-PID:131714-HOST:ovirt.mydomain.local-PC_NAME:engine-client-0-RECON_NO:-0', 'brick_path': '/gluster_bricks/engine/engine'}, 'nodeid': 'ea83cff6-4cc6-4399-a096-5892476721f0', 'ts': 1586265690} [2020-04-07 15:36:33,598] INFO [utils - 66:log_event] - {'event': 'CLIENT_CONNECT', 'message': {'client_identifier': '10.50.50.11:49144', 'server_identifier': '10.50.50.11:49155', 'client_uid': 'CTX_ID:e1d09577-50d6-4679-b0e8-0fe554f9fd49-GRAPH_ID:0-PID:139160-HOST:ovirt.mydomain.local-PC_NAME:big-client-0-RECON_NO:-0', 'brick_path': '/gluster_bricks/big/big', 'subdir_mount': '(null)'}, 'nodeid': 'ea83cff6-4cc6-4399-a096-5892476721f0', 'ts': 1586266593} [2020-04-07 15:36:33,599] INFO [utils - 66:log_event] - {'event': 'CLIENT_DISCONNECT', 'message': {'client_identifier': '10.50.50.11:49144', 'server_identifier': '10.50.50.11:49155', 'client_uid': 'CTX_ID:e1d09577-50d6-4679-b0e8-0fe554f9fd49-GRAPH_ID:0-PID:139160-HOST:ovirt.mydomain.local-PC_NAME:big-client-0-RECON_NO:-0', 'brick_path': '/gluster_bricks/big/big'}, 'nodeid': 'ea83cff6-4cc6-4399-a096-5892476721f0', 'ts': 1586266593} - rhev-data-center-mnt-glusterSD-ovirtst.mydomain.storage:_vmstore.log [2020-04-07 13:24:16.070237] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4197133: WRITE => -1 gfid=aa68cd63-f3d3-47bd-a177-c7f48bef3c59 fd=0x7f3c5019f348 (Invalid argument) [2020-04-07 13:24:16.074521] I [MSGID: 133022] [shard.c:3674:shard_delete_shards] 0-vmstore-shard: Deleted shards of gfid=aa68cd63-f3d3-47bd-a177-c7f48bef3c59 from backend [2020-04-07 13:27:28.721262] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-vmstore-shard: Lookup on shard 523 failed. 
Base file gfid = d22530cf-2e50-4059-8924-0aafe38497b1 [No such file or directory] [2020-04-07 13:27:28.721432] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4435189: WRITE => -1 gfid=d22530cf-2e50-4059-8924-0aafe38497b1 fd=0x7f3c4c07ab38 (No such file or directory) [2020-04-07 13:29:16.232630] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-vmstore-client-0: remote operation failed [Invalid argument] [2020-04-07 13:29:16.232675] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4496167: WRITE => -1 gfid=54bafbb5-edd6-42a0-aaba-a5295031be17 fd=0x7f3c48125838 (Invalid argument) [2020-04-07 13:29:16.236352] I [MSGID: 133022] [shard.c:3674:shard_delete_shards] 0-vmstore-shard: Deleted shards of gfid=54bafbb5-edd6-42a0-aaba-a5295031be17 from backend [2020-04-07 13:34:16.378326] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-vmstore-client-0: remote operation failed [Invalid argument] [2020-04-07 13:34:16.378461] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4698415: WRITE => -1 gfid=6beaa642-e73e-40c7-8adc-64275528b44f fd=0x7f3c48138cc8 (Invalid argument) [2020-04-07 13:34:16.382824] I [MSGID: 133022] [shard.c:3674:shard_delete_shards] 0-vmstore-shard: Deleted shards of gfid=6beaa642-e73e-40c7-8adc-64275528b44f from backend [2020-04-07 13:39:16.546919] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-vmstore-client-0: remote operation failed [Invalid argument] [2020-04-07 13:39:16.547022] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4884227: WRITE => -1 gfid=f32843b7-4131-4536-abe2-f6969aa455ae fd=0x7f3c4813fb58 (Invalid argument) [2020-04-07 13:39:16.550630] I [MSGID: 133022] [shard.c:3674:shard_delete_shards] 0-vmstore-shard: Deleted shards of gfid=f32843b7-4131-4536-abe2-f6969aa455ae from backend - gluster_bricks-vmstore-vmstore.log [2020-04-07 13:21:28.128381] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-vmstore-server: disconnecting connection from CTX_ID:12a5f3 7d-94d3-4b22-aaf6-4ac5bb7567d6-GRAPH_ID:0-PID:131656-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0 [2020-04-07 13:21:28.128565] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-vmstore-server: Shutting down connection CTX_ID:12a5f37d-94 d3-4b22-aaf6-4ac5bb7567d6-GRAPH_ID:0-PID:131656-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0 [2020-04-07 13:24:16.069408] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid arg ument] [2020-04-07 13:24:16.070018] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-vmstore-server: 3528796: WRITEV 18 (aa68cd63- f3d3-47bd-a177-c7f48bef3c59), client: CTX_ID:20665547-59fd-4b4a-94fb-7fdfdca9f5a9-GRAPH_ID:0-PID:17102-HOST:ovirt.mydomain.local-PC_NAME:vmsto re-client-0-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument] [2020-04-07 13:27:28.719391] W [MSGID: 113020] [posix-helpers.c:1051:posix_gfid_set] 0-vmstore-posix: setting GFID on /gluster_bricks/vmstore /vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists] [2020-04-07 13:27:28.719978] E [MSGID: 113020] [posix-entry-ops.c:517:posix_mknod] 0-vmstore-posix: setting gfid on /gluster_bricks/vmstore/v mstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists] [2020-04-07 13:29:16.232263] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument] [2020-04-07 13:29:16.232578] E [MSGID: 115067] 
[server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-vmstore-server: 3777485: WRITEV 12 (54bafbb5-edd6-42a0-aaba-a5295031be17), client: CTX_ID:20665547-59fd-4b4a-94fb-7fdfdca9f5a9-GRAPH_ID:0-PID:17102-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument] [2020-04-07 13:34:16.377622] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument] - messages Apr 7 15:25:26 ovirt gluster_bricks-big-big[39632]: [2020-04-07 13:25:26.789943] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-big-posix: write failed: offset 0, [Invalid argument] Apr 7 15:25:26 ovirt gluster_bricks-big-big[39632]: [2020-04-07 13:25:26.790284] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev _cbk] 0-big-server: 20535: WRITEV 1 (e965b2e9-500b-4520-a17e-d00b93a2e1bf), client: CTX_ID:2530ad70-1cec-43a2-bdf8-8c5a085fecb4-GRAPH_ID:0-PI D:41137-HOST:ovirt.mydomain.local-PC_NAME:big-client-0-RECON_NO:-0, error-xlator: big-posix [Invalid argument] Apr 7 15:26:16 ovirt gluster_bricks-engine-engine[10652]: [2020-04-07 13:26:16.183229] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_wr itev] 0-engine-posix: write failed: offset 0, [Invalid argument] Apr 7 15:26:16 ovirt gluster_bricks-engine-engine[10652]: [2020-04-07 13:26:16.183318] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_ writev_cbk] 0-engine-server: 359871: WRITEV 5 (cb0ea391-ed2f-4318-91d3-b8e18f6bdfb7), client: CTX_ID:08f296e3-beeb-4658-8e79-d9add812c3d7-GRA PH_ID:0-PID:11400-HOST:ovirt.mydomain.local-PC_NAME:engine-client-0-RECON_NO:-0, error-xlator: engine-posix [Invalid argument] Apr 7 15:27:28 ovirt gluster_bricks-vmstore-vmstore[10678]: [2020-04-07 13:27:28.719391] W [MSGID: 113020] [posix-helpers.c:1051:posix_gfid_set] 0-vmstore-posix: setting GFID on /gluster_bricks/vmstore/vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists] Apr 7 15:27:28 ovirt gluster_bricks-vmstore-vmstore[10678]: [2020-04-07 13:27:28.719978] E [MSGID: 113020] [posix-entry-ops.c:517:posix_mknod] 0-vmstore-posix: setting gfid on /gluster_bricks/vmstore/vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists] Apr 7 15:29:16 ovirt gluster_bricks-data-data[10609]: [2020-04-07 13:29:16.232149] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-data-posix: write failed: offset 0, [Invalid argument]
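One more thing that may be worth doing directly on the brick the next time this happens: check whether the shard named in the error exists and what gfid it carries. A rough sketch, with the shard path copied from the brick log above (standard gluster debugging commands, nothing oVirt-specific):

# Shard path copied from the posix_gfid_set / posix_mknod errors above.
SHARD=/gluster_bricks/vmstore/vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523

# Does the shard exist on the brick, and with which link count?
# (a healthy brick file normally has 2 links: the file plus its .glusterfs gfid link)
stat "$SHARD"

# Dump the extended attributes; trusted.gfid should be set, and it should match the
# .glusterfs/<first-2-hex-chars>/<next-2>/<gfid> entry for this shard.
getfattr -d -m . -e hex "$SHARD"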

On Tue, Apr 7, 2020 at 3:59 PM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
OK. So I set the log level to at least INFO on all subsystems and tried a redeploy of OpenShift with 3 master nodes and 7 worker nodes. One worker got the error and its VM went into paused mode:
Apr 7, 2020, 3:27:28 PM VM worker-6 has been paused due to unknown storage error.
The VM has only one 100 GB virtual disk on the gluster volume named vmstore.
Here below are all the logs around that time at the different layers. Let me know if you need another log file not yet considered.
From what I see, the matching error is found in
- rhev-data-center-mnt-glusterSD-ovirtst.mydomain.storage:_vmstore.log
[2020-04-07 13:27:28.721262] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-vmstore-shard: Lookup on shard 523 failed. Base file gfid = d22530cf-2e50-4059-8924-0aafe38497b1 [No such file or directory] [2020-04-07 13:27:28.721432] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4435189: WRITE => -1 gfid=d22530cf-2e50-4059-8924-0aafe38497b1 fd=0x7f3c4c07ab38 (No such file or directory)
and
- gluster_bricks-vmstore-vmstore.log
[2020-04-07 13:27:28.719391] W [MSGID: 113020] [posix-helpers.c:1051:posix_gfid_set] 0-vmstore-posix: setting GFID on /gluster_bricks/vmstore /vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists] [2020-04-07 13:27:28.719978] E [MSGID: 113020] [posix-entry-ops.c:517:posix_mknod] 0-vmstore-posix: setting gfid on /gluster_bricks/vmstore/v mstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists]
Here below all the files checked. Any hint?
Gianluca
During a sort of stress installation test of OCP worker nodes, the same thing happened two other times with different VMs, always with thin provisioned disks and with the same lookup error in the log.

Errors in rhev-data-center-mnt-glusterSD-ovirtst.mydomain.storage:_vmstore.log:

[2020-04-07 14:38:55.505093] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-vmstore-shard: Lookup on shard 732 failed. Base file gfid = d99c2f26-ec99-43a9-8a5b-887c38220e1a [No such file or directory]
[2020-04-07 14:38:55.505455] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 7219169: WRITE => -1 gfid=d99c2f26ec99-43a9-8a5b-887c38220e1a fd=0x7f3c4c09aa48 (No such file or directory)
[2020-04-07 14:38:55.505461] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 7219175: WRITE => -1 gfid=d99c2f26-ec99-43a9-8a5b-887c38220e1a fd=0x7f3c4c09aa48 (No such file or directory)
[2020-04-07 14:38:55.505432] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-vmstore-shard: Lookup on shard 732 failed. Base file gfid = d99c2f26-ec99-43a9-8a5b-887c38220e1a [No such file or directory]
[2020-04-07 14:39:18.292135] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-vmstore-client-0: remote operation failed [Invalid argument]

and

[2020-04-07 15:31:02.224194] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-vmstore-shard: Lookup on shard 1363 failed. Base file gfid = 0bee6cf6-da1d-4a37-8afb-3459815986f5 [No such file or directory]
[2020-04-07 15:31:02.224393] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 9662285: WRITE => -1 gfid=0bee6cf6-da1d-4a37-8afb-3459815986f5 fd=0x7f3c4c024178 (No such file or directory)

Gianluca
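As a side note, assuming the volume uses the default 64MB shard size (features.shard is on and no explicit shard-block-size shows up in the volume options posted earlier in this thread), the failing shard numbers map to offsets well inside the disk rather than at its start; a quick sketch of the arithmetic:

# Rough sketch: map a shard index to its byte offset, assuming the default
# 64MB shard-block-size (not explicitly set in the volume options shown earlier).
shard_offset() {
    local shard=$1
    local shard_size=$((64 * 1024 * 1024))
    echo "shard $shard starts at byte $((shard * shard_size)) (~$((shard * 64 / 1024)) GiB)"
}

shard_offset 523     # first occurrence, ~32 GiB into the 100 GB disk
shard_offset 732     # ~45 GiB
shard_offset 1363    # ~85 GiB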

On April 7, 2020 4:59:03 PM GMT+03:00, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
OK. So I set the log level to at least INFO on all subsystems and tried a redeploy of OpenShift with 3 master nodes and 7 worker nodes. One worker got the error and its VM went into paused mode:
Apr 7, 2020, 3:27:28 PM VM worker-6 has been paused due to unknown storage error.
The VM has only one 100 GB virtual disk on the gluster volume named vmstore.
Here below are all the logs around that time at the different layers. Let me know if you need another log file not yet considered.
From what I see, the matching error is found in
- rhev-data-center-mnt-glusterSD-ovirtst.mydomain.storage:_vmstore.log
[2020-04-07 13:27:28.721262] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-vmstore-shard: Lookup on shard 523 failed. Base file gfid = d22530cf-2e50-4059-8924-0aafe38497b1 [No such file or directory] [2020-04-07 13:27:28.721432] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4435189: WRITE => -1 gfid=d22530cf-2e50-4059-8924-0aafe38497b1 fd=0x7f3c4c07ab38 (No such file or directory)
and
- gluster_bricks-vmstore-vmstore.log
[2020-04-07 13:27:28.719391] W [MSGID: 113020] [posix-helpers.c:1051:posix_gfid_set] 0-vmstore-posix: setting GFID on /gluster_bricks/vmstore /vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists] [2020-04-07 13:27:28.719978] E [MSGID: 113020] [posix-entry-ops.c:517:posix_mknod] 0-vmstore-posix: setting gfid on /gluster_bricks/vmstore/v mstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists]
Here below all the files checked. Any hint?
Gianluca
- qemu.log of the vm
[snip]
- /var/log/glusterfs/cli.log
[2020-04-07 13:25:39.305818] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:39.306091] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:39.306531] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:39.306758] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:39.307081] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:25:54.365843] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:25:54.448475] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:25:54.448600] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:25:54.449348] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:25:54.485288] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:25:54.566723] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:25:54.566792] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:25:54.567275] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:25:54.567305] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:25:54.603618] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:25:54.685718] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:25:54.685821] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:25:54.686537] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.686994] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.687289] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.687774] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.688061] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.688368] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:09.748782] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:09.831541] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:09.831610] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:09.832596] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:09.871269] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:09.955626] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:09.955726] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:09.956239] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:26:09.956284] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:09.993929] I [cli.c:844:main] 0-cli: Started running 
/usr/sbin/gluster with version 6.8 [2020-04-07 13:26:10.073634] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:10.073680] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:10.074454] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.074932] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.075241] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.075677] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.075920] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.076233] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:22.933923] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:23.016382] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:23.016486] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:23.020067] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:25.149347] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:25.229667] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:25.229716] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:25.230491] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:25.266853] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:25.348774] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:25.348815] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:25.349253] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:26:25.349284] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:25.385951] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:25.468521] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:25.468571] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:25.469286] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.469703] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.469965] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.470381] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.470595] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.470877] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:40.546233] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:40.629567] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started 
thread with index 0 [2020-04-07 13:26:40.629667] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:40.630529] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:40.666769] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:40.746861] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:40.746934] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:40.747436] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:26:40.747474] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:40.789486] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:40.881750] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:40.881871] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:40.882615] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.883142] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.883532] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.884020] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.884283] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.884615] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:55.966643] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:56.049639] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:56.049735] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:56.050543] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:56.096892] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:56.182275] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:56.182335] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:56.182883] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:26:56.182918] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:56.221093] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:56.303302] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:56.303337] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:56.303996] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:56.304432] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:56.304706] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:56.305135] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get 
vol: 0 [2020-04-07 13:26:56.305359] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:56.305645] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:11.378699] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:11.457630] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:11.457697] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:11.458355] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:11.493643] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:11.571358] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:11.571360] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:11.572137] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:27:11.572167] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:11.608432] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:11.689353] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:11.689437] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:11.690117] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.690569] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.690910] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.691343] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.691577] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.691853] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:23.058796] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:23.141280] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:23.141326] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:23.144418] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:26.767415] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:26.846945] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:26.847030] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:26.847818] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:26.883554] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:26.962804] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:26.962905] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:26.963380] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 
13:27:26.963415] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:27.000197] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:27.077981] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:27.078029] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:27.078729] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.079174] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.079448] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.079873] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.080105] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.080429] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:42.152150] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:42.233109] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:42.233149] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:42.233818] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:42.269022] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:42.347310] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:42.347347] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:42.347771] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:27:42.347798] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:42.383260] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:42.460650] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:42.460741] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:42.461348] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.461780] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.462085] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.462493] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.462711] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.462990] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:57.534379] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:57.614068] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:57.614115] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:57.614835] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:57.650436] I [cli.c:844:main] 0-cli: Started running 
/usr/sbin/gluster with version 6.8 [2020-04-07 13:27:57.729556] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:57.729652] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:57.730164] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:27:57.730204] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:57.768743] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:57.848062] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:57.848153] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:57.848830] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.849277] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.849554] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.849983] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.850217] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.850522] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:28:12.934596] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:28:13.015126] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:28:13.015167] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:28:13.015948] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:28:13.054894] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8
- glusterd.log
[2020-04-07 13:25:22.884457] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore The message "I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req" repeated 7 times between [2020-04-07 13:24:06.750320] and [2020-04-07 13:25:54.449004] The message "I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 39 times between [2020-04-07 13:24:06.989378] and [2020-04-07 13:25:54.688025] [2020-04-07 13:26:09.832058] I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req [2020-04-07 13:26:10.074287] I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req [2020-04-07 13:26:23.017801] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big [2020-04-07 13:26:23.018753] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data [2020-04-07 13:26:23.019190] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine [2020-04-07 13:26:23.019628] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore [2020-04-07 13:27:23.143376] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data [2020-04-07 13:27:23.143718] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine [2020-04-07 13:27:23.142636] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big [2020-04-07 13:27:23.144070] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore The message "I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req" repeated 7 times between [2020-04-07 13:26:09.832058] and [2020-04-07 13:27:57.614518] The message "I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 39 times between [2020-04-07 13:26:10.074287] and [2020-04-07 13:27:57.850194] [2020-04-07 13:28:13.015558] I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req [2020-04-07 13:28:13.256713] I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req [2020-04-07 13:28:23.264138] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big [2020-04-07 13:28:23.264884] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data [2020-04-07 13:28:23.265316] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine [2020-04-07 13:28:23.265644] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore [2020-04-07 
13:29:23.389341] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data [2020-04-07 13:29:23.389702] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine [2020-04-07 13:29:23.388761] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big [2020-04-07 13:29:23.390058] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore The message "I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req" repeated 6 times between [2020-04-07 13:28:13.015558] and [2020-04-07 13:29:45.291552] The message "I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 34 times between [2020-04-07 13:28:13.256713] and [2020-04-07 13:29:45.524677] [2020-04-07 13:30:00.664512] I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req [2020-04-07 13:30:00.899102] I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req [2020-04-07 13:30:23.505881] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big
- events.log
[2020-04-07 15:21:30,978] INFO [utils - 66:log_event] - {'event': 'CLIENT_CONNECT', 'message': {'client_identifier': '10.50.50.11:49135', 'server_identifier': '10.50.50.11:49153', 'client_uid': 'CTX_ID:815a9519-d88e-402d-9473-19015cca1db8-GRAPH_ID:0-PID:131714-HOST:ovirt.mydomain.local-PC_NAME:engine-client-0-RECON_NO:-0', 'brick_path': '/gluster_bricks/engine/engine', 'subdir_mount': '(null)'}, 'nodeid': 'ea83cff6-4cc6-4399-a096-5892476721f0', 'ts': 1586265690} [2020-04-07 15:21:30,980] INFO [utils - 66:log_event] - {'event': 'CLIENT_DISCONNECT', 'message': {'client_identifier': '10.50.50.11:49135', 'server_identifier': '10.50.50.11:49153', 'client_uid': 'CTX_ID:815a9519-d88e-402d-9473-19015cca1db8-GRAPH_ID:0-PID:131714-HOST:ovirt.mydomain.local-PC_NAME:engine-client-0-RECON_NO:-0', 'brick_path': '/gluster_bricks/engine/engine'}, 'nodeid': 'ea83cff6-4cc6-4399-a096-5892476721f0', 'ts': 1586265690} [2020-04-07 15:36:33,598] INFO [utils - 66:log_event] - {'event': 'CLIENT_CONNECT', 'message': {'client_identifier': '10.50.50.11:49144', 'server_identifier': '10.50.50.11:49155', 'client_uid': 'CTX_ID:e1d09577-50d6-4679-b0e8-0fe554f9fd49-GRAPH_ID:0-PID:139160-HOST:ovirt.mydomain.local-PC_NAME:big-client-0-RECON_NO:-0', 'brick_path': '/gluster_bricks/big/big', 'subdir_mount': '(null)'}, 'nodeid': 'ea83cff6-4cc6-4399-a096-5892476721f0', 'ts': 1586266593} [2020-04-07 15:36:33,599] INFO [utils - 66:log_event] - {'event': 'CLIENT_DISCONNECT', 'message': {'client_identifier': '10.50.50.11:49144', 'server_identifier': '10.50.50.11:49155', 'client_uid': 'CTX_ID:e1d09577-50d6-4679-b0e8-0fe554f9fd49-GRAPH_ID:0-PID:139160-HOST:ovirt.mydomain.local-PC_NAME:big-client-0-RECON_NO:-0', 'brick_path': '/gluster_bricks/big/big'}, 'nodeid': 'ea83cff6-4cc6-4399-a096-5892476721f0', 'ts': 1586266593}
- rhev-data-center-mnt-glusterSD-ovirtst.mydomain.storage:_vmstore.log
[2020-04-07 13:24:16.070237] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4197133: WRITE => -1 gfid=aa68cd63-f3d3-47bd-a177-c7f48bef3c59 fd=0x7f3c5019f348 (Invalid argument) [2020-04-07 13:24:16.074521] I [MSGID: 133022] [shard.c:3674:shard_delete_shards] 0-vmstore-shard: Deleted shards of gfid=aa68cd63-f3d3-47bd-a177-c7f48bef3c59 from backend [2020-04-07 13:27:28.721262] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-vmstore-shard: Lookup on shard 523 failed. Base file gfid = d22530cf-2e50-4059-8924-0aafe38497b1 [No such file or directory] [2020-04-07 13:27:28.721432] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4435189: WRITE => -1 gfid=d22530cf-2e50-4059-8924-0aafe38497b1 fd=0x7f3c4c07ab38 (No such file or directory) [2020-04-07 13:29:16.232630] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-vmstore-client-0: remote operation failed [Invalid argument] [2020-04-07 13:29:16.232675] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4496167: WRITE => -1 gfid=54bafbb5-edd6-42a0-aaba-a5295031be17 fd=0x7f3c48125838 (Invalid argument) [2020-04-07 13:29:16.236352] I [MSGID: 133022] [shard.c:3674:shard_delete_shards] 0-vmstore-shard: Deleted shards of gfid=54bafbb5-edd6-42a0-aaba-a5295031be17 from backend [2020-04-07 13:34:16.378326] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-vmstore-client-0: remote operation failed [Invalid argument] [2020-04-07 13:34:16.378461] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4698415: WRITE => -1 gfid=6beaa642-e73e-40c7-8adc-64275528b44f fd=0x7f3c48138cc8 (Invalid argument) [2020-04-07 13:34:16.382824] I [MSGID: 133022] [shard.c:3674:shard_delete_shards] 0-vmstore-shard: Deleted shards of gfid=6beaa642-e73e-40c7-8adc-64275528b44f from backend [2020-04-07 13:39:16.546919] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-vmstore-client-0: remote operation failed [Invalid argument] [2020-04-07 13:39:16.547022] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4884227: WRITE => -1 gfid=f32843b7-4131-4536-abe2-f6969aa455ae fd=0x7f3c4813fb58 (Invalid argument) [2020-04-07 13:39:16.550630] I [MSGID: 133022] [shard.c:3674:shard_delete_shards] 0-vmstore-shard: Deleted shards of gfid=f32843b7-4131-4536-abe2-f6969aa455ae from backend
- gluster_bricks-vmstore-vmstore.log
[2020-04-07 13:21:28.128381] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-vmstore-server: disconnecting connection from CTX_ID:12a5f3 7d-94d3-4b22-aaf6-4ac5bb7567d6-GRAPH_ID:0-PID:131656-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0 [2020-04-07 13:21:28.128565] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-vmstore-server: Shutting down connection CTX_ID:12a5f37d-94 d3-4b22-aaf6-4ac5bb7567d6-GRAPH_ID:0-PID:131656-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0 [2020-04-07 13:24:16.069408] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid arg ument] [2020-04-07 13:24:16.070018] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-vmstore-server: 3528796: WRITEV 18 (aa68cd63- f3d3-47bd-a177-c7f48bef3c59), client: CTX_ID:20665547-59fd-4b4a-94fb-7fdfdca9f5a9-GRAPH_ID:0-PID:17102-HOST:ovirt.mydomain.local-PC_NAME:vmsto re-client-0-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument] [2020-04-07 13:27:28.719391] W [MSGID: 113020] [posix-helpers.c:1051:posix_gfid_set] 0-vmstore-posix: setting GFID on /gluster_bricks/vmstore /vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists] [2020-04-07 13:27:28.719978] E [MSGID: 113020] [posix-entry-ops.c:517:posix_mknod] 0-vmstore-posix: setting gfid on /gluster_bricks/vmstore/v mstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists] [2020-04-07 13:29:16.232263] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument] [2020-04-07 13:29:16.232578] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-vmstore-server: 3777485: WRITEV 12 (54bafbb5-edd6-42a0-aaba-a5295031be17), client: CTX_ID:20665547-59fd-4b4a-94fb-7fdfdca9f5a9-GRAPH_ID:0-PID:17102-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument] [2020-04-07 13:34:16.377622] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]
- messages
Apr 7 15:25:26 ovirt gluster_bricks-big-big[39632]: [2020-04-07 13:25:26.789943] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-big-posix: write failed: offset 0, [Invalid argument] Apr 7 15:25:26 ovirt gluster_bricks-big-big[39632]: [2020-04-07 13:25:26.790284] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev _cbk] 0-big-server: 20535: WRITEV 1 (e965b2e9-500b-4520-a17e-d00b93a2e1bf), client: CTX_ID:2530ad70-1cec-43a2-bdf8-8c5a085fecb4-GRAPH_ID:0-PI D:41137-HOST:ovirt.mydomain.local-PC_NAME:big-client-0-RECON_NO:-0, error-xlator: big-posix [Invalid argument] Apr 7 15:26:16 ovirt gluster_bricks-engine-engine[10652]: [2020-04-07 13:26:16.183229] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_wr itev] 0-engine-posix: write failed: offset 0, [Invalid argument] Apr 7 15:26:16 ovirt gluster_bricks-engine-engine[10652]: [2020-04-07 13:26:16.183318] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_ writev_cbk] 0-engine-server: 359871: WRITEV 5 (cb0ea391-ed2f-4318-91d3-b8e18f6bdfb7), client: CTX_ID:08f296e3-beeb-4658-8e79-d9add812c3d7-GRA PH_ID:0-PID:11400-HOST:ovirt.mydomain.local-PC_NAME:engine-client-0-RECON_NO:-0, error-xlator: engine-posix [Invalid argument] Apr 7 15:27:28 ovirt gluster_bricks-vmstore-vmstore[10678]: [2020-04-07 13:27:28.719391] W [MSGID: 113020] [posix-helpers.c:1051:posix_gfid_set] 0-vmstore-posix: setting GFID on /gluster_bricks/vmstore/vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists] Apr 7 15:27:28 ovirt gluster_bricks-vmstore-vmstore[10678]: [2020-04-07 13:27:28.719978] E [MSGID: 113020] [posix-entry-ops.c:517:posix_mknod] 0-vmstore-posix: setting gfid on /gluster_bricks/vmstore/vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists] Apr 7 15:29:16 ovirt gluster_bricks-data-data[10609]: [2020-04-07 13:29:16.232149] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-data-posix: write failed: offset 0, [Invalid argument]
Hi Gianluca,
The positive thing is that you can reproduce the issue.
I would ask you to check your gluster version and if there are any updates - update the cluster.
Also check the gluster's op-version, as this limits some of the features.
If there are none - enable trace logs in gluster ( https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.1/ht... ), start volume profiling, reproduce the issue and then reduce the log level (it's generating a lot of logs) and stop the profiling.
Once that info is collected, some of the Gluster members can check the situation.
Best Regards,
Strahil Nikolov
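For reference, a minimal sketch of the commands that workflow would involve (volume name vmstore taken from this thread; TRACE is extremely verbose, so revert it as soon as the pause is reproduced):

# raise log verbosity on the client and brick side (very verbose)
gluster volume set vmstore diagnostics.client-log-level TRACE
gluster volume set vmstore diagnostics.brick-log-level TRACE
# start profiling, reproduce the VM pause, then collect and stop
gluster volume profile vmstore start
gluster volume profile vmstore info > /tmp/vmstore-profile.txt
gluster volume profile vmstore stop
# restore the default log level
gluster volume set vmstore diagnostics.client-log-level INFO
gluster volume set vmstore diagnostics.brick-log-level INFO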

On Tue, Apr 7, 2020 at 8:16 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
Hi Gianluca,
The positive thing is that you can reproduce the issue.
I would ask you to check your gluster version and if there are any updates - update the cluster.
I'd prefer to stick to the oVirt release version of Gluster if possible.
Also check the gluster's op-version, as this limits some of the features.
What do you mean by this?
If there are none - enable trace logs in gluster ( https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.1/ht... ), start volume profiling, reproduce the issue and then reduce the log level (it's generating a lot of logs) and stop the profiling.
Once that info is collected, some of the Gluster members can check the situation.
Best Regards, Strahil Nikolov
Ok. I think that the INFO level set on the different layers outlined a problem somehow related to sharding.
Related to this, I found no official docs on the Gluster web site after the 3.7 version... where is it? The only information I found was in the Red Hat Gluster Storage 3.5 Administration Guide, but I would expect something more upstream...
In particular in my case, where I have only one host and the gluster volumes are single brick based, do you think I can try to disable sharding and verify whether using new disks with it disabled and oVirt thin provisioned disks lets the problem go away?
Also, I found some information about sharding block size. Apparently the only supported size on Red Hat Gluster Storage is 512MB, but oVirt sets it to 64MB....? I also found a bugzilla about passing from 128MB to 64MB in oVirt 4.1.5: https://bugzilla.redhat.com/show_bug.cgi?id=1469436
Now I see that by default, and so also in my environment, I have:
features.shard on
features.shard-block-size 64MB
features.shard-lru-limit 16384
features.shard-deletion-rate 100
It is not clear how to cross-check all the values above in the docs....
Can I safely set features.shard off?
In the Red Hat Gluster Storage admin guide it is considered only for replicated gluster volumes (and so also distributed-replicated...). But in my case, with a single host and a single brick per volume, I think it doesn't give any particular benefit, does it?
I found this interesting post about sharding and the connection between image files and shard files: http://opensource-storage.blogspot.com/2016/07/de-mystifying-gluster-shards....
Gianluca

On April 8, 2020 2:43:01 PM GMT+03:00, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Tue, Apr 7, 2020 at 8:16 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
Hi Gianluca,
The positive thing is that you can reproduce the issue.
I would ask you to check your gluster version and if there are any updates - update the cluster.
I'd prefer to stick on oVirt release version of Gluster if possible
This is what I meant, but if you are using Gluster v6.0 - you should update to the latest version.
Also check the gluster's op-version, as this limits some of the features.
What do you mean by this?
Check this one: https://docs.gluster.org/en/latest/Upgrade-Guide/op_version/
If there are none - enable trace logs in gluster ( https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.1/ht... ), start volume profiling, reproduce the issue and then reduce the log level (it's generating a lot of logs) and stop the profiling.
Once that info is collected, some of the Gluster members can check the situation.
Best Regards, Strahil Nikolov
Ok. I think that the INFO level set on the different layers outlined a problem somehow related to sharding.
Related to this, I found no official docs on Gluster web site after 3.7 version... where is it? Only information I found was in Red Hat Gluster Storage 3.5 Administration Guide, but I would expect something more upstream...
Red Hat versioning is different. So far RH documentation for Gluster has never failed me.
In particular in my case where I have only one host and the gluster volumes are single brick based, do you think I can try to disable sharding and verify if using new disks with it disabled and oVirt thin provisioned disks let the problem go away?
Also, I found some information about sharding block size. Apparently the only supported size on Red Hat Gluster Storage is 512MB, but oVirt sets it to 64MB....? I also found a bugzilla about passing from 128MB to 64MB in oVirt 4.1.5: https://bugzilla.redhat.com/show_bug.cgi?id=1469436
Now I see that by default and so also in my environment I have:
features.shard on
features.shard-block-size 64MB
features.shard-lru-limit 16384
features.shard-deletion-rate 100
Not clear how to cross-check for all the values above in docs....
Can I safely set features.shard off
In the Red Hat Gluster Storage admin guide it is considered only for replicated gluster volumes (and so also distributed-replicated...). But in my case, with a single host and a single brick per volume, I think it doesn't give any particular benefit, does it?
NEVER DISABLE SHARDING ON A VOLUME!!! There is a very good reply from Amar Tumballi in the gluster-users mailing list about that. The good thing is that you can do storage migration.
The only benefits of sharding are:
1. When using a distributed-replicated volume, shards will spread on different bricks and thus partially increase read performance
2. When a heal is going on, the shard is locked - so there will be no disruption for the VM
In both cases you cannot benefit from that.
Shard size is 64 MB and this is the default from Gluster, not from oVirt. Maybe there is a reason behind that large shard size, but I have no clue.
In your case, you should consider using VDO, as most of the VMs have almost the same system files, which will lead to data reduction at a small cost of write speed.
Another option is to create a new volume and disable sharding altogether.
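As a rough sketch only (volume name vmstore2 and brick directory are hypothetical), creating a fresh volume with sharding left off could look something like the following - and note this is only reasonable because the volume is brand new and empty; sharding must never be disabled on a volume that already contains sharded images:

gluster volume create vmstore2 transport tcp ovirtst.mydomain.storage:/gluster_bricks/vmstore2/vmstore2
gluster volume set vmstore2 group virt            # apply the usual virt tunables
gluster volume set vmstore2 storage.owner-uid 36  # vdsm
gluster volume set vmstore2 storage.owner-gid 36  # kvm
gluster volume set vmstore2 features.shard off    # override sharding before any data is written
gluster volume start vmstore2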
I found this interesting post about sharding and connection between image files and shard files: http://opensource-storage.blogspot.com/2016/07/de-mystifying-gluster-shards....
Gianluca
I have one question - you have a single brick in your volume. Why do you use oVirt instead of plain KVM ? Best Regards, Strahil Nikolov

On Wed, Apr 8, 2020 at 6:00 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
On April 8, 2020 2:43:01 PM GMT+03:00, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Tue, Apr 7, 2020 at 8:16 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
Hi Gianluca,
The positive thing is that you can reproduce the issue.
I would ask you to check your gluster version and if there are any updates - update the cluster.
I'd prefer to stick on oVirt release version of Gluster if possible
This is what I meant, but if you are using Gluster v6.0 - you should update to the latest version.
I deployed oVirt 4.3.9 using the ovirt-node-ng iso image and the cockpit based GUI wizard for single-node HCI. On the ovirt-node-ng system the version was the latest available in the stable 4.3 ovirt repos: glusterfs-6.8-1.el7.x86_64. The same holds if I go on a plain CentOS ovirt host of another environment, fully updated: the version for the gluster related packages is 6.8-1.el7 and no update is proposed. So I think I'm ok with that.
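For completeness, a quick way to double-check the installed version and any pending updates (a sketch, nothing oVirt-specific):

rpm -qa 'glusterfs*' | sort
yum check-update 'glusterfs*'
gluster --version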
Also check the gluster's op-version, as this limits some of the
features.
What do you mean by this?
Check this one:
https://docs.gluster.org/en/latest/Upgrade-Guide/op_version/
Ah, ok. In my case I have a single host environment, so it is not fundamental, and no upgrade has taken place yet from the original 4.3.9 deploy. I have:
[root@ovirt images]# gluster volume get all cluster.op-version
Option                                   Value
------                                   -----
cluster.op-version                       60000
[root@ovirt images]#
[root@ovirt images]# gluster volume get all cluster.max-op-version
Option                                   Value
------                                   -----
cluster.max-op-version                   60000
[root@ovirt images]#
Ok. I think that the INFO level set on the different layers outlined a problem somehow related to sharding.
Related to this, I found no official docs on Gluster web site after 3.7 version... where is it? Only information I found was in Red Hat Gluster Storage 3.5 Administration Guide, but I would expect something more upstream...
Red Hat versioning is different. So far RH documentation for Gluster has never failed me.
I also hold Red Hat documentation in great consideration. But in the end the product is upstream, and I don't find anything in the upstream documentation related to sharding (only for the old 3.7 days...). Why? Or where is the doc, if it is my fault that I cannot find it? Otherwise this makes me wary about the stability of the feature....
In particular in my case where I have only one host and the gluster
volumes are single brick based, do you think I can try to disable sharding and verify if using new disks with it disabled and oVirt thin provisioned disks let the problem go away?
Also, I found some information about sharding block size. Apparently the only supported size on Red Hat Gluster Storage is 512MB, but oVirt sets it to 64MB....? I also found a bugzilla about passing from 128MB to 64MB in oVirt 4.1.5: https://bugzilla.redhat.com/show_bug.cgi?id=1469436
Now I see that by default and so also in my environment I have:
features.shard on
features.shard-block-size 64MB
features.shard-lru-limit 16384
features.shard-deletion-rate 100
NEVER DISABLE SHARDING ON A VOLUME!!! There is a very good reply from Amar Tumballi in gluster users mailing list about that. The good thing is that you can do storage migration.
Remember: I have only one node and only one brick per volume, so I think it doesn't give me any real benefit. If I go and add another node I'll take care to change it, but it is not an immediate thing. It is a test lab.
The only benefits of sharding are: 1. When using distributed-replicated volume, shards will spread on different bricks and thus partially increase read performance
Not needed in my environment
2. When a heal is going on, the shard is locked - so there will be no
disruption for the VM
No heal in my environment because it is distributed (even if actually I have one brick per volume)
In both cases you cannot benefit of that.
Shard size is 64 MB and this is default from Gluster , not from oVirt. Maybe there is a reason behind that large shard size, but I have no clue.
Where do you see that the default is 64MB? Gluster what? Upstream 6.8, or at build time by the team that creates the ovirt-4.3-centos-gluster6 repos, or what?
Because if you read the Red Hat Gluster Storage documentation (that you referred to as the "master") it says that 512MB is the default and apparently the only value allowed in their product....
As I already linked: https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/ht...
"
features.shard-block-size
Specifies the maximum size of the file pieces when sharding is enabled. The supported value for this parameter is 512MB.
"
And also in the table here: https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/ht...
you can see that both the Allowed and Default values are 512MB.
I don't know if the Red Hat Gluster Storage 3.5 product is based on the 6.8 version or not. How to check?
And, missing the upstream documentation about sharding values, I cannot compare to that....
But as the RHV 4.3 product guide directly refers to the Red Hat Gluster Storage docs.... I imagine what is provided by oVirt is in the same version line....
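One way to cross-check what this particular Gluster build actually ships as defaults, without relying on the downstream docs, might be (a sketch; the output format can vary between versions):

gluster volume get vmstore all | grep shard
gluster volume set help | grep -A 3 shard-block-size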
In your case, you should consider using VDO, as most of the VMs have almost the same system files, which will lead to data reduction at a small cost of write speed.
OK, thanks
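Regarding the VDO suggestion, a very rough sketch of what a VDO-backed brick could look like (device /dev/sdX and mount point are hypothetical, untested here):

vdo create --name=vdo_vmstore --device=/dev/sdX
mkfs.xfs -K /dev/mapper/vdo_vmstore   # -K skips discards at mkfs time, as usually recommended on VDO
mkdir -p /gluster_bricks/vmstore_vdo
mount /dev/mapper/vdo_vmstore /gluster_bricks/vmstore_vdo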
Another option is to create a new volume and disable sharding at all.
In fact I created another 4Tb volume and disabled sharding on it. Then today I deployed the OpenShift OCP installation again, using 3 master nodes + 7 worker nodes, each with a 120Gb disk (thin provisioned), and the installation completed without any unknown storage error. Before, I tried the same installation 3 times, and every time I got at least 4-5 nodes that went to paused during the installation phase, with the ignition process from the bootstrap node.... This + the gfid errors I provided seem to confirm my idea.
I have one question - you have a single brick in your volume. Why do you use oVirt instead of plain KVM ?
In the future it can grow to more than one node. I also wanted to use / test a supported one-node configuration for oVirt. And also, many of the ansible scripts used take advantage of the ovirt modules: https://docs.ansible.com/ansible/latest/modules/ovirt_module.html
Best Regards, Strahil Nikolov
Cheers, Gianluca

On Tue, Apr 7, 2020 at 7:36 PM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
OK. So I set the log level to at least INFO on all subsystems and tried a redeploy of OpenShift with 3 master nodes and 7 worker nodes. One worker got the error and its VM went into paused mode
Apr 7, 2020, 3:27:28 PM VM worker-6 has been paused due to unknown storage error.
The VM has only one 100Gb virtual disk on the gluster volume named vmstore
Here below are all the logs around that time frame, at the different layers. Let me know if you need another log file not yet considered.
From what I see, the matching error is found in
- rhev-data-center-mnt-glusterSD-ovirtst.mydomain.storage:_vmstore.log
[2020-04-07 13:27:28.721262] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-vmstore-shard: Lookup on shard 523 failed. Base file gfid = d22530cf-2e50-4059-8924-0aafe38497b1 [No such file or directory] [2020-04-07 13:27:28.721432] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4435189: WRITE => -1 gfid=d22530cf-2e50-4059-8924-0aafe38497b1 fd=0x7f3c4c07ab38 (No such file or directory)
This ^^, right here is the reason the VM paused. Are you using a plain distribute volume here? Can you share some of the log messages that occur right above these errors? Also, can you check if the file $VMSTORE_BRICKPATH/.glusterfs/d2/25/d22530cf-2e50-4059-8924-0aafe38497b1 exists on the brick?
-Krutika
and
- gluster_bricks-vmstore-vmstore.log
[2020-04-07 13:27:28.719391] W [MSGID: 113020] [posix-helpers.c:1051:posix_gfid_set] 0-vmstore-posix: setting GFID on /gluster_bricks/vmstore /vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists] [2020-04-07 13:27:28.719978] E [MSGID: 113020] [posix-entry-ops.c:517:posix_mknod] 0-vmstore-posix: setting gfid on /gluster_bricks/vmstore/v mstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists]
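As for checking whether the base file's gfid link is still present on the brick (a sketch, using the brick path from this setup in place of $VMSTORE_BRICKPATH):

ls -l /gluster_bricks/vmstore/vmstore/.glusterfs/d2/25/d22530cf-2e50-4059-8924-0aafe38497b1
# and the shard that the failed lookup refers to
ls -l /gluster_bricks/vmstore/vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523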
Here below all the files checked. Any hint?
Gianluca
- qemu.log of the vm
2020-04-07T12:30:29.954084Z qemu-kvm: -drive file=/rhev/data-center/mnt/glusterSD/ovirtst.mydomain.storage:_vmstore/81b97244-4b69-4d49-84c4-c822387adc6a/images/04ec5a8d-4ee6-4661-a832-0c094968aa5b/716fdc98-2d0e-44c3-b5fe-c0187cdad751,format=raw,if=none,id=drive-ua-04ec5a8d-4ee6-4661-a832-0c094968aa5b,serial=04ec5a8d-4ee6-4661-a832-0c094968aa5b,werror=stop,rerror=stop,cache=none,aio=threads: 'serial' is deprecated, please use the corresponding option of '-device' instead Spice-Message: 14:30:29.963: setting TLS option 'CipherString' to 'kECDHE+FIPS:kDHE+FIPS:kRSA+FIPS:!eNULL:!aNULL' from /etc/pki/tls/spice.cnf configuration file 2020-04-07T12:30:29.976216Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: CPU 8 [socket-id: 2, core-id: 0, thread-id: 0], CPU 9 [socket-id: 2, core-id: 1, thread-id: 0], CPU 10 [socket-id: 2, core-id: 2, thread-id: 0], CPU 11 [socket-id: 2, core-id: 3, thread-id: 0], CPU 12 [socket-id: 3, core-id: 0, thread-id: 0], CPU 13 [socket-id: 3, core-id: 1, thread-id: 0], CPU 14 [socket-id: 3, core-id: 2, thread-id: 0], CPU 15 [socket-id: 3, core-id: 3, thread-id: 0], CPU 16 [socket-id: 4, core-id: 0, thread-id: 0], CPU 17 [socket-id: 4, core-id: 1, thread-id: 0], CPU 18 [socket-id: 4, core-id: 2, thread-id: 0], CPU 19 [socket-id: 4, core-id: 3, thread-id: 0], CPU 20 [socket-id: 5, core-id: 0, thread-id: 0], CPU 21 [socket-id: 5, core-id: 1, thread-id: 0], CPU 22 [socket-id: 5, core-id: 2, thread-id: 0], CPU 23 [socket-id: 5, core-id: 3, thread-id: 0], CPU 24 [socket-id: 6, core-id: 0, thread-id: 0], CPU 25 [socket-id: 6, core-id: 1, thread-id: 0], CPU 26 [socket-id: 6, core-id: 2, thread-id: 0], CPU 27 [socket-id: 6, core-id: 3, thread-id: 0], CPU 28 [socket-id: 7, core-id: 0, thread-id: 0], CPU 29 [socket-id: 7, core-id: 1, thread-id: 0], CPU 30 [socket-id: 7, core-id: 2, thread-id: 0], CPU 31 [socket-id: 7, core-id: 3, thread-id: 0], CPU 32 [socket-id: 8, core-id: 0, thread-id: 0], CPU 33 [socket-id: 8, core-id: 1, thread-id: 0], CPU 34 [socket-id: 8, core-id: 2, thread-id: 0], CPU 35 [socket-id: 8, core-id: 3, thread-id: 0], CPU 36 [socket-id: 9, core-id: 0, thread-id: 0], CPU 37 [socket-id: 9, core-id: 1, thread-id: 0], CPU 38 [socket-id: 9, core-id: 2, thread-id: 0], CPU 39 [socket-id: 9, core-id: 3, thread-id: 0], CPU 40 [socket-id: 10, core-id: 0, thread-id: 0], CPU 41 [socket-id: 10, core-id: 1, thread-id: 0], CPU 42 [socket-id: 10, core-id: 2, thread-id: 0], CPU 43 [socket-id: 10, core-id: 3, thread-id: 0], CPU 44 [socket-id: 11, core-id: 0, thread-id: 0], CPU 45 [socket-id: 11, core-id: 1, thread-id: 0], CPU 46 [socket-id: 11, core-id: 2, thread-id: 0], CPU 47 [socket-id: 11, core-id: 3, thread-id: 0], CPU 48 [socket-id: 12, core-id: 0, thread-id: 0], CPU 49 [socket-id: 12, core-id: 1, thread-id: 0], CPU 50 [socket-id: 12, core-id: 2, thread-id: 0], CPU 51 [socket-id: 12, core-id: 3, thread-id: 0], CPU 52 [socket-id: 13, core-id: 0, thread-id: 0], CPU 53 [socket-id: 13, core-id: 1, thread-id: 0], CPU 54 [socket-id: 13, core-id: 2, thread-id: 0], CPU 55 [socket-id: 13, core-id: 3, thread-id: 0], CPU 56 [socket-id: 14, core-id: 0, thread-id: 0], CPU 57 [socket-id: 14, core-id: 1, thread-id: 0], CPU 58 [socket-id: 14, core-id: 2, thread-id: 0], CPU 59 [socket-id: 14, core-id: 3, thread-id: 0], CPU 60 [socket-id: 15, core-id: 0, thread-id: 0], CPU 61 [socket-id: 15, core-id: 1, thread-id: 0], CPU 62 [socket-id: 15, core-id: 2, thread-id: 0], CPU 63 [socket-id: 15, core-id: 3, thread-id: 0] 2020-04-07T12:30:29.976247Z qemu-kvm: warning: All CPU(s) up to 
maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future main_channel_link: add main channel client main_channel_client_handle_pong: net test: latency 52.540000 ms, bitrate 6113597 bps (5.830380 Mbps) LOW BANDWIDTH inputs_connect: inputs channel client create red_qxl_set_cursor_peer: inputs_channel_detach_tablet: red_channel_client_disconnect: rcc=0x5626c717a9c0 (channel=0x5626c616a880 type=3 id=0) red_channel_client_disconnect: rcc=0x5626c8351dd0 (channel=0x5626c7bd38d0 type=4 id=0) red_channel_client_disconnect: rcc=0x5626c834c5b0 (channel=0x5626c616a230 type=2 id=0) red_channel_client_disconnect: rcc=0x5626c719e9f0 (channel=0x5626c616a160 type=1 id=0) main_channel_client_on_disconnect: rcc=0x5626c719e9f0 red_client_destroy: destroy client 0x5626c6195570 with #channels=4 red_qxl_disconnect_cursor_peer: red_qxl_disconnect_display_peer: inputs_channel_detach_tablet: inputs_channel_detach_tablet: main_channel_link: add main channel client main_channel_client_handle_pong: net test: latency 126.308000 ms, bitrate 9060022 bps (8.640310 Mbps) LOW BANDWIDTH red_qxl_set_cursor_peer: inputs_connect: inputs channel client create red_channel_client_disconnect: rcc=0x5626c717a9f0 (channel=0x5626c616a160 type=1 id=0) main_channel_client_on_disconnect: rcc=0x5626c717a9f0 red_client_destroy: destroy client 0x5626c7ab6870 with #channels=4 red_channel_client_disconnect: rcc=0x5626c71529c0 (channel=0x5626c616a880 type=3 id=0) red_qxl_disconnect_cursor_peer: red_channel_client_disconnect: rcc=0x5626c7e59dd0 (channel=0x5626c7bd38d0 type=4 id=0) red_qxl_disconnect_display_peer: red_channel_client_disconnect: rcc=0x5626c7e5e5b0 (channel=0x5626c616a230 type=2 id=0) [root@ovirt qemu]#
- /var/log/glusterfs/cli.log
[2020-04-07 13:25:39.305818] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:39.306091] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:39.306531] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:39.306758] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:39.307081] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:25:54.365843] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:25:54.448475] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:25:54.448600] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:25:54.449348] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:25:54.485288] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:25:54.566723] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:25:54.566792] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:25:54.567275] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:25:54.567305] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:25:54.603618] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:25:54.685718] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:25:54.685821] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:25:54.686537] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.686994] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.687289] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.687774] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.688061] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:25:54.688368] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:09.748782] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:09.831541] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:09.831610] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:09.832596] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:09.871269] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:09.955626] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:09.955726] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:09.956239] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:26:09.956284] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:09.993929] I [cli.c:844:main] 0-cli: Started running 
/usr/sbin/gluster with version 6.8 [2020-04-07 13:26:10.073634] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:10.073680] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:10.074454] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.074932] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.075241] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.075677] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.075920] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:10.076233] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:22.933923] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:23.016382] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:23.016486] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:23.020067] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:25.149347] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:25.229667] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:25.229716] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:25.230491] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:25.266853] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:25.348774] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:25.348815] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:25.349253] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:26:25.349284] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:25.385951] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:25.468521] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:25.468571] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:25.469286] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.469703] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.469965] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.470381] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.470595] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:25.470877] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:40.546233] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:40.629567] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started 
thread with index 0 [2020-04-07 13:26:40.629667] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:40.630529] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:40.666769] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:40.746861] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:40.746934] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:40.747436] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:26:40.747474] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:40.789486] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:40.881750] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:40.881871] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:40.882615] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.883142] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.883532] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.884020] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.884283] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:40.884615] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:55.966643] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:56.049639] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:56.049735] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:56.050543] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:56.096892] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:56.182275] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:56.182335] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:56.182883] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:26:56.182918] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:26:56.221093] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:26:56.303302] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:26:56.303337] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:26:56.303996] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:56.304432] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:56.304706] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:56.305135] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get 
vol: 0 [2020-04-07 13:26:56.305359] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:26:56.305645] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:11.378699] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:11.457630] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:11.457697] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:11.458355] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:11.493643] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:11.571358] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:11.571360] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:11.572137] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:27:11.572167] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:11.608432] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:11.689353] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:11.689437] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:11.690117] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.690569] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.690910] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.691343] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.691577] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:11.691853] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:23.058796] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:23.141280] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:23.141326] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:23.144418] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:26.767415] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:26.846945] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:26.847030] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:26.847818] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:26.883554] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:26.962804] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:26.962905] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:26.963380] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 
13:27:26.963415] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:27.000197] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:27.077981] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:27.078029] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:27.078729] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.079174] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.079448] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.079873] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.080105] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:27.080429] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:42.152150] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:42.233109] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:42.233149] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:42.233818] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:42.269022] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:42.347310] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:42.347347] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:42.347771] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:27:42.347798] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:42.383260] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:42.460650] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:42.460741] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:42.461348] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.461780] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.462085] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.462493] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.462711] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:42.462990] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:57.534379] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:57.614068] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:57.614115] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:57.614835] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:57.650436] I [cli.c:844:main] 0-cli: Started running 
/usr/sbin/gluster with version 6.8 [2020-04-07 13:27:57.729556] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:57.729652] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:57.730164] I [cli-rpc-ops.c:1297:gf_cli3_1_uuid_get_cbk] 0-cli: Received resp to uuid get [2020-04-07 13:27:57.730204] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:27:57.768743] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:27:57.848062] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:27:57.848153] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:27:57.848830] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.849277] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.849554] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.849983] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.850217] I [cli-rpc-ops.c:867:gf_cli_get_volume_cbk] 0-cli: Received resp to get vol: 0 [2020-04-07 13:27:57.850522] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:28:12.934596] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8 [2020-04-07 13:28:13.015126] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 0 [2020-04-07 13:28:13.015167] I [MSGID: 101190] [event-epoll.c:688:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2020-04-07 13:28:13.015948] I [input.c:31:cli_batch] 0-: Exiting with: 0 [2020-04-07 13:28:13.054894] I [cli.c:844:main] 0-cli: Started running /usr/sbin/gluster with version 6.8
- glusterd.log
[2020-04-07 13:25:22.884457] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore The message "I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req" repeated 7 times between [2020-04-07 13:24:06.750320] and [2020-04-07 13:25:54.449004] The message "I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 39 times between [2020-04-07 13:24:06.989378] and [2020-04-07 13:25:54.688025] [2020-04-07 13:26:09.832058] I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req [2020-04-07 13:26:10.074287] I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req [2020-04-07 13:26:23.017801] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big [2020-04-07 13:26:23.018753] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data [2020-04-07 13:26:23.019190] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine [2020-04-07 13:26:23.019628] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore [2020-04-07 13:27:23.143376] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data [2020-04-07 13:27:23.143718] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine [2020-04-07 13:27:23.142636] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big [2020-04-07 13:27:23.144070] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore The message "I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req" repeated 7 times between [2020-04-07 13:26:09.832058] and [2020-04-07 13:27:57.614518] The message "I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 39 times between [2020-04-07 13:26:10.074287] and [2020-04-07 13:27:57.850194] [2020-04-07 13:28:13.015558] I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req [2020-04-07 13:28:13.256713] I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req [2020-04-07 13:28:23.264138] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big [2020-04-07 13:28:23.264884] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data [2020-04-07 13:28:23.265316] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine [2020-04-07 13:28:23.265644] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore [2020-04-07 
13:29:23.389341] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data [2020-04-07 13:29:23.389702] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine [2020-04-07 13:29:23.388761] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big [2020-04-07 13:29:23.390058] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume vmstore The message "I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req" repeated 6 times between [2020-04-07 13:28:13.015558] and [2020-04-07 13:29:45.291552] The message "I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 34 times between [2020-04-07 13:28:13.256713] and [2020-04-07 13:29:45.524677] [2020-04-07 13:30:00.664512] I [MSGID: 106487] [glusterd-handler.c:1516:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req [2020-04-07 13:30:00.899102] I [MSGID: 106488] [glusterd-handler.c:1577:__glusterd_handle_cli_get_volume] 0-management: Received get vol req [2020-04-07 13:30:23.505881] I [MSGID: 106499] [glusterd-handler.c:4429:__glusterd_handle_status_volume] 0-management: Received status volume req for volume big
- events.log
[2020-04-07 15:21:30,978] INFO [utils - 66:log_event] - {'event': 'CLIENT_CONNECT', 'message': {'client_identifier': '10.50.50.11:49135', 'server_identifier': '10.50.50.11:49153', 'client_uid': 'CTX_ID:815a9519-d88e-402d-9473-19015cca1db8-GRAPH_ID:0-PID:131714-HOST:ovirt.mydomain.local-PC_NAME:engine-client-0-RECON_NO:-0', 'brick_path': '/gluster_bricks/engine/engine', 'subdir_mount': '(null)'}, 'nodeid': 'ea83cff6-4cc6-4399-a096-5892476721f0', 'ts': 1586265690}
[2020-04-07 15:21:30,980] INFO [utils - 66:log_event] - {'event': 'CLIENT_DISCONNECT', 'message': {'client_identifier': '10.50.50.11:49135', 'server_identifier': '10.50.50.11:49153', 'client_uid': 'CTX_ID:815a9519-d88e-402d-9473-19015cca1db8-GRAPH_ID:0-PID:131714-HOST:ovirt.mydomain.local-PC_NAME:engine-client-0-RECON_NO:-0', 'brick_path': '/gluster_bricks/engine/engine'}, 'nodeid': 'ea83cff6-4cc6-4399-a096-5892476721f0', 'ts': 1586265690}
[2020-04-07 15:36:33,598] INFO [utils - 66:log_event] - {'event': 'CLIENT_CONNECT', 'message': {'client_identifier': '10.50.50.11:49144', 'server_identifier': '10.50.50.11:49155', 'client_uid': 'CTX_ID:e1d09577-50d6-4679-b0e8-0fe554f9fd49-GRAPH_ID:0-PID:139160-HOST:ovirt.mydomain.local-PC_NAME:big-client-0-RECON_NO:-0', 'brick_path': '/gluster_bricks/big/big', 'subdir_mount': '(null)'}, 'nodeid': 'ea83cff6-4cc6-4399-a096-5892476721f0', 'ts': 1586266593}
[2020-04-07 15:36:33,599] INFO [utils - 66:log_event] - {'event': 'CLIENT_DISCONNECT', 'message': {'client_identifier': '10.50.50.11:49144', 'server_identifier': '10.50.50.11:49155', 'client_uid': 'CTX_ID:e1d09577-50d6-4679-b0e8-0fe554f9fd49-GRAPH_ID:0-PID:139160-HOST:ovirt.mydomain.local-PC_NAME:big-client-0-RECON_NO:-0', 'brick_path': '/gluster_bricks/big/big'}, 'nodeid': 'ea83cff6-4cc6-4399-a096-5892476721f0', 'ts': 1586266593}
- rhev-data-center-mnt-glusterSD-ovirtst.mydomain.storage:_vmstore.log
[2020-04-07 13:24:16.070237] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4197133: WRITE => -1 gfid=aa68cd63-f3d3-47bd-a177-c7f48bef3c59 fd=0x7f3c5019f348 (Invalid argument)
[2020-04-07 13:24:16.074521] I [MSGID: 133022] [shard.c:3674:shard_delete_shards] 0-vmstore-shard: Deleted shards of gfid=aa68cd63-f3d3-47bd-a177-c7f48bef3c59 from backend
[2020-04-07 13:27:28.721262] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-vmstore-shard: Lookup on shard 523 failed. Base file gfid = d22530cf-2e50-4059-8924-0aafe38497b1 [No such file or directory]
[2020-04-07 13:27:28.721432] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4435189: WRITE => -1 gfid=d22530cf-2e50-4059-8924-0aafe38497b1 fd=0x7f3c4c07ab38 (No such file or directory)
[2020-04-07 13:29:16.232630] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-vmstore-client-0: remote operation failed [Invalid argument]
[2020-04-07 13:29:16.232675] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4496167: WRITE => -1 gfid=54bafbb5-edd6-42a0-aaba-a5295031be17 fd=0x7f3c48125838 (Invalid argument)
[2020-04-07 13:29:16.236352] I [MSGID: 133022] [shard.c:3674:shard_delete_shards] 0-vmstore-shard: Deleted shards of gfid=54bafbb5-edd6-42a0-aaba-a5295031be17 from backend
[2020-04-07 13:34:16.378326] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-vmstore-client-0: remote operation failed [Invalid argument]
[2020-04-07 13:34:16.378461] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4698415: WRITE => -1 gfid=6beaa642-e73e-40c7-8adc-64275528b44f fd=0x7f3c48138cc8 (Invalid argument)
[2020-04-07 13:34:16.382824] I [MSGID: 133022] [shard.c:3674:shard_delete_shards] 0-vmstore-shard: Deleted shards of gfid=6beaa642-e73e-40c7-8adc-64275528b44f from backend
[2020-04-07 13:39:16.546919] W [MSGID: 114031] [client-rpc-fops_v2.c:680:client4_0_writev_cbk] 0-vmstore-client-0: remote operation failed [Invalid argument]
[2020-04-07 13:39:16.547022] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4884227: WRITE => -1 gfid=f32843b7-4131-4536-abe2-f6969aa455ae fd=0x7f3c4813fb58 (Invalid argument)
[2020-04-07 13:39:16.550630] I [MSGID: 133022] [shard.c:3674:shard_delete_shards] 0-vmstore-shard: Deleted shards of gfid=f32843b7-4131-4536-abe2-f6969aa455ae from backend
- gluster_bricks-vmstore-vmstore.log
[2020-04-07 13:21:28.128381] I [MSGID: 115036] [server.c:499:server_rpc_notify] 0-vmstore-server: disconnecting connection from CTX_ID:12a5f37d-94d3-4b22-aaf6-4ac5bb7567d6-GRAPH_ID:0-PID:131656-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0
[2020-04-07 13:21:28.128565] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-vmstore-server: Shutting down connection CTX_ID:12a5f37d-94d3-4b22-aaf6-4ac5bb7567d6-GRAPH_ID:0-PID:131656-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0
[2020-04-07 13:24:16.069408] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]
[2020-04-07 13:24:16.070018] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-vmstore-server: 3528796: WRITEV 18 (aa68cd63-f3d3-47bd-a177-c7f48bef3c59), client: CTX_ID:20665547-59fd-4b4a-94fb-7fdfdca9f5a9-GRAPH_ID:0-PID:17102-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument]
[2020-04-07 13:27:28.719391] W [MSGID: 113020] [posix-helpers.c:1051:posix_gfid_set] 0-vmstore-posix: setting GFID on /gluster_bricks/vmstore/vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists]
[2020-04-07 13:27:28.719978] E [MSGID: 113020] [posix-entry-ops.c:517:posix_mknod] 0-vmstore-posix: setting gfid on /gluster_bricks/vmstore/vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists]
[2020-04-07 13:29:16.232263] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]
[2020-04-07 13:29:16.232578] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-vmstore-server: 3777485: WRITEV 12 (54bafbb5-edd6-42a0-aaba-a5295031be17), client: CTX_ID:20665547-59fd-4b4a-94fb-7fdfdca9f5a9-GRAPH_ID:0-PID:17102-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument]
[2020-04-07 13:34:16.377622] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]
- messages
Apr 7 15:25:26 ovirt gluster_bricks-big-big[39632]: [2020-04-07 13:25:26.789943] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-big-posix: write failed: offset 0, [Invalid argument]
Apr 7 15:25:26 ovirt gluster_bricks-big-big[39632]: [2020-04-07 13:25:26.790284] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-big-server: 20535: WRITEV 1 (e965b2e9-500b-4520-a17e-d00b93a2e1bf), client: CTX_ID:2530ad70-1cec-43a2-bdf8-8c5a085fecb4-GRAPH_ID:0-PID:41137-HOST:ovirt.mydomain.local-PC_NAME:big-client-0-RECON_NO:-0, error-xlator: big-posix [Invalid argument]
Apr 7 15:26:16 ovirt gluster_bricks-engine-engine[10652]: [2020-04-07 13:26:16.183229] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-engine-posix: write failed: offset 0, [Invalid argument]
Apr 7 15:26:16 ovirt gluster_bricks-engine-engine[10652]: [2020-04-07 13:26:16.183318] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-engine-server: 359871: WRITEV 5 (cb0ea391-ed2f-4318-91d3-b8e18f6bdfb7), client: CTX_ID:08f296e3-beeb-4658-8e79-d9add812c3d7-GRAPH_ID:0-PID:11400-HOST:ovirt.mydomain.local-PC_NAME:engine-client-0-RECON_NO:-0, error-xlator: engine-posix [Invalid argument]
Apr 7 15:27:28 ovirt gluster_bricks-vmstore-vmstore[10678]: [2020-04-07 13:27:28.719391] W [MSGID: 113020] [posix-helpers.c:1051:posix_gfid_set] 0-vmstore-posix: setting GFID on /gluster_bricks/vmstore/vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists]
Apr 7 15:27:28 ovirt gluster_bricks-vmstore-vmstore[10678]: [2020-04-07 13:27:28.719978] E [MSGID: 113020] [posix-entry-ops.c:517:posix_mknod] 0-vmstore-posix: setting gfid on /gluster_bricks/vmstore/vmstore/.shard/d22530cf-2e50-4059-8924-0aafe38497b1.523 failed [File exists]
Apr 7 15:29:16 ovirt gluster_bricks-data-data[10609]: [2020-04-07 13:29:16.232149] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-data-posix: write failed: offset 0, [Invalid argument]

On Thu, Apr 9, 2020 at 7:46 AM Krutika Dhananjay <kdhananj@redhat.com> wrote:
On Tue, Apr 7, 2020 at 7:36 PM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
OK. So I set the log level to at least INFO on all subsystems and tried a redeploy of OpenShift with 3 master nodes and 7 worker nodes. One worker VM got the error and went into paused mode:
Apr 7, 2020, 3:27:28 PM VM worker-6 has been paused due to unknown storage error.
The VM has only one 100 GB virtual disk, on the Gluster volume named vmstore.
Below are all the logs around that time, at the different layers. Let me know if you need any other log file not yet considered.
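For reference, a minimal sketch of how the per-volume log-level bump mentioned above can be done with the Gluster CLI (shown only for vmstore; using the diagnostics.* options at INFO level is an assumption about how it might be done, not a confirmation of what was used here):

# raise client-side (fuse mount) and brick-side log verbosity for one volume
gluster volume set vmstore diagnostics.client-log-level INFO
gluster volume set vmstore diagnostics.brick-log-level INFO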
From what I see, the matching error is found in
- rhev-data-center-mnt-glusterSD-ovirtst.mydomain.storage:_vmstore.log
[2020-04-07 13:27:28.721262] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-vmstore-shard: Lookup on shard 523 failed. Base file gfid = d22530cf-2e50-4059-8924-0aafe38497b1 [No such file or directory]
[2020-04-07 13:27:28.721432] W [fuse-bridge.c:2918:fuse_writev_cbk] 0-glusterfs-fuse: 4435189: WRITE => -1 gfid=d22530cf-2e50-4059-8924-0aafe38497b1 fd=0x7f3c4c07ab38 (No such file or directory)
This ^^, right here is the reason the VM paused. Are you using a plain distribute volume here? Can you share some of the log messages that occur right above these errors? Also, can you check if the file $VMSTORE_BRICKPATH/.glusterfs/d2/25/d22530cf-2e50-4059-8924-0aafe38497b1 exists on the brick?
-Krutika
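In case it is useful, a minimal sketch of the check Krutika asks for, runnable on the host: the brick path comes from the volume info earlier in the thread, the gfid and shard number come from the errors above, and the variable names are only for illustration.

BRICK=/gluster_bricks/vmstore/vmstore
GFID=d22530cf-2e50-4059-8924-0aafe38497b1
# gfid backlink of the base file lives at .glusterfs/<first two chars>/<next two chars>/<gfid>
ls -l "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"
# the shard the lookup failed on (shard 523), named <base-gfid>.<shard-number> under .shard
ls -l "$BRICK/.shard/$GFID.523"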
Thanks for answering, Krutika.
To verify that sharding was in some way "involved" in the problem, I ran a new re-deploy of the 9 OpenShift OCP servers and indeed did not receive any error, while with sharding enabled I got at least 3-4 errors on every deployment run. In particular, I deleted the previous VMs' disks and put them on a volume without sharding. Right now the directory is empty:

[root@ovirt ~]# ll -a /gluster_bricks/vmstore/vmstore/.glusterfs/d2/25/
total 8
drwx------.   2 root root    6 Apr  8 16:59 .
drwx------. 105 root root 8192 Apr  9 00:50 ..
[root@ovirt ~]#

Here you can find the entire log (in gzip format), from [2020-04-05 01:20:02.978429] to [2020-04-09 10:45:36.734079], of the vmstore volume:

https://drive.google.com/file/d/1Dqr7KJMqKdMFg-jvhsDAzvr1xgWtvtnQ/view?usp=s...

You will find the same error at least at the timestamps below, corresponding to the engine webadmin "unknown storage error" events; keep in mind that the time inside the log file is UTC, so you have to shift two hours back (03:27:28 PM in the engine webadmin event corresponds to 13:27:28 in the log file):

Apr 7, 2020, 3:27:28 PM
Apr 7, 2020, 4:38:55 PM
Apr 7, 2020, 5:31:02 PM
Apr 8, 2020, 8:52:49 AM
Apr 8, 2020, 12:05:17 PM
Apr 8, 2020, 3:11:10 PM
Apr 8, 2020, 3:20:30 PM
Apr 8, 2020, 3:26:54 PM

Thanks again, and I'm available to re-try on a sharding-enabled volume after any change, if needed.
Gianluca
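As a side note, a quick sketch of the local-to-UTC conversion described above with GNU date, assuming the host timezone is Europe/Rome (CEST, UTC+2 on those dates):

date -u -d 'TZ="Europe/Rome" 2020-04-07 15:27:28' '+%Y-%m-%d %H:%M:%S'
# prints 2020-04-07 13:27:28, the timestamp to look for in the gluster logs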

On Thu, Apr 9, 2020 at 1:11 PM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
This ^^, right here is the reason the VM paused. Are you using a plain distribute volume here? Can you share some of the log messages that occur right above these errors? Also, can you check if the file $VMSTORE_BRICKPATH/.glusterfs/d2/25/d22530cf-2e50-4059-8924-0aafe38497b1 exists on the brick?
-Krutika
Thanks for answering, Krutika.
To verify that sharding was in some way "involved" in the problem, I ran a new re-deploy of the 9 OpenShift OCP servers and indeed did not receive any error, while with sharding enabled I got at least 3-4 errors on every deployment run. In particular, I deleted the previous VMs' disks and put them on a volume without sharding. Right now the directory is empty:
[root@ovirt ~]# ll -a /gluster_bricks/vmstore/vmstore/.glusterfs/d2/25/
total 8
drwx------.   2 root root    6 Apr  8 16:59 .
drwx------. 105 root root 8192 Apr  9 00:50 ..
[root@ovirt ~]#
Here you can find the entire log (in gzip format) from [2020-04-05 01:20:02.978429] to [2020-04-09 10:45:36.734079] of the vmstore volume
https://drive.google.com/file/d/1Dqr7KJMqKdMFg-jvhsDAzvr1xgWtvtnQ/view?usp=s...
You will find the same error at least at the timestamps below, corresponding to the engine webadmin "unknown storage error" events; keep in mind that the time inside the log file is UTC, so you have to shift two hours back (03:27:28 PM in the engine webadmin event corresponds to 13:27:28 in the log file):
Apr 7, 2020, 3:27:28 PM
Apr 7, 2020, 4:38:55 PM
Apr 7, 2020, 5:31:02 PM
Apr 8, 2020, 8:52:49 AM
Apr 8, 2020, 12:05:17 PM
Apr 8, 2020, 3:11:10 PM
Apr 8, 2020, 3:20:30 PM
Apr 8, 2020, 3:26:54 PM
Thanks again, and I'm available to re-try on a sharding-enabled volume after any change, if needed.
Gianluca
Hi Krutika, did you have a chance to look at the log content I sent, to better understand the reason for the sharding errors and a possible solution?
Thanks,
Gianluca
participants (5)
- Darrell Budic
- Gianluca Cecchi
- Krutika Dhananjay
- Nir Soffer
- Strahil Nikolov