On Thu, Nov 21, 2019 at 6:03 AM Strahil Nikolov <hunter86_bg(a)yahoo.com>
wrote:
Hi all,
another clue in the logs:
[2019-11-21 00:29:50.536631] W [MSGID: 114031]
[client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-1:
remote operation failed. Path:
/.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79
(00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:29:50.536798] W [MSGID: 114031]
[client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-0:
remote operation failed. Path:
/.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79
(00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:29:50.536959] W [MSGID: 114031]
[client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-2:
remote operation failed. Path:
/.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79
(00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:29:50.537007] E [MSGID: 133010]
[shard.c:2327:shard_common_lookup_shards_cbk] 0-data_fast-shard: Lookup on
shard 79 failed. Base file gfid = b0af2b81-22cf-482e-9b2f-c431b6449dae
[Permission denied]
[2019-11-21 00:29:50.537066] W [fuse-bridge.c:2830:fuse_readv_cbk]
0-glusterfs-fuse: 12458: READ => -1
gfid=b0af2b81-22cf-482e-9b2f-c431b6449dae fd=0x7fc63c00fe18 (Permission
denied)
[2019-11-21 00:30:01.177665] I [MSGID: 133022]
[shard.c:3674:shard_delete_shards] 0-data_fast-shard: Deleted shards of
gfid=eb103fbf-80dc-425d-882f-1e4efe510db5 from backend
[2019-11-21 00:30:13.132756] W [MSGID: 114031]
[client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-0:
remote operation failed. Path:
/.shard/17c663c2-f582-455b-b806-3b9d01fb2c6c.79
(00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:30:13.132824] W [MSGID: 114031]
[client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-1:
remote operation failed. Path:
/.shard/17c663c2-f582-455b-b806-3b9d01fb2c6c.79
(00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:30:13.133217] W [MSGID: 114031]
[client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-2:
remote operation failed. Path:
/.shard/17c663c2-f582-455b-b806-3b9d01fb2c6c.79
(00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:30:13.133238] E [MSGID: 133010]
[shard.c:2327:shard_common_lookup_shards_cbk] 0-data_fast-shard: Lookup on
shard 79 failed. Base file gfid = 17c663c2-f582-455b-b806-3b9d01fb2c6c
[Permission denied]
[2019-11-21 00:30:13.133264] W [fuse-bridge.c:2830:fuse_readv_cbk]
0-glusterfs-fuse: 12660: READ => -1
gfid=17c663c2-f582-455b-b806-3b9d01fb2c6c fd=0x7fc63c007038 (Permission
denied)
[2019-11-21 00:30:38.489449] W [MSGID: 114031]
[client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-0:
remote operation failed. Path:
/.shard/a10a5ae8-108b-4d78-9e65-cca188c27fc4.6
(00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:30:38.489520] W [MSGID: 114031]
[client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-1:
remote operation failed. Path:
/.shard/a10a5ae8-108b-4d78-9e65-cca188c27fc4.6
(00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:30:38.489669] W [MSGID: 114031]
[client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-2:
remote operation failed. Path:
/.shard/a10a5ae8-108b-4d78-9e65-cca188c27fc4.6
(00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:30:38.489717] E [MSGID: 133010]
[shard.c:2327:shard_common_lookup_shards_cbk] 0-data_fast-shard: Lookup on
shard 6 failed. Base file gfid = a10a5ae8-108b-4d78-9e65-cca188c27fc4
[Permission denied]
[2019-11-21 00:30:38.489777] W [fuse-bridge.c:2830:fuse_readv_cbk]
0-glusterfs-fuse: 12928: READ => -1
gfid=a10a5ae8-108b-4d78-9e65-cca188c27fc4 fd=0x7fc63c01a058 (Permission
denied)
Does anyone have an idea why this is happening?
I checked the user/group and SELinux permissions, and all are OK.
I even replaced one of the disks and healed, but the result is the same
for all my VMs.
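To see at a glance which shards and which client subvolumes are affected, the mount-log excerpt above can be condensed with a short script. This is only a sketch: the function name `summarize_denied_lookups` is hypothetical, and the parsing assumes the log lines look like the ones quoted in this message (including the line wrapping added by the mail client).

```python
import re
from collections import defaultdict

# Timestamp that starts every entry, e.g. "[2019-11-21 00:29:50.536631]"
TS_RE = re.compile(r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\]")
# Shard path, e.g. "/.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79"
SHARD_RE = re.compile(r"/\.shard/([0-9a-f-]{36})\.(\d+)")
# Client translator name, e.g. "0-data_fast-client-1:"
CLIENT_RE = re.compile(r"0-([\w-]+-client-\d+):")


def summarize_denied_lookups(log_text):
    """Map (base gfid, shard index) -> sorted list of client subvolumes
    that logged 'Permission denied' for a lookup of that shard."""
    # Undo the mail client's line wrapping, then cut the text at each timestamp.
    flat = " ".join(log_text.split())
    starts = [m.start() for m in TS_RE.finditer(flat)]
    entries = [flat[a:b] for a, b in zip(starts, starts[1:] + [len(flat)])]

    result = defaultdict(set)
    for entry in entries:
        if "client4_0_lookup_cbk" not in entry or "Permission denied" not in entry:
            continue
        shard = SHARD_RE.search(entry)
        client = CLIENT_RE.search(entry)
        if shard and client:
            result[(shard.group(1), int(shard.group(2)))].add(client.group(1))
    return {key: sorted(clients) for key, clients in result.items()}
```

In the excerpt above, every failing shard is reported by all three clients (client-0, client-1 and client-2), which would suggest the denial is not limited to a single brick.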
Have you checked that the user/group permissions are set correctly across all
the bricks in the cluster?
What does ls -la on the images directory, from the mount of the volume, show you?
Adding Krutika and Rafi, as they ran into a similar issue in the past.
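The ownership check suggested above can also be scripted so that it is easy to repeat on the volume mount and on each brick. A minimal sketch, assuming the storage is expected to be owned by vdsm:kvm with the conventional uid/gid 36:36 (verify the ids on your hosts); `find_wrong_owner` is a hypothetical helper name.

```python
import os

# Assumption: vdsm:kvm is conventionally uid 36 / gid 36 on oVirt hosts.
EXPECTED_UID = 36
EXPECTED_GID = 36


def find_wrong_owner(root, uid=EXPECTED_UID, gid=EXPECTED_GID):
    """Return sorted (path, st_uid, st_gid) tuples for every directory and
    file under root that is not owned by uid:gid."""
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            st = os.lstat(path)  # lstat: report symlinks themselves
            if (st.st_uid, st.st_gid) != (uid, gid):
                mismatches.append((path, st.st_uid, st.st_gid))
    return sorted(mismatches)
```

Running it against the images directory on the mount, and against the same directory (and `.shard`) on every brick, should return an empty list if ownership is consistent.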
Best Regards,
Strahil Nikolov
On Wednesday, 20 November 2019, 18:17:18 GMT+2, Strahil Nikolov <
hunter86_bg(a)yahoo.com> wrote:
Hello all,
my engine is back online, but I'm still having difficulty getting vdsm to
power up the systems.
I think that the events generated today can point me in the right
direction (just one example; there are many more):
VDSM ovirt3.localdomain command SpmStatusVDS failed: Cannot inquire
Lease(name='SDM',
path=u'/rhev/data-center/mnt/glusterSD/gluster1:_data__fast3/ecc3bf0e-8214-45c1-98a6-0afa642e591f/dom_md/leases',
offset=1048576): (2, 'Sanlock get hosts failure', 'No such file or
directory')
I will try to collect a fresh log and see what it is complaining about
this time.
Best Regards,
Strahil Nikolov
>Hi Sahina,
>I have a strange situation:
>1. When I try to access the file via 'sudo -u vdsm dd if=disk of=test bs=4M', the command fails at approx. 60 MB.
>2. If I run the same command as root, remove the file, and then run it again as the vdsm user, no I/O error is reported this time.
>My guess is that I need to check what's going on on the bricks themselves ...
>Best Regards,
>Strahil Nikolov
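The dd test above can be reproduced in a way that reports the exact offset of the first failed read and the shard that offset falls into. This is a sketch under assumptions: the helper names are hypothetical, and 64 MiB is assumed as the shard block size (check the volume's `features.shard-block-size` option for the real value).

```python
# Assumption: 64 MiB shard block size; check features.shard-block-size.
SHARD_BLOCK_SIZE = 64 * 1024 * 1024


def first_read_error(path, chunk_size=4 * 1024 * 1024):
    """Read path sequentially in chunk_size blocks (like dd bs=4M).

    Returns (offset, error): offset is the number of bytes read successfully,
    error is the OSError that stopped the read, or None on a clean read."""
    offset = 0
    try:
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    return offset, None
                offset += len(chunk)
    except OSError as exc:
        return offset, exc


def shard_index(offset, block_size=SHARD_BLOCK_SIZE):
    """Index of the shard that holds byte `offset`; 0 is the base file itself."""
    return offset // block_size
```

Run it as the vdsm user (for example via `sudo -u vdsm`) and again as root. If the failing offset maps to a shard index of 1 or higher, the read is failing on a file under `/.shard` rather than on the base image, which would match the lookup errors in the mount log.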
On Tuesday, 19 November 2019, 0:02:16 GMT-5, Sahina Bose <
sabose(a)redhat.com> wrote:
On Tue, Nov 19, 2019 at 10:10 AM Strahil Nikolov <hunter86_bg(a)yahoo.com>
wrote:
Hi Sahina,
Sadly, the engine logs have no errors.
I only get an I/O error, but in the vdsm debug log I can clearly see
that "qemu-img" is returning "OK".
During the upgrade some metadata files were pending heal, but I
resolved the conflict manually and they should be OK.
Today I defined one of the VMs manually (virsh define) and then
started it, but the issue is the same.
It seems to be a storage-related issue, as VMs that are on a specific domain
can be started, but most of my VMs are on the fast storage domains and
none of them can be started.
After the gluster snapshot restore, the engine is having issues and I
have to investigate that separately (as I powered off my HostedEngine before
creating the snapshot).
The logs can be found at:
https://drive.google.com/open?id=1VAZFZWWrpimDeVuZT0sWFVXy76scr4NM
Any ideas where to look, as I can definitely read (using "dd if=disk"
or qemu-img info) the disks of the rhel7 VM?
The vdsm logs have this:
2019-11-17 10:21:23,892+0200 INFO (libvirt/events) [virt.vm]
(vmId='b3c4d84a-9784-470c-b70e-7ad7cc45e913') abnormal vm stop device
ua-94f763e9-fd96-4bee-a6b2-31af841a918b error eother (vm:5075)
2019-11-17 10:21:23,892+0200 INFO (libvirt/events) [virt.vm]
(vmId='b3c4d84a-9784-470c-b70e-7ad7cc45e913') CPU stopped: onIOError
(vm:6062)
2019-11-17 10:21:23,893+0200 DEBUG (libvirt/events) [jsonrpc.Notification]
Sending event {"params": {"notify_time": 4356025830,
"b3c4d84a-9784-470c-b70e-7ad7cc45e913": {"status":
"WaitForLaunch",
"ioerror": {"alias":
"ua-94f763e9-fd96-4bee-a6b2-31af841a918b", "name":
"sda", "path":
"/rhev/data-center/mnt/glusterSD/gluster1:_data__fast/396604d9-2a9e-49cd-9563-fdc79981f67b/images/94f763e9-fd96-4bee-a6b2-31af841a918b/5b1d3113-5cca-4582-9029-634b16338a2f"},
"pauseCode": "EOTHER"}}, "jsonrpc": "2.0",
"method":
"|virt|VM_status|b3c4d84a-9784-470c-b70e-7ad7cc45e913"} (__init__:181)
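The image path that triggered the pause can be pulled out of such a VM_status event programmatically. A minimal sketch, assuming the JSON payload has the shape shown in the log line above; `io_errors` is a hypothetical helper name.

```python
import json


def io_errors(event_json):
    """Yield (vm_id, device_name, image_path) for each VM entry in a
    VM_status event payload that carries an "ioerror" object."""
    event = json.loads(event_json)
    for key, value in event.get("params", {}).items():
        # "params" also holds scalars such as notify_time; skip those.
        if isinstance(value, dict) and "ioerror" in value:
            err = value["ioerror"]
            yield key, err.get("name"), err.get("path")
```

Feeding it the payload between "Sending event" and "(__init__:181)" gives the VM id, the device name (here "sda") and the path of the image to inspect.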
Can you check the permissions of the file
/rhev/data-center/mnt/glusterSD/gluster1:_data__fast/396604d9-2a9e-49cd-9563-fdc79981f67b/images/94f763e9-fd96-4bee-a6b2-31af841a918b/5b1d3113-5cca-4582-9029-634b16338a2f?
Were they reset after the upgrade?
Are you able to copy this file to a different location and try running a
VM with this image?
Are there any errors in the mount log of the gluster1:_data__fast volume?
Best Regards,
Strahil Nikolov
On Monday, 18 November 2019, 11:38:13 GMT+2, Sahina Bose <
sabose(a)redhat.com> wrote:
On Mon, Nov 18, 2019 at 2:58 PM Sandro Bonazzola <sbonazzo(a)redhat.com>
wrote:
+Sahina Bose <sabose(a)redhat.com> +Gobinda Das <godas(a)redhat.com> +Nir
Soffer <nsoffer(a)redhat.com> +Tal Nisan <tnisan(a)redhat.com> can you please
help here?
On Sun, 17 Nov 2019 at 16:00, Strahil Nikolov <
hunter86_bg(a)yahoo.com> wrote:
So far,
I have rolled back the engine and the 3 hosts - still cannot manipulate
the storage.
It seems that gluster itself is working, but vdsm and the oVirt stack
cannot access the storage: I cannot create new VM disks or start a VM,
and I'm on the verge of redeploying.
Any errors in vdsm logs? engine logs?
Best Regards,
Strahil Nikolov
On Saturday, 16 November 2019, 15:40:25 GMT+2, Strahil <
hunter86_bg(a)yahoo.com> wrote:
I upgraded to RC3 and now cannot power on any VM.
I constantly get an I/O error, but checking at the gluster level, I can dd
from each disk or even create a new one.
Removing HighAvailability doesn't help.
I guess I should restore the engine from the gluster snapshot and
roll back via 'yum history undo last'.
Is anyone else seeing this issue?
Best Regards,
Strahil Nikolov
On Nov 13, 2019 15:31, Sandro Bonazzola <sbonazzo(a)redhat.com> wrote:
On Wed, 13 Nov 2019 at 14:25, Sandro Bonazzola <
sbonazzo(a)redhat.com> wrote:
On Wed, 13 Nov 2019 at 13:56, Florian Schmid <
fschmid(a)ubimet.com> wrote:
Hello,
I have a question about bugs that are flagged as [downstream clone -
4.3.7] but are not yet released.
I'm talking about this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1749202
I can't see it in 4.3.7 release notes. Will it be included in a further
release candidate? This fix is very important I think and I can't upgrade
yet because of this bug.
Looking at the bug, the fix is contained in the following tags:
$ git tag --contains 12bd5cb1fe7c95e29b4065fca968913722fe9eaa
ovirt-engine-4.3.6.6
ovirt-engine-4.3.6.7
ovirt-engine-4.3.7.0
ovirt-engine-4.3.7.1
So the fix is already included in release oVirt 4.3.6.
Sent a fix to 4.3.6 release notes:
https://github.com/oVirt/ovirt-site/pull/2143. @Ryan Barry
<rbarry(a)redhat.com> can you please review?
BR Florian Schmid
------------------------------
*From: *"Sandro Bonazzola" <sbonazzo(a)redhat.com>
*To: *"users" <users(a)ovirt.org>
*Sent: *Wednesday, 13 November 2019 13:34:59
*Subject: *[ovirt-users] [ANN] oVirt 4.3.7 Third Release Candidate is now
available for testing
The oVirt Project is pleased to announce the availability of the oVirt
4.3.7 Third Release Candidate for testing, as of November 13th, 2019.
This update is a release candidate of the seventh in a series of
stabilization updates to the 4.3 series.
This is pre-release software. This pre-release should not be used in
production.
This release is available now on x86_64 architecture for:
* Red Hat Enterprise Linux 7.7 or later (but <8)
* CentOS Linux (or similar) 7.7 or later (but <8)
This release supports Hypervisor Hosts on x86_64 and ppc64le architectures
for:
* Red Hat Enterprise Linux 7.7 or later (but <8)
* CentOS Linux (or similar) 7.7 or later (but <8)
* oVirt Node 4.3 (available for x86_64 only) has been built consuming
CentOS 7.7 Release
See the release notes [1] for known issues, new features and bugs fixed.
While testing this release candidate please note that oVirt node now
includes:
- ansible 2.9.0
- GlusterFS 6.6
Notes:
- oVirt Appliance is already available
- oVirt Node is already available
Additional Resources:
* Read more about the oVirt 4.3.7 release highlights:
http://www.ovirt.org/release/4.3.7/
* Get more oVirt Project updates on Twitter:
https://twitter.com/ovirt
* Check out the latest project news on the oVirt blog:
http://www.ovirt.org/blog/
[1]
http://www.ovirt.org/release/4.3.7/
[2]
http://resources.ovirt.org/pub/ovirt-4.3-pre/iso/
--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA <
https://www.redhat.com/>
sbonazzo(a)redhat.com
<
https://www.redhat.com/>*Red Hat respects your work life balance.
Therefore there is no need to answer this email out of your office hours.*
_______________________________________________
Users mailing list -- users(a)ovirt.org
To unsubscribe send an email to users-leave(a)ovirt.org
Privacy Statement:
https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/24QUREJPZHT...