On Thu, Nov 21, 2019 at 6:03 AM Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
Hi All,
another clue in the logs:

[2019-11-21 00:29:50.536631] W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-1: remote operation failed. Path: /.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79 (00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:29:50.536798] W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-0: remote operation failed. Path: /.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79 (00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:29:50.536959] W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-2: remote operation failed. Path: /.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79 (00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:29:50.537007] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-data_fast-shard: Lookup on shard 79 failed. Base file gfid = b0af2b81-22cf-482e-9b2f-c431b6449dae [Permission denied]
[2019-11-21 00:29:50.537066] W [fuse-bridge.c:2830:fuse_readv_cbk] 0-glusterfs-fuse: 12458: READ => -1 gfid=b0af2b81-22cf-482e-9b2f-c431b6449dae fd=0x7fc63c00fe18 (Permission denied)
[2019-11-21 00:30:01.177665] I [MSGID: 133022] [shard.c:3674:shard_delete_shards] 0-data_fast-shard: Deleted shards of gfid=eb103fbf-80dc-425d-882f-1e4efe510db5 from backend
[2019-11-21 00:30:13.132756] W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-0: remote operation failed. Path: /.shard/17c663c2-f582-455b-b806-3b9d01fb2c6c.79 (00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:30:13.132824] W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-1: remote operation failed. Path: /.shard/17c663c2-f582-455b-b806-3b9d01fb2c6c.79 (00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:30:13.133217] W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-2: remote operation failed. Path: /.shard/17c663c2-f582-455b-b806-3b9d01fb2c6c.79 (00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:30:13.133238] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-data_fast-shard: Lookup on shard 79 failed. Base file gfid = 17c663c2-f582-455b-b806-3b9d01fb2c6c [Permission denied]
[2019-11-21 00:30:13.133264] W [fuse-bridge.c:2830:fuse_readv_cbk] 0-glusterfs-fuse: 12660: READ => -1 gfid=17c663c2-f582-455b-b806-3b9d01fb2c6c fd=0x7fc63c007038 (Permission denied)
[2019-11-21 00:30:38.489449] W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-0: remote operation failed. Path: /.shard/a10a5ae8-108b-4d78-9e65-cca188c27fc4.6 (00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:30:38.489520] W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-1: remote operation failed. Path: /.shard/a10a5ae8-108b-4d78-9e65-cca188c27fc4.6 (00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:30:38.489669] W [MSGID: 114031] [client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-data_fast-client-2: remote operation failed. Path: /.shard/a10a5ae8-108b-4d78-9e65-cca188c27fc4.6 (00000000-0000-0000-0000-000000000000) [Permission denied]
[2019-11-21 00:30:38.489717] E [MSGID: 133010] [shard.c:2327:shard_common_lookup_shards_cbk] 0-data_fast-shard: Lookup on shard 6 failed. Base file gfid = a10a5ae8-108b-4d78-9e65-cca188c27fc4 [Permission denied]
[2019-11-21 00:30:38.489777] W [fuse-bridge.c:2830:fuse_readv_cbk] 0-glusterfs-fuse: 12928: READ => -1 gfid=a10a5ae8-108b-4d78-9e65-cca188c27fc4 fd=0x7fc63c01a058 (Permission denied)
Anyone got an idea why this is happening? I checked user/group and SELinux permissions - all OK.
I even replaced one of the disks and healed, but the result is the same for all my VMs.
Have you checked that the user/group permissions are set correctly
across all the bricks in the cluster?

Seems OK:
[root@ovirt1 ~]# for i in ovirt{1..3}; do echo $i; ssh $i "find /gluster_bricks/*/*/[0-9]* -not -user vdsm -not -type l -print"; echo; echo; done
ovirt1
ovirt2
ovirt3
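Since the lookups in the log fail under /.shard, a similar check of the hidden .shard directory on each brick might be worth adding - just a sketch, assuming the same /gluster_bricks layout as the find above (adjust the paths to your bricks):

for i in ovirt{1..3}; do echo $i; ssh $i "stat -c '%U:%G %a %n' /gluster_bricks/*/*/.shard"; done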
What does ls -la on the images directory from the mount of the volume show
you?

Attached the output.
Adding Krutika and Rafi as they ran into a similar issue in the past.

Most probably root's dd is pulling the image into RAM, and then the second dd (via sudo -u vdsm)
gets it from the Linux page cache...
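One way to rule that caching theory out (just a sketch - the disk path below is a placeholder for one of the actual images) is to bypass or drop the page cache between the two runs:

# bypass the page cache entirely
sudo -u vdsm dd if=/path/to/disk of=/dev/null bs=4M iflag=direct
# or drop cached pages after the root read, then repeat the vdsm-user read
echo 3 > /proc/sys/vm/drop_caches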
Best Regards,
Strahil Nikolov
On Wednesday, 20 November 2019, 18:17:18 GMT+2, Strahil Nikolov
<hunter86_bg(a)yahoo.com> wrote:
Hello All,
my engine is back online, but I'm still having difficulties getting vdsm to power up the
systems. I think the events generated today can lead me in the right direction (just one
example, many more are there):
VDSM ovirt3.localdomain command SpmStatusVDS failed: Cannot inquire
Lease(name='SDM',
path=u'/rhev/data-center/mnt/glusterSD/gluster1:_data__fast3/ecc3bf0e-8214-45c1-98a6-0afa642e591f/dom_md/leases',
offset=1048576): (2, 'Sanlock get hosts failure', 'No such file or
directory')
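Since sanlock complains about 'No such file or directory', a quick sanity check (sketch only; the path is taken straight from the error above) would be to confirm the leases file is actually there and readable through the mount:

ls -l /rhev/data-center/mnt/glusterSD/gluster1:_data__fast3/ecc3bf0e-8214-45c1-98a6-0afa642e591f/dom_md/
sudo -u vdsm dd if=/rhev/data-center/mnt/glusterSD/gluster1:_data__fast3/ecc3bf0e-8214-45c1-98a6-0afa642e591f/dom_md/leases of=/dev/null bs=1M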
I will try to collect a fresh log and see what it is complaining about this time.
Best Regards,
Strahil Nikolov
Hi Sahina,
I have a strange situation:
1. When I try to access the file via 'sudo -u vdsm dd if=disk of=test bs=4M', the
command fails at approx. 60MB.
2. If I run the same command as root, remove the file and then run it again as the vdsm user ->
this time no I/O error is reported.
My guess is that I need to check what's going on on the bricks
themselves ...
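Something along these lines might help there (a sketch, using the data_fast volume name from the client log; the brick path is an assumption based on the layout shown earlier):

gluster volume status data_fast
gluster volume heal data_fast info
# compare ownership/permissions of the same shard on every brick, e.g. one reported in the client log
ls -ln /gluster_bricks/*/*/.shard/b0af2b81-22cf-482e-9b2f-c431b6449dae.79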
Best Regards,
Strahil Nikolov
On Tuesday, 19 November 2019, 00:02:16 GMT-5, Sahina Bose
<sabose(a)redhat.com> wrote:
On Tue, Nov 19, 2019 at 10:10 AM Strahil Nikolov <hunter86_bg(a)yahoo.com> wrote:
Hi Sahina,
Sadly the engine logs have no errors. I've got only an I/O error, but in the vdsm debug
output I can clearly see that "qemu-img" is returning "OK". During the
upgrade I got some metadata files pending heal, but I have resolved the conflict manually
and it should be OK. Today I defined one of the VMs manually (virsh define) and then
started it, but the issue is the same. It seems to be a storage-related issue, as VMs that
are on a specific domain can be started, but most of my VMs are on the fast storage domains
and none of them can be started.
After the gluster snapshot restore, the engine is having issues and I have to investigate
that separately (as I powered off my HostedEngine before creating the snapshot).
The logs can be found
at: https://drive.google.com/open?id=1VAZFZWWrpimDeVuZT0sWFVXy76scr4NM
Any ideas where to look, as I can definitely read (using "dd if=disk" or
qemu-img info) the disks of the rhel7 VM?
The vdsm logs have this:

2019-11-17 10:21:23,892+0200 INFO (libvirt/events) [virt.vm] (vmId='b3c4d84a-9784-470c-b70e-7ad7cc45e913') abnormal vm stop device ua-94f763e9-fd96-4bee-a6b2-31af841a918b error eother (vm:5075)
2019-11-17 10:21:23,892+0200 INFO (libvirt/events) [virt.vm] (vmId='b3c4d84a-9784-470c-b70e-7ad7cc45e913') CPU stopped: onIOError (vm:6062)
2019-11-17 10:21:23,893+0200 DEBUG (libvirt/events) [jsonrpc.Notification] Sending event {"params": {"notify_time": 4356025830, "b3c4d84a-9784-470c-b70e-7ad7cc45e913": {"status": "WaitForLaunch", "ioerror": {"alias": "ua-94f763e9-fd96-4bee-a6b2-31af841a918b", "name": "sda", "path": "/rhev/data-center/mnt/glusterSD/gluster1:_data__fast/396604d9-2a9e-49cd-9563-fdc79981f67b/images/94f763e9-fd96-4bee-a6b2-31af841a918b/5b1d3113-5cca-4582-9029-634b16338a2f"}, "pauseCode": "EOTHER"}}, "jsonrpc": "2.0", "method": "|virt|VM_status|b3c4d84a-9784-470c-b70e-7ad7cc45e913"} (__init__:181)
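Given the EOTHER pause code on sda above, re-running the read against that exact path as the vdsm user (uid 36 on a stock oVirt install) with direct I/O might narrow it down - a sketch only:

sudo -u vdsm qemu-img info /rhev/data-center/mnt/glusterSD/gluster1:_data__fast/396604d9-2a9e-49cd-9563-fdc79981f67b/images/94f763e9-fd96-4bee-a6b2-31af841a918b/5b1d3113-5cca-4582-9029-634b16338a2f
sudo -u vdsm dd if=/rhev/data-center/mnt/glusterSD/gluster1:_data__fast/396604d9-2a9e-49cd-9563-fdc79981f67b/images/94f763e9-fd96-4bee-a6b2-31af841a918b/5b1d3113-5cca-4582-9029-634b16338a2f of=/dev/null bs=4M iflag=direct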
Can you check the permissions of the file
/rhev/data-center/mnt/glusterSD/gluster1:_data__fast/396604d9-2a9e-49cd-9563-fdc79981f67b/images/94f763e9-fd96-4bee-a6b2-31af841a918b/5b1d3113-5cca-4582-9029-634b16338a2f?
Were they reset after the upgrade?
Are you able to copy this file to a different location and try running a VM with this
image?
Any errors in the mount log of gluster1:_data__fast volume?
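For the permission check, something like the following should be enough (a sketch; on a default oVirt setup the images are owned by vdsm:kvm, i.e. 36:36; the gluster client log name is an assumption based on the usual mount-path-derived naming under /var/log/glusterfs/):

ls -ln /rhev/data-center/mnt/glusterSD/gluster1:_data__fast/396604d9-2a9e-49cd-9563-fdc79981f67b/images/94f763e9-fd96-4bee-a6b2-31af841a918b/
grep -iE "denied|error" /var/log/glusterfs/rhev-data-center-mnt-glusterSD-gluster1:_data__fast.log | tail -n 50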
Best Regards,
Strahil Nikolov
On Monday, 18 November 2019, 11:38:13 GMT+2, Sahina Bose
<sabose(a)redhat.com> wrote:
On Mon, Nov 18, 2019 at 2:58 PM Sandro Bonazzola <sbonazzo(a)redhat.com> wrote:
+Sahina Bose +Gobinda Das +Nir Soffer +Tal Nisan can you please help here?
On Sun, 17 Nov 2019 at 16:00, Strahil Nikolov <hunter86_bg(a)yahoo.com>
wrote:
So far,
I have rolled back the engine and the 3 hosts - I still cannot manipulate the storage. It
seems that gluster itself is working, but vdsm and the oVirt stack cannot access the
storage - I cannot create new VM disks, cannot start a VM, and I'm on the verge of
redeploying.
Any errors in vdsm logs? engine logs?
Best Regards,
Strahil Nikolov
On Saturday, 16 November 2019, 15:40:25 GMT+2, Strahil
<hunter86_bg(a)yahoo.com> wrote:
I upgraded to RC3 and now cannot power on any VM.
I'm constantly getting an I/O error, but checking at the gluster level - I can dd from each disk or
even create a new one.
Removing the HighAvailability doesn't help.
I guess I should restore the engine from the gluster snapshot and roll back via 'yum
history undo last'.
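For the package rollback itself, just a sketch (check the transaction list first; the ID of the upgrade transaction will differ per host):

yum history list
yum history undo <transaction-id>   # or: yum history undo last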
Does anyone else have these issues?
Best Regards,
Strahil Nikolov
On Nov 13, 2019 15:31, Sandro Bonazzola <sbonazzo(a)redhat.com> wrote:
On Wed, 13 Nov 2019 at 14:25, Sandro Bonazzola <sbonazzo(a)redhat.com>
wrote:
On Wed, 13 Nov 2019 at 13:56, Florian Schmid <fschmid(a)ubimet.com>
wrote:
Hello,
I have a question about bugs which are flagged as [downstream clone - 4.3.7] but are not
yet released.
I'm talking about this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1749202
I can't see it in the 4.3.7 release notes. Will it be included in a further release
candidate? I think this fix is very important and I can't upgrade yet because of this
bug.
Looking at the bug, the fix was done with:

$ git tag --contains 12bd5cb1fe7c95e29b4065fca968913722fe9eaa
ovirt-engine-4.3.6.6
ovirt-engine-4.3.6.7
ovirt-engine-4.3.7.0
ovirt-engine-4.3.7.1

So the fix is already included in the oVirt 4.3.6 release.
Sent a fix to the 4.3.6 release
notes: https://github.com/oVirt/ovirt-site/pull/2143. @Ryan
Barry can you please review?
BR Florian Schmid
Von: "Sandro Bonazzola" <sbonazzo(a)redhat.com>
An: "users" <users(a)ovirt.org>
Gesendet: Mittwoch, 13. November 2019 13:34:59
Betreff: [ovirt-users] [ANN] oVirt 4.3.7 Third Release Candidate is now available for
testing
The oVirt Project is pleased to announce the availability of the oVirt 4.3.7 Third Release
Candidate for testing, as of November 13th, 2019.
This update is a release candidate of the seventh in a series of stabilization updates to
the 4.3 series.
This is pre-release software. This pre-release should not be used in production.
This release is available now on x86_64 architecture for:
* Red Hat Enterprise Linux 7.7 or later (but <8)
* CentOS Linux (or similar) 7.7 or later (but <8)
This release supports Hypervisor Hosts on x86_64 and ppc64le architectures for:
* Red Hat Enterprise Linux 7.7 or later (but <8)
* CentOS Linux (or similar) 7.7 or later (but <8)
* oVirt Node 4.3 (available for x86_64 only) has been built consuming CentOS 7.7 Release
See the release notes [1] for known issues, new features and bugs fixed.
While testing this release candidate please note that oVirt node now includes:
- ansible 2.9.0
- GlusterFS 6.6
Notes:
- oVirt Appliance is already available
- oVirt Node is already available
Additional Resources:
* Read more about the oVirt 4.3.7 release highlights:
http://www.ovirt.org/release/4.3.7/
* Get more oVirt Project updates on Twitter:
https://twitter.com/ovirt
* Check out the latest project news on the oVirt blog:
http://www.ovirt.org/blog/
[1]
http://www.ovirt.org/release/4.3.7/
[2]
http://resources.ovirt.org/pub/ovirt-4.3-pre/iso/
--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA
sbonazzo(a)redhat.com
Red Hat respects your work life balance. Therefore there is no need to answer this email
out of your office hours.
_______________________________________________
Users mailing list -- users(a)ovirt.org
To unsubscribe send an email to users-leave(a)ovirt.org
Privacy Statement:
https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3ZPGKJ4JJWV...