Hello,

I recently added 6 hosts to an existing oVirt compute/gluster cluster.

Prior to this attempted addition, my cluster had 3 hypervisor hosts and 3
gluster bricks, which made up a single replica 3 gluster volume. I added
the additional hosts, made a brick on 3 of the new hosts, and attempted to
make a new replica 3 volume. I had difficulty creating the new volume, so
I decided that I would make a new compute/gluster cluster for each set of
3 new hosts.

I removed the 6 new hosts from the existing oVirt compute/gluster cluster,
leaving the 3 original hosts in place with their bricks. At that point my
original bricks went down and came back up. The volume showed entries that
needed healing, so I ran gluster volume heal images3 full, etc. The volume
now shows no unhealed entries. I also corrected some peer errors.

However, I am unable to copy disks, move disks to another domain, export
disks, etc. It appears that the engine cannot locate disks properly, and I
get storage I/O errors.

I have detached and removed the oVirt storage domain. I reimported the
domain and imported 2 VMs, but the VM disks exhibit the same behaviour and
the VMs won't run from the hard disk.

I get errors such as this:

VDSM ov05 command HSMGetAllTasksStatusesVDS failed: low level Image copy
failed: ("Command ['/usr/bin/qemu-img', 'convert', '-p',
'-t', 'none',
'-T', 'none', '-f', 'raw',
u'/rhev/data-center/mnt/glusterSD/192.168.24.18:_images3/5fe3ad3f-2d21-404c-832e-4dc7318ca10d/images/3ea5afbd-0fe0-4c09-8d39-e556c66a8b3d/fe6eab63-3b22-4815-bfe6-4a0ade292510',
'-O', 'raw',
u'/rhev/data-center/mnt/192.168.24.13:_stor_import1/1ab89386-a2ba-448b-90ab-bc816f55a328/images/f707a218-9db7-4e23-8bbd-9b12972012b6/d6591ec5-3ede-443d-bd40-93119ca7c7d5']
failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading
sector 135168: Transport endpoint is not connected\\nqemu-img: error while
reading sector 131072: Transport endpoint is not connected\\nqemu-img:
error while reading sector 139264: Transport endpoint is not
connected\\nqemu-img: error while reading sector 143360: Transport endpoint
is not connected\\nqemu-img: error while reading sector 147456: Transport
endpoint is not connected\\nqemu-img: error while reading sector 155648:
Transport endpoint is not connected\\nqemu-img: error while reading sector
151552: Transport endpoint is not connected\\nqemu-img: error while reading
sector 159744: Transport endpoint is not connected\\n')",)

oVirt version: 4.3.82-1.el7
OS: CentOS Linux release 7.7.1908 (Core)

The Gluster cluster had been working very well until this incident.

Please help.

Thank you,
Charles Williams

Log in to the oVirt cluster and provide the output of:
gluster pool list
gluster volume list
for i in $(gluster volume list); do echo "$i"; echo; gluster volume info "$i"; echo; echo; gluster volume status "$i"; echo; echo; echo; done
ls -l /rhev/data-center/mnt/glusterSD/
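
If it is easier to send everything as one file, the commands above can be
wrapped in a small script. This is only a sketch: the function name
collect_gluster_info, the REPORT and MOUNT_DIR variables, and the default
report path /tmp/gluster-report.txt are arbitrary choices for illustration,
not anything oVirt or Gluster requires.

```shell
#!/bin/bash
# Sketch: collect the requested Gluster diagnostics into one report file.
# REPORT and MOUNT_DIR are hypothetical overrides added for convenience.
report="${REPORT:-/tmp/gluster-report.txt}"
mount_dir="${MOUNT_DIR:-/rhev/data-center/mnt/glusterSD/}"

collect_gluster_info() {
    echo "== gluster pool list =="
    gluster pool list
    echo
    echo "== gluster volume list =="
    gluster volume list
    echo
    # Per-volume details, same commands as the loop above.
    for vol in $(gluster volume list); do
        echo "== volume: $vol =="
        gluster volume info "$vol"
        echo
        gluster volume status "$vol"
        echo
    done
    echo "== ls -l $mount_dir =="
    ls -l "$mount_dir"
}

# Only run the collection where the gluster CLI is actually installed.
if command -v gluster >/dev/null 2>&1; then
    collect_gluster_info > "$report" 2>&1
    echo "Report written to $report"
else
    echo "gluster CLI not found; run this on one of the gluster hosts" >&2
fi
```

Run it as root on one of the gluster hosts and attach the resulting file.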
Best Regards,
Strahil Nikolov
On 18 June 2020 at 19:17:46 GMT+03:00, C Williams <cwilliams3320(a)gmail.com> wrote: