HostedEngine cleaned up

Hi,

We have a situation where the HostedEngine was cleaned up and the VMs are no longer running. Looking at the logs we can see the drive files as:

2019-03-26T07:42:46.915838Z qemu-kvm: -drive file=/rhev/data-center/mnt/glusterSD/glustermount.goku:_vmstore/9f8ef3f6-53f2-4b02-8a6b-e171b000b420/images/b2b872cd-b468-4f14-ae20-555ed823e84b/76ed4113-51b6-44fd-a3cd-3bd64bf93685,format=qcow2,if=none,id=drive-ua-b2b872cd-b468-4f14-ae20-555ed823e84b,serial=b2b872cd-b468-4f14-ae20-555ed823e84b,werror=stop,rerror=stop,cache=none,aio=native: 'serial' is deprecated, please use the corresponding option of '-device' instead

I assume this is the disk the VM was writing to before it went down. Trying to list the file gives an error and the file is not there:

ls -l /rhev/data-center/mnt/glusterSD/glustermount.goku:_vmstore/9f8ef3f6-53f2-4b02-8a6b-e171b000b420/images/b2b872cd-b468-4f14-ae20-555ed823e84b/76ed4113-51b6-44fd-a3cd-3bd64bf93685

Is there a way we can recover the VMs' disk images?

NOTE: We have no HostedEngine backups.

-- Regards, Sakhi Hadebe

On Thu, Apr 11, 2019 at 9:46 AM Sakhi Hadebe <sakhi@sanren.ac.za> wrote:
Hi,
We have a situation where the HostedEngine was cleaned up and the VMs are no longer running. Looking at the logs we can see the drive files as:
Do you have any guess on what really happened? Are you sure that the disks really disappeared? Please notice that the symlinks under /rhev/data-center/mnt/glusterSD/glustermount... are created on the fly only when needed. Are you sure that your host is correctly connecting the gluster storage domain? (A couple of quick checks are sketched at the end of this message.)
2019-03-26T07:42:46.915838Z qemu-kvm: -drive file=/rhev/data-center/mnt/glusterSD/glustermount.goku:_vmstore/9f8ef3f6-53f2-4b02-8a6b-e171b000b420/images/b2b872cd-b468-4f14-ae20-555ed823e84b/76ed4113-51b6-44fd-a3cd-3bd64bf93685,format=qcow2,if=none,id=drive-ua-b2b872cd-b468-4f14-ae20-555ed823e84b,serial=b2b872cd-b468-4f14-ae20-555ed823e84b,werror=stop,rerror=stop,cache=none,aio=native: 'serial' is deprecated, please use the corresponding option of '-device' instead
I assume this is the disk the VM was writing to before it went down. Trying to list the file gives an error and the file is not there:
ls -l /rhev/data-center/mnt/glusterSD/glustermount.goku:_vmstore/9f8ef3f6-53f2-4b02-8a6b-e171b000b420/images/b2b872cd-b468-4f14-ae20-555ed823e84b/76ed4113-51b6-44fd-a3cd-3bd64bf93685
Is there a way we can recover the VMs' disk images?
NOTE: No HostedEngine backups
-- Regards, Sakhi Hadebe
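For what it's worth, a couple of quick checks along these lines should show whether the storage domain is actually mounted on the host. The mount point is taken from the qemu log above; the volume name "vmstore" is only inferred from the path and may differ in your setup:

  # is the gluster storage domain mounted on this host?
  mount | grep glusterSD
  df -h /rhev/data-center/mnt/glusterSD/glustermount.goku:_vmstore

  # state of the gluster volume backing it (volume name inferred from the mount path)
  gluster volume info vmstore
  gluster volume status vmstore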

What happened is that the engine's root filesystem filled up. My colleague tried to resize the root LVM volume, and the engine did not come back afterwards. In trying to resolve that he cleaned up the engine and tried to re-install it, with no luck. That brought down all the VMs.

All VMs are down, and we are trying to move them onto one of our standalone KVM hosts. We have been trying to locate the VM disk images, with no luck. According to the VM's XML configuration file, the disk file is /rhev/data-center/mnt/glusterSD/glustermount.goku:_vmstore/9f8ef3f6-53f2-4b02-8a6b-e171b000b420/images/b2b872cd-b468-4f14-ae20-555ed823e84b/76ed4113-51b6-44fd-a3cd-3bd64bf93685

Unfortunately we can't find it, and the solution on the forum states that it can only be found in the associated logical volume, but I think that applies only when the VM is running. The disk images we have been trying to boot from are the ones we copied from the gluster bricks, but they are far smaller than the real images and don't boot.

On Thu, Apr 11, 2019 at 6:13 PM Simone Tiraboschi <stirabos@redhat.com> wrote:
Do you have any guess on what really happened? Are you sure that the disks really disappeared?
Please notice that the symlinks under /rhev/data-center/mnt/glusterSD/glustermount... are created on the fly only when needed.
Are you sure that your host is correctly connecting the gluster storage domain?
-- Regards, Sakhi Hadebe
Engineer: South African National Research Network (SANReN) Competency Area, Meraka, CSIR
Tel: +27 12 841 2308  Fax: +27 12 841 4223  Cell: +27 71 331 9622  Email: sakhi@sanren.ac.za
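One thing worth noting here: on a GlusterFS (file-based) storage domain the disks are plain files on the mounted volume; logical volumes only come into play with iSCSI/FC block domains. So if the image still exists, a search by its volume UUID across the active gluster mounts should turn it up, roughly like this (UUID taken from the XML path quoted above):

  # search the mounted gluster storage domains for the volume by its UUID
  find /rhev/data-center/mnt/glusterSD -name '76ed4113-51b6-44fd-a3cd-3bd64bf93685*' -ls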

On Thu, 11 Apr 2019 at 19:12, Sakhi Hadebe <sakhi@sanren.ac.za> wrote:
What happened is that the engine's root filesystem filled up. My colleague tried to resize the root LVM volume, and the engine did not come back afterwards. In trying to resolve that he cleaned up the engine and tried to re-install it, with no luck.
That brought down all the VMs. All VMs are down, and we are trying to move them onto one of our standalone KVM hosts. We have been trying to locate the VM disk images, with no luck.
According to the VM's XML configuration file, the disk file is /rhev/data-center/mnt/glusterSD/glustermount.goku:_vmstore/9f8ef3f6-53f2-4b02-8a6b-e171b000b420/images/b2b872cd-b468-4f14-ae20-555ed823e84b/76ed4113-51b6-44fd-a3cd-3bd64bf93685
Unfortunately we can't find it, and the solution on the forum states that it can only be found in the associated logical volume, but I think that applies only when the VM is running.
The disk images we have been trying to boot from are the ones we copied from the gluster bricks, but they are far smaller than the real images and don't boot.
I'd strongly suggest deploying a new environment over a new storage domain and importing the existing hosts into it; if you configure the cluster in the new engine for virt+gluster, the engine should be able to automatically detect the existing gluster volumes, and you will then be able to import the existing storage domains into the new engine. All the VM definitions are also saved periodically (every hour) to a special volume on the storage domain called OVF_STORE, so you will find your existing VMs in the new engine and be able to start them from there.
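As a side note, in recent oVirt versions the OVF_STORE volumes are plain tar archives containing one .ovf file per VM, so the VM definitions can also be pulled out by hand from the mounted storage domain if needed. A rough sketch, using the storage-domain path from this thread and placeholder UUIDs:

  # locate the OVF_STORE volumes: their .meta usually carries an OVF_STORE description
  cd /rhev/data-center/mnt/glusterSD/glustermount.goku:_vmstore/9f8ef3f6-53f2-4b02-8a6b-e171b000b420/images
  grep -l OVF_STORE */*.meta

  # the OVF_STORE volume itself should be a tar archive with one <vm-uuid>.ovf per VM
  tar -tvf <image-uuid>/<volume-uuid>
  mkdir -p /tmp/ovf && tar -xvf <image-uuid>/<volume-uuid> -C /tmp/ovf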

Adding to what my colleague and I shared:

I was able to locate the disk images of the VMs. I copied some of them and tried to boot them on another standalone KVM host, but booting from the disk images wasn't successful as they landed in rescue mode. The strange part is that the VM disk images are 64 MB in size, which doesn't seem normal for a disk image (see the command extract below).

[root@gohan 019a7072-43d5-44b5-bb86-7a7327f02087]# pwd
/gluster_bricks/data/data/659de125-5671-4777-b27e-974aec0a4c9c/images/019a7072-43d5-44b5-bb86-7a7327f02087
[root@gohan 019a7072-43d5-44b5-bb86-7a7327f02087]# ll -h
total 66M
-rw-rw----. 2 vdsm kvm  64M Mar 22 13:48 f5f97478-6ccb-48bc-93b7-2fd5939f40bf
-rw-rw----. 2 vdsm kvm 1.0M Mar  4 12:10 f5f97478-6ccb-48bc-93b7-2fd5939f40bf.lease
-rw-r--r--. 2 vdsm kvm  317 Mar 22 11:06 f5f97478-6ccb-48bc-93b7-2fd5939f40bf.meta

Please share insights on how I can reconstruct the disk image so that it becomes bootable on the KVM host.

Thanks in advance for the reply.

On Fri, Apr 12, 2019 at 11:16 AM <tau@sanren.ac.za> wrote:
Adding to what my colleague and I shared:
I was able to locate the disk images of the VMs. I copied some of them and tried to boot them on another standalone KVM host, but booting from the disk images wasn't successful as they landed in rescue mode. The strange part is that the VM disk images are 64 MB in size, which doesn't seem normal for a disk image (see the command extract below).
[root@gohan 019a7072-43d5-44b5-bb86-7a7327f02087]# pwd
/gluster_bricks/data/data/659de125-5671-4777-b27e-974aec0a4c9c/images/019a7072-43d5-44b5-bb86-7a7327f02087
[root@gohan 019a7072-43d5-44b5-bb86-7a7327f02087]# ll -h
total 66M
-rw-rw----. 2 vdsm kvm  64M Mar 22 13:48 f5f97478-6ccb-48bc-93b7-2fd5939f40bf
-rw-rw----. 2 vdsm kvm 1.0M Mar  4 12:10 f5f97478-6ccb-48bc-93b7-2fd5939f40bf.lease
-rw-r--r--. 2 vdsm kvm  317 Mar 22 11:06 f5f97478-6ccb-48bc-93b7-2fd5939f40bf.meta
Please share insights on how I can reconstruct the disk image so that it can become bootable on the kvm host.
Thanks in advance for the reply.
I'd suggest double-checking all the gluster logs, because 64 MB doesn't seem reasonable there at all.
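If it helps, the client-side mount log and the heal status are usually the first places to look; something along these lines, assuming the volume is the one called "data" seen in the brick paths above (the log file name follows the mount point, so it may differ on your hosts):

  # fuse client log for the mount (name pattern: mount point with '/' replaced by '-')
  less /var/log/glusterfs/rhev-data-center-mnt-glusterSD-glustermount.goku:_data.log

  # pending heals on the volume
  gluster volume heal data info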

On Fri, Apr 12, 2019, 12:16 <tau@sanren.ac.za> wrote:
Adding to what my colleague and I shared:
I was able to locate the disk images of the VMs. I copied some of them and tried to boot them on another standalone KVM host, but booting from the disk images wasn't successful as they landed in rescue mode. The strange part is that the VM disk images are 64 MB in size, which doesn't seem normal for a disk image (see the command extract below).
[root@gohan 019a7072-43d5-44b5-bb86-7a7327f02087]# pwd
/gluster_bricks/data/data/659de125-5671-4777-b27e-974aec0a4c9c/images/019a7072-43d5-44b5-bb86-7a7327f02087
[root@gohan 019a7072-43d5-44b5-bb86-7a7327f02087]# ll -h
total 66M
-rw-rw----. 2 vdsm kvm  64M Mar 22 13:48 f5f97478-6ccb-48bc-93b7-2fd5939f40bf
-rw-rw----. 2 vdsm kvm 1.0M Mar  4 12:10 f5f97478-6ccb-48bc-93b7-2fd5939f40bf.lease
-rw-r--r--. 2 vdsm kvm  317 Mar 22 11:06 f5f97478-6ccb-48bc-93b7-2fd5939f40bf.meta
I'd suggest mounting the gluster volume and getting the disk image from there, instead of directly from the brick (a rough example follows below).
Please share insights on how I can reconstruct the disk image so that it can become bootable on the kvm host.
Thanks in advance for the reply.
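A minimal sketch of that, assuming the volume is called "data" (inferred from the brick path), that the glusterfs-fuse client is available on the host, and reusing the server name seen earlier in the thread (use whichever gluster server actually serves this volume):

  # mount the volume through the fuse client instead of reading the brick directly
  mkdir -p /mnt/data
  mount -t glusterfs glustermount.goku:/data /mnt/data

  # the image should show its full size here, and qemu-img can confirm the format
  ls -lh /mnt/data/659de125-5671-4777-b27e-974aec0a4c9c/images/019a7072-43d5-44b5-bb86-7a7327f02087/
  qemu-img info /mnt/data/659de125-5671-4777-b27e-974aec0a4c9c/images/019a7072-43d5-44b5-bb86-7a7327f02087/f5f97478-6ccb-48bc-93b7-2fd5939f40bf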

It looks like I've got the exact same issue:

drwxr-xr-x.  2 vdsm kvm 4.0K Mar 29 16:01 .
drwxr-xr-x. 22 vdsm kvm 4.0K Mar 29 18:34 ..
-rw-rw----.  1 vdsm kvm  64M Feb  4 01:32 44781cef-173a-4d84-88c5-18f7310037b4
-rw-rw----.  1 vdsm kvm 1.0M Oct 16  2018 44781cef-173a-4d84-88c5-18f7310037b4.lease
-rw-r--r--.  1 vdsm kvm  311 Mar 29 16:00 44781cef-173a-4d84-88c5-18f7310037b4.meta

Within the meta file the image is marked legal and reports a size of SIZE=41943040; interestingly, the format is marked RAW, while it was created as a thin volume. My suspicion is that something went wrong while the volume was being live-migrated, and somehow the merging of the images broke the volume.
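For what it's worth, if I recall correctly the SIZE field in the vdsm .meta file is expressed in 512-byte blocks, so SIZE=41943040 would correspond to a 20 GiB virtual disk rather than 64 MB. The actual format and consistency of the file can be checked with qemu-img, run against the image on a mounted volume (hypothetical path below):

  qemu-img info /path/on/mounted/volume/44781cef-173a-4d84-88c5-18f7310037b4
  # for qcow2 images, an integrity check may also be informative
  qemu-img check /path/on/mounted/volume/44781cef-173a-4d84-88c5-18f7310037b4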

The data chunks are under the .glusterfs folder on the bricks now; it's not a single huge file you can easily access from a brick. I'm not sure when that change was introduced, though.

Fil

On Thu, May 9, 2019, 10:43 AM <olaf.buitelaar@gmail.com> wrote:
It looks like I've got the exact same issue:
drwxr-xr-x.  2 vdsm kvm 4.0K Mar 29 16:01 .
drwxr-xr-x. 22 vdsm kvm 4.0K Mar 29 18:34 ..
-rw-rw----.  1 vdsm kvm  64M Feb  4 01:32 44781cef-173a-4d84-88c5-18f7310037b4
-rw-rw----.  1 vdsm kvm 1.0M Oct 16  2018 44781cef-173a-4d84-88c5-18f7310037b4.lease
-rw-r--r--.  1 vdsm kvm  311 Mar 29 16:00 44781cef-173a-4d84-88c5-18f7310037b4.meta
Within the meta file the image is marked legal and reports a size of SIZE=41943040; interestingly, the format is marked RAW, while it was created as a thin volume. My suspicion is that something went wrong while the volume was being live-migrated, and somehow the merging of the images broke the volume.

This listing is from a gluster mount, not from the underlying brick, which should combine all parts from the underlying .glusterfs folder. I believe that when you use features.shard the files are broken up into pieces according to the shard size.

Olaf
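A way to check that theory on this setup: the default shard block size is 64 MB, which would line up with the 64M files seen on the bricks. Roughly, with the volume name "data" and paths taken from earlier in the thread:

  # is sharding enabled, and with what block size?
  gluster volume get data features.shard
  gluster volume get data features.shard-block-size

  # on a brick, the base file only holds the first shard; the rest live under the
  # hidden .shard directory at the brick root, named <gfid>.1, <gfid>.2, ...
  getfattr -d -m . -e hex /gluster_bricks/data/data/659de125-5671-4777-b27e-974aec0a4c9c/images/019a7072-43d5-44b5-bb86-7a7327f02087/f5f97478-6ccb-48bc-93b7-2fd5939f40bf
  ls -lh /gluster_bricks/data/data/.shard/ | grep <gfid-from-the-trusted.gfid-xattr>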

Sorry, it wasn't clear from your post; earlier in the thread tau@sanren.ac.za was clearly listing data on the brick, not on a mounted glusterfs volume.

--
Dmitry Filonov
Linux Administrator
SBGrid Core | Harvard Medical School
250 Longwood Ave, SGM-114
Boston, MA 02115

On Thu, May 9, 2019 at 1:56 PM <olaf.buitelaar@gmail.com> wrote:
This listing is from a gluster mount, not from the underlying brick, which should combine all parts from the underlying .glusterfs folder. I believe that when you use features.shard the files are broken up into pieces according to the shard size.
Olaf

Hi Dmitry,

Sorry for not being clearer; I missed the part about the ls being from the underlying brick. Then I clearly have a different issue.

Best, Olaf

Basically I want to take out all of the HDDs in the main gluster pool and replace them with SSDs.

My thought was to put everything in maintenance and copy the data manually over to a transient storage server, then destroy the gluster volume, swap in all the new drives, build a new gluster volume with the same name and settings, move the data back, and be done.

Any thoughts on this?

On Thu, May 9, 2019, 20:57 Alex McWhirter <alex@triadic.us> wrote:
Basically I want to take out all of the HDDs in the main gluster pool and replace them with SSDs.
My thought was to put everything in maintenance and copy the data manually over to a transient storage server, then destroy the gluster volume, swap in all the new drives, build a new gluster volume with the same name and settings, move the data back, and be done.
Any thoughts on this?
Seems ok if you can afford downtime. Otherwise you can go one node at a time.
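For the node-at-a-time route, the usual pattern is a brick replacement followed by a self-heal before moving on to the next node; a rough sketch with hypothetical brick paths and the volume name as a placeholder:

  # replace one brick of the replica with the new SSD-backed brick
  gluster volume replace-brick <volname> server1:/gluster_bricks/old_hdd/brick server1:/gluster_bricks/new_ssd/brick commit force

  # wait for the heal to finish before touching the next node
  gluster volume heal <volname> info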
participants (7)
- Alex K
- Alex McWhirter
- Dmitry Filonov
- olaf.buitelaar@gmail.com
- Sakhi Hadebe
- Simone Tiraboschi
- tau@sanren.ac.za