Hi David,
You can also check the libvirt log.
Please send me the vdsm and libvirt logs so I can look at it too.
thanks.
----- Original Message -----
From: "David Wilson" <dave(a)dcdata.co.za>
To: "Dan Kenigsberg" <danken(a)redhat.com>
Cc: "Yeela Kaplan" <ykaplan(a)redhat.com>, users(a)ovirt.org
Sent: Monday, December 3, 2012 5:51:26 PM
Subject: Re: [Users] VM has paused due to unknown storage error
Hi everyone,
Thank you for your responses.
Yeela:
>When oVirt started the vm it mounted the corrupt disk image that
>seemed fine, but it couldn't find the OS because of the corrupted
>fs, and the error caused it to pause the guest.
To clear things up, it was the /var mount that had the corrupted
filesystem, not the guest's root filesystem.
oVirt still continued to pause the guest even when I had booted the
guest off a CentOS DVD ISO, run rescue mode and manually activated
the /var logical volume. In this case I did not activate or mount
the guest's root filesystem and only activated the guest's /var
filesystem so that I could fsck it. fsck would run for around 5-10
minutes with the message "Deleting orphaned inode......" and then
oVirt would simply pause the entire guest.
The only information I could find was on the physical host's
vdsm.log, which specified the following:
libvirtEventLoop::INFO::2012-12-02
09:05:50,296::libvirtvm::1965::vm.Vm::(_onAbnormalStop)
vmId=`23b9212c-1e25-4003-aa18-b1e819bf6bb1`::abnormal vm stop device
ide0-0-1 error eother
Perhaps there was another log I should have examined to see if more
information was provided about why oVirt was pausing the guest?
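
For anyone searching their own vdsm.log for the same event, here is a minimal Python sketch that pulls out the "abnormal vm stop" records. The pattern is inferred only from the single log line quoted above and assumes each record sits on one line in the real file (it is wrapped here by the mail client), so it may need adjusting for other vdsm versions. The function name is made up for illustration.

```python
import re

# Field layout inferred from the one sample record in this thread;
# treat it as an assumption, not a documented vdsm log format.
ABNORMAL_STOP = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2},\d{3})"
    r".*?vmId=`(?P<vm_id>[0-9a-fA-F-]+)`::abnormal vm stop "
    r"device (?P<device>\S+) error (?P<error>\S+)"
)


def find_abnormal_stops(lines):
    """Yield one dict per line that reports an abnormal VM stop."""
    for line in lines:
        match = ABNORMAL_STOP.search(line)
        if match:
            yield match.groupdict()
```

Passing an open log file to find_abnormal_stops() gives the timestamp, VM id, device and error code for each pause, which makes it easier to find the matching time window in the libvirt log.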
Shu:
This is what I did to dd the images off, and to work around the
problem:
1.) On the physical host: Created an NFS mount to another
temporary Linux system that had sufficient storage for the 500GB
filesystem.
2.) On the physical host: Used 'dd' to dump the /var filesystem's
logical volume to an image file via NFS on the temporary Linux
system.
3.) On the temporary Linux system that now contained the filesystem
image file, I ran "qemu-img info" and noticed that the filesystem
image was qcow2 type and specified a backing file.
4.) On the physical host: Used 'dd' to dump the logical volume
specified as a backing file, to an image file via NFS on the
temporary Linux system.
5.) On the temporary Linux system: Used 'qemu-img rebase' to
change the backing file to the local copy of the backing file
image.
6.) On the temporary Linux system: Used 'qemu-img commit' to commit
the changes stored in the filesystem image file to the backing file
image.
7.) On the temporary Linux system: Used 'qemu-img convert' to
convert the backing file image to raw format.
8.) On the temporary Linux system: Used 'losetup', 'kpartx' and
'fsck' to repair the backing file image. Fsck displayed the same
'Deleting orphaned inode ....' message but managed to continue and
completed ok.
9.) On the temporary Linux system: Mounted the loop filesystem and
confirmed that the data was intact and was current.
10.) In the oVirt GUI: Deactivated the faulty Virtual Disk
attached to the guest.
11.) In the oVirt GUI: Created a new 'preallocated' Virtual Disk of
sufficient size for the guest.
12.) On the physical host: Used 'dd' to upload the raw backing
file image from (7) to the new logical volume.
13.) I then configured the guest to boot from the CentOS DVD ISO
into rescue mode to confirm that the logical volume for the guest's
/var filesystem was accessible and mountable.
14.) Reconfigured the guest to boot from its primary Virtual Disk
and started up the guest.
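
In case it helps anyone repeating this, steps 2 to 7 can be sketched as a small Python helper that only builds the command lines, without running anything. All paths, image file names and the function name are hypothetical. It assumes the NFS export is mounted at the same path on both machines, and it assumes 'qemu-img rebase' is run in unsafe mode (-u), since the original backing path is a logical volume that does not exist on the temporary system. I did not record the exact options used above, so treat the flags as one plausible choice.

```python
import shlex
from pathlib import PurePosixPath


def recovery_commands(overlay_lv, backing_lv, work_dir):
    """Return argv lists for steps 2 to 7, in order, without running them."""
    work = PurePosixPath(work_dir)
    overlay_img = str(work / "var-overlay.qcow2")  # hypothetical file name
    backing_img = str(work / "var-backing.img")    # hypothetical file name
    raw_img = str(work / "var-repaired.raw")       # hypothetical file name
    return [
        # Step 2: dump the qcow2 logical volume to an image file.
        ["dd", f"if={overlay_lv}", f"of={overlay_img}", "bs=1M"],
        # Step 3: inspect the image; shows the format and the backing file.
        ["qemu-img", "info", overlay_img],
        # Step 4: dump the logical volume that serves as the backing file.
        ["dd", f"if={backing_lv}", f"of={backing_img}", "bs=1M"],
        # Step 5: point the image at the local copy of the backing file.
        # -u (unsafe) only rewrites the reference; this is an assumption.
        ["qemu-img", "rebase", "-u", "-b", backing_img, overlay_img],
        # Step 6: merge the changes in the image into its backing file.
        ["qemu-img", "commit", overlay_img],
        # Step 7: convert the merged backing file image to raw format.
        ["qemu-img", "convert", "-O", "raw", backing_img, raw_img],
    ]


def as_shell_lines(commands):
    """Render each argv list as a shell-quoted line for review."""
    return [shlex.join(argv) for argv in commands]
```

Printing as_shell_lines(recovery_commands(...)) gives a checklist to review before typing anything on the host. Newer qemu-img releases may also expect the backing format to be given with -F on the rebase step. Step 8 onward is left out because the loop and partition device names depend on the system.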
Get important Linux and industry-related news at:
facebook.com/dcdata
Kind regards,
David Wilson
CNS,CLS, LINUX+, CLA, DCTS, LPIC3
LinuxTech CC t/a DcData
CK number: 2001/058368/23
Website:
http://www.dcdata.co.za
Support: +27(0)860-1-LINUX
Mobile: +27(0)824147413
Tel: +27(0)333446100
Fax: +27(0)866878971
On 12/03/2012 01:07 PM, Dan Kenigsberg wrote:
> On Mon, Dec 03, 2012 at 04:37:01AM -0500, Yeela Kaplan wrote:
>> Glad to hear it worked out.
>> When oVirt started the vm it mounted the corrupt disk image that
>> seemed fine,
>> but it couldn't find the OS because of the corrupted fs,
>> and the error caused it to pause the guest.
> I think we should be a bit more exact here: a VM without an installed
> OS does not pause. The cause of the pause was, most probably, an
> attempt to read from a corrupted qcow. When qemu fails to serve the
> guest with data due to a bug in the underlying storage, qemu stops
> and waits for further instructions from management. We use this
> feature for automatic lv-extend (on enospace error). But here, with
> an eother error, human intervention is required.
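
The distinction Dan draws can be summarised in a few lines of Python. This is only an illustration of the two cases named in his message, not vdsm code; the function name and the returned strings are made up, and any error code other than "enospace" is treated like "eother" purely as an assumption.

```python
def action_for_paused_guest(error):
    """Map the error code from an abnormal stop to the follow-up described."""
    if error == "enospace":
        # Management can react on its own by extending the logical volume.
        return "automatic lv-extend"
    # "eother" (and, by assumption, anything else) needs an administrator.
    return "human intervention required"
```

For the log line in this thread, action_for_paused_guest("eother") returns "human intervention required", which matches what happened: the guest stayed paused until the image was repaired by hand.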