[Users] VM has paused due to unknown storage error
David Wilson
dave at dcdata.co.za
Mon Dec 3 10:51:26 EST 2012
Hi everyone,
Thank you for your responses.
Yeela:
> When oVirt started the vm it mounted the corrupt disk image that
> seemed fine, but it couldn't find the OS because of the corrupted fs,
> and the error caused it to pause the guest.
To clear things up, it was the /var mount that had the corrupted
filesystem, not the guest's root filesystem.
oVirt still continued to pause the guest even when I had booted the
guest off a CentOS DVD ISO, run rescue mode and manually activated the
/var logical volume. In this case I did not activate or mount the guest's
root filesystem and only activated the guest's /var filesystem so that I
could fsck it. fsck would run for around 5-10 minutes with the message
"Deleting orphaned inode......" and then oVirt would simply pause the
entire guest.
The only information I could find was on the physical host's vdsm.log,
which specified the following:
libvirtEventLoop::INFO::2012-12-02
09:05:50,296::libvirtvm::1965::vm.Vm::(_onAbnormalStop)
vmId=`23b9212c-1e25-4003-aa18-b1e819bf6bb1`::abnormal vm stop device
ide0-0-1 error eother

Perhaps there was another log I should have examined to see if more
information was provided about why oVirt was pausing the guest?
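
For anyone searching vdsm.log for the same symptom, here is a minimal Python sketch that pulls the VM id, device and error code out of an abnormal-stop entry. The regular expression is an assumption based only on the single entry quoted above; other vdsm versions may format the line differently. The function name `parse_abnormal_stop` is made up for illustration. As Dan notes in the quoted reply below, an enospace error is handled automatically by lv-extend, whereas eother needs a human to look at it.

```python
import re

# Pattern derived from the one log entry quoted in this message (assumption:
# other vdsm versions may word or lay out the entry differently).
ABNORMAL_STOP = re.compile(
    r"vmId=`(?P<vm_id>[0-9a-f-]+)`::abnormal vm stop "
    r"device (?P<device>\S+) error (?P<error>\S+)"
)


def parse_abnormal_stop(entry):
    """Return vm_id, device and error from a log entry, or None if absent."""
    # Collapse line wraps and repeated spaces before matching.
    normalized = " ".join(entry.split())
    match = ABNORMAL_STOP.search(normalized)
    return match.groupdict() if match else None


sample = (
    "libvirtEventLoop::INFO::2012-12-02 "
    "09:05:50,296::libvirtvm::1965::vm.Vm::(_onAbnormalStop) "
    "vmId=`23b9212c-1e25-4003-aa18-b1e819bf6bb1`::abnormal vm stop device "
    "ide0-0-1 error eother"
)

if __name__ == "__main__":
    print(parse_abnormal_stop(sample))
```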
Shu:
This is what I did to dd the images off, and to work around the problem:
1.) On the physical host: Created an NFS mount to another temporary
Linux system that had sufficient storage for the 500GB filesystem.
2.) On the physical host: Used 'dd' to dump the /var filesystem's
logical volume to an image file via NFS on the temporary Linux system.
3.) On the temporary Linux system that now contained the filesystem
image file, I ran "qemu-img info" and noticed that the filesystem image
was qcow2 type and specified a backing file.
4.) On the physical host: Used 'dd' to dump the logical volume specified
as a backing file, to an image file via NFS on the temporary Linux system.
5.) On the temporary Linux system: Used 'qemu-img rebase' to change the
backing file to the local copy of the backing file image.
6.) On the temporary Linux system: Used 'qemu-img commit' to commit the
changes stored in the filesystem image file to the backing file image.
7.) On the temporary Linux system: Used 'qemu-img convert' to convert
the backing file image to raw format.
8.) On the temporary Linux system: Used 'losetup', 'kpartx' and 'fsck' to
repair the backing file image. Fsck displayed the same 'Deleting
orphaned inode....' message but managed to continue and completed ok.
9.) On the temporary Linux system: Mounted the loop filesystem and
confirmed that the data was intact and was current.
10.) In the oVirt GUI: Deactivated the faulty Virtual Disk attached to
the guest.
11.) In the oVirt GUI: Created a new 'preallocated' Virtual Disk of
sufficient size for the guest.
12.) On the physical host: Used 'dd' to upload the raw backing file
image from (7) to the new logical volume.
13.) I then configured the guest to boot from the CentOS DVD ISO into
rescue mode to confirm that the logical volume for the guest's /var
filesystem was accessible and mountable.
14.) Reconfigured the guest to boot from its primary Virtual Disk and
started up the guest.
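
The qemu-img part of the procedure (steps 3 and 5 to 7, all run on the temporary Linux system) can be sketched roughly as follows. This is a dry-run helper in Python that only builds and prints the command lines; it does not execute anything. The file paths, the function name `recovery_commands`, and the `backing_fmt` default of qcow2 are all hypothetical. The use of `-u` (unsafe rebase, which only rewrites the backing-file reference in the header) is also an assumption, made because the backing path recorded in the image would not exist on the temporary system; the exact options used in the steps above are not recorded here.

```python
import shlex


def recovery_commands(overlay, backing, raw_out, backing_fmt="qcow2"):
    """Build qemu-img command lines for steps 3 and 5-7 as argv lists.

    overlay     -- local copy of the qcow2 image dumped in step 2
    backing     -- local copy of the backing image dumped in step 4
    raw_out     -- destination for the raw image produced in step 7
    backing_fmt -- format of the backing image (assumed, not from the post)
    """
    return [
        # Step 3: inspect the overlay; the output names its backing file.
        ["qemu-img", "info", overlay],
        # Step 5: point the overlay at the local backing copy.
        # -u is an assumption: it changes only the header reference.
        ["qemu-img", "rebase", "-u", "-b", backing, "-F", backing_fmt, overlay],
        # Step 6: merge the overlay's changes into the backing image.
        ["qemu-img", "commit", overlay],
        # Step 7: convert the merged backing image to raw format.
        ["qemu-img", "convert", "-f", backing_fmt, "-O", "raw", backing, raw_out],
    ]


if __name__ == "__main__":
    # Hypothetical paths on the temporary system.
    plan = recovery_commands(
        "/srv/rescue/var-overlay.img",
        "/srv/rescue/var-backing.img",
        "/srv/rescue/var.raw",
    )
    for argv in plan:
        print(shlex.join(argv))
```

Printing the commands first makes it easy to review them before running anything against the only copies of the data. It is worth keeping a second copy of both image files before the commit step, since 'qemu-img commit' modifies the backing image in place.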
Kind regards,
David Wilson
CNS,CLS, LINUX+, CLA, DCTS, LPIC3
*LinuxTech CC t/a DcData*
CK number: 2001/058368/23
*Website:* http://www.dcdata.co.za
*Support:* +27(0)860-1-LINUX
*Mobile:* +27(0)824147413
*Tel:* +27(0)333446100
*Fax:* +27(0)866878971
On 12/03/2012 01:07 PM, Dan Kenigsberg wrote:
> On Mon, Dec 03, 2012 at 04:37:01AM -0500, Yeela Kaplan wrote:
>> Glad to hear it worked out.
>>
>> When oVirt started the vm it mounted the corrupt disk image that seemed fine,
>> but it couldn't find the OS because of the corrupted fs,
>> and the error caused it to pause the guest.
> I think we should be a bit more exact here: a VM without an installed OS
> does not pause. The cause of the pause was, most probably, an attempt
> to read from a corrupted qcow. When qemu fails to serve the guest with
> data due to a bug in the underlying storage, qemu stops and waits for
> further instructions from management. We use this feature for automatic
> lv-extend (on enospace error). But here, with eother error, human
> intervention is required.
>
>