Hi again

I tried to reproduce this issue and unfortunately I couldn't. So there must be something else going on, but I cannot figure out what is wrong, because it happens only with some VMs. Nonetheless, I have attached new log files to investigate.

And here is some more info:
ovirt-engine-setup-3.3.2-1.el6.noarch
ovirt-engine-webadmin-portal-3.3.2-1.el6.noarch
ovirt-host-deploy-java-1.1.2-1.el6.noarch
ovirt-engine-restapi-3.3.2-1.el6.noarch
ovirt-image-uploader-3.3.1-1.el6.noarch
ovirt-release-el6-10-1.noarch
ovirt-engine-userportal-3.3.2-1.el6.noarch
ovirt-engine-backend-3.3.2-1.el6.noarch
ovirt-engine-websocket-proxy-3.3.2-1.el6.noarch
ovirt-engine-dbscripts-3.3.2-1.el6.noarch
ovirt-log-collector-3.3.1-1.el6.noarch
ovirt-engine-sdk-python-3.3.0.8-1.el6.noarch
ovirt-engine-cli-3.3.0.6-1.el6.noarch
ovirt-engine-tools-3.3.2-1.el6.noarch
ovirt-iso-uploader-3.3.1-1.el6.noarch
ovirt-engine-lib-3.3.2-1.el6.noarch
ovirt-engine-3.3.2-1.el6.noarch
ovirt-host-deploy-1.1.2-1.el6.noarch

vdsm-xmlrpc-4.13.2-1.el6.noarch
vdsm-python-4.13.2-1.el6.x86_64
vdsm-python-cpopen-4.13.2-1.el6.x86_64
vdsm-4.13.2-1.el6.x86_64
vdsm-cli-4.13.2-1.el6.noarch

libvirt-lock-sanlock-0.10.2-29.el6_5.2.x86_64
libvirt-client-0.10.2-29.el6_5.2.x86_64
libvirt-python-0.10.2-29.el6_5.2.x86_64
libvirt-0.10.2-29.el6_5.2.x86_64

CentOS 6.5 x86_64 on both Engine and Nodes.

SELinux disabled everywhere (I know, I know).
There are no snapshots for these VMs. They were installed by booting from an ISO image.
There is nothing in the libvirtd.log files.
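
In case it helps, I can turn up libvirtd logging on both nodes and retry
the migration. I would put something like this in /etc/libvirt/libvirtd.conf
(log_level, log_filters and log_outputs are standard libvirtd settings; the
filter list is just my guess at the relevant subsystems) and then restart
libvirtd:

# /etc/libvirt/libvirtd.conf
log_level = 1
log_filters = "1:qemu 1:libvirt 1:locking"
log_outputs = "1:file:/var/log/libvirt/libvirtd.log"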


BR
Edgars



On Sat, Jan 18, 2014 at 6:39 PM, Dafna Ron <dron@redhat.com> wrote:
Sounds like this is the issue to me... Edgars, can you try to confirm that? :)
Michal, was there a bug opened for this? I think that perhaps we should also add a clear error message - it would help debug this more easily.



On 01/18/2014 05:34 PM, Michal Skrivanek wrote:

On 18 Jan 2014, at 18:06, Itamar Heim <iheim@redhat.com> wrote:

On 01/18/2014 03:44 PM, Edgars M. wrote:
Hi

Thanks for your help. I will provide all the log files a little bit later,
but so far I have noticed a pattern in when migration fails. The particular
VMs which fail to migrate were installed from an ISO image, via a usual
installation. So, I believe you can reproduce the issue like this:

1. Upload an ISO image to the ISO domain.
2. Install a new OS by booting from the ISO image, with Attach CD checked
in Boot Options.
3. Delete the ISO image from the ISO domain.
4. Try to migrate the VM to another host.
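
To check whether a running VM still references the deleted ISO, something
like this on the source node should show it (a read-only virsh query, so it
is safe next to vdsm; the VM name is the one from the earlier error
messages):

virsh -r dumpxml nophpapp01 | grep -A5 "device='cdrom'"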


I have not tried this yet, but I have noticed that the only VMs that fail
are those which had been installed from an ISO image that is no longer in
the ISO domain. I also have VMs created from templates, and those I can
migrate just fine.
Did you stop the VMs post-install? Did you try to start them without the ISO attached?
Otherwise, you can't start them, as there is a missing ISO for the target qemu process.
Indeed. That is a known libvirt/qemu issue: even though the CD is defined as optional and the VM can be started without it on the original host, the migration fails when the destination is being created.
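
For reference, the CD ends up in the domain XML roughly like this (the ISO
path below is illustrative, not from Edgars' setup); startupPolicy='optional'
is honored when the VM is started, but not when the destination side of a
migration is created:

<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw'/>
  <!-- 'optional' lets the VM start without the ISO, but the migration
       destination still tries to open the file -->
  <source file='/rhev/data-center/mnt/iso-domain/example.iso' startupPolicy='optional'/>
  <target dev='hdc' bus='ide'/>
  <readonly/>
</disk>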

Thanks,
michal

I will provide more log files later.

BR
Edgars


On Sat, Jan 18, 2014 at 1:05 PM, Dafna Ron <dron@redhat.com> wrote:

    I looked at the logs, and only the vdsm log from node2 (which appears
    to be the source node) seems to have full info.

    Please attach the following complete logs:
    vdsm log from the destination node
    engine log
    libvirt logs from both nodes.

    Also, can you please answer the following?
    Which libvirt and vdsm versions are you using?
    What is the SELinux status on both hosts (enforcing/permissive)?
    Do you have snapshots on the VM, or does it happen on a newly created
    disk?
    When you create the disk, is it from a template or is it a new image?
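
    For the first two, something like this on each host should do (both are
    standard commands; the egrep pattern is just a suggestion):

    rpm -qa | egrep 'libvirt|vdsm|qemu-kvm'
    getenforce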

    Thanks,

    Dafna




    On 01/17/2014 10:21 PM, Itamar Heim wrote:

        On 01/18/2014 12:07 AM, Meital Bourvine wrote:

            I opened a bug about missing info in UI:
            https://bugzilla.redhat.com/show_bug.cgi?id=1054994

            It actually failed with this error:
            Thread-1417::DEBUG::2014-01-17 17:01:28,344::vm::768::vm.Vm::(run) vmId=`b8787906-187a-4234-a0c9-58fc4ddf2a57`::starting migration monitor thread
            Thread-1415::DEBUG::2014-01-17 17:01:28,409::libvirtconnection::108::libvirtconnection::(wrapper) Unknown libvirterror: ecode: 38 edom: 42 level: 2 message: Failed to inquire lock: No such process
            Thread-1415::DEBUG::2014-01-17 17:01:28,409::vm::745::vm.Vm::(cancel) vmId=`b8787906-187a-4234-a0c9-58fc4ddf2a57`::canceling migration downtime thread
            Thread-1415::DEBUG::2014-01-17 17:01:28,409::vm::815::vm.Vm::(stop) vmId=`b8787906-187a-4234-a0c9-58fc4ddf2a57`::stopping migration monitor thread
            Thread-1416::DEBUG::2014-01-17 17:01:28,409::vm::742::vm.Vm::(run) vmId=`b8787906-187a-4234-a0c9-58fc4ddf2a57`::migration downtime thread exiting
            Thread-1415::ERROR::2014-01-17 17:01:28,410::vm::238::vm.Vm::(_recover) vmId=`b8787906-187a-4234-a0c9-58fc4ddf2a57`::Failed to inquire lock: No such process
            Thread-1415::ERROR::2014-01-17 17:01:28,619::vm::337::vm.Vm::(run) vmId=`b8787906-187a-4234-a0c9-58fc4ddf2a57`::Failed to migrate
            Traceback (most recent call last):
              File "/usr/share/vdsm/vm.py", line 323, in run
                self._startUnderlyingMigration()
              File "/usr/share/vdsm/vm.py", line 400, in _startUnderlyingMigration
                None, maxBandwidth)
              File "/usr/share/vdsm/vm.py", line 838, in f
                ret = attr(*args, **kwargs)
              File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
                ret = f(*args, **kwargs)
              File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1178, in migrateToURI2
                if ret == -1: raise libvirtError('virDomainMigrateToURI2() failed', dom=self)
            libvirtError: Failed to inquire lock: No such process
            Thread-26::ERROR::2014-01-17 17:01:28,917::sampling::355::vm.Vm::(collect) vmId=`b8787906-187a-4234-a0c9-58fc4ddf2a57`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x26efb58>

            The problem is that it doesn't say which process...
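
            A guess: the "inquire lock" call goes through the libvirt lock
            manager, which is sanlock here (libvirt-lock-sanlock is
            installed), so it may be worth checking the sanlock daemon on
            both hosts with the standard commands:

            service sanlock status
            sanlock client status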


        This looks like noise after the migration failed; the more relevant
        error is probably in the libvirt log.


            ----- Original Message -----

                From: "Itamar Heim" <iheim@redhat.com
                <mailto:iheim@redhat.com>>
                To: "Edgars M." <edgars.mazurs@gmail.com
                <mailto:edgars.mazurs@gmail.com>>, "Meital Bourvine"
                <mbourvin@redhat.com <mailto:mbourvin@redhat.com>>
                Cc: users@ovirt.org <mailto:users@ovirt.org>, "Michal
                Skrivanek" <mskrivan@redhat.com
                <mailto:mskrivan@redhat.com>>
                Sent: Friday, January 17, 2014 9:47:11 PM
                Subject: Re: [Users] VM Migration failed

                On 01/17/2014 06:25 PM, Edgars M. wrote:

                    Hi Meital

                    I tried to migrate another VM and it also failed.

                    This is what I get in the UI:


                    Migration started (VM: nophpapp01, Source:
                    novmnode1, Destination:
                    novmnode2, User: edgarsm).
                    Migration failed due to Error: Fatal error during
                    migration. Trying to
                    migrate to another Host (VM: nophpapp01, Source:
                    novmnode1, Destination:
                    novmnode2).
                    Migration failed due to Error: Fatal error during
                    migration (VM:
                    nophpapp01, Source: novmnode1, Destination: novmnode2).

                    There is nothing in /var/log/messages, neither on the
                    engine server nor on the nodes.


                    See attachments for engine and vdsm logs. (I am
                    migrating from vmnode1
                    to vmnode2)


                Whatever the issue is here, can you please open a bug about
                returning more info to the user on the migration error
                itself (for easier troubleshooting)?

                thanks,
                      Itamar


                    Thanks for the help
                    Edgars

                    On Fri, Jan 17, 2014 at 4:46 PM, Meital Bourvine
                    <mbourvin@redhat.com> wrote:

                          Which error are you getting in the UI?
                          Can you please attach the full engine and vdsm logs?
                          Also, please check if there is a relevant error in
                          /var/log/messages.


                    ------------------------------------------------------------------------


                              *From: *"Edgars M."
                    <edgars.mazurs@gmail.com
                    <mailto:edgars.mazurs@gmail.com>
                              <mailto:edgars.mazurs@gmail.__com
                    <mailto:edgars.mazurs@gmail.com>>>
                              *To: *users@ovirt.org
                    <mailto:users@ovirt.org> <mailto:users@ovirt.org
                    <mailto:users@ovirt.org>>
                              *Sent: *Friday, January 17, 2014 3:42:37 PM
                              *Subject: *[Users] VM Migration failed


                              Hi

                              I am experiencing issues with manual VM
                              migration. The VM fails to migrate to another
                              node in the same cluster. Here are some
                              relevant engine.log entries:

                              ERROR [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-73) Rerun vm a31cfd62-26fc-4396-8a83-1aed68c7fd39. Called from vds novmnode1

                              ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (pool-6-thread-49) Failed in MigrateStatusVDS method

                              ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (pool-6-thread-49) Error code migrateErr and error message VDSGenericException: VDSErrorException: Failed to MigrateStatusVDS, error = Fatal error during migration

                              ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (pool-6-thread-49) Command MigrateStatusVDS execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to MigrateStatusVDS, error = Fatal error during migration

                              Both the Engine and the Node are running
                              CentOS 6.5 x64. oVirt Engine version:
                              3.3.2-1.el6. VDSM version: 4.13.2-1. I
                              restarted the engine and vdsm, but that did
                              not help.

                              Also, in the vdsm log file I see the following
                              errors related to the same VM ID:

                              Thread-27::ERROR::2014-01-17 16:37:06,271::sampling::355::vm.Vm::(collect) vmId=`a31cfd62-26fc-4396-8a83-1aed68c7fd39`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x26efb58>

                              Any hints?

                              BR
                              Edgars





    --
    Dafna Ron




--
Dafna Ron