[ovirt-users] Diagnosing Migration Error

Nir Soffer nsoffer at redhat.com
Fri Mar 18 19:47:20 UTC 2016


On Fri, Mar 18, 2016 at 9:37 PM, Nir Soffer <nsoffer at redhat.com> wrote:
> On Fri, Mar 18, 2016 at 9:27 PM, Nir Soffer <nsoffer at redhat.com> wrote:
>> On Fri, Mar 18, 2016 at 7:59 PM, Charles Tassell <charles at islandadmin.ca> wrote:
>>> Hi Martin,
>>>
>>>   Thanks, I checked that log and found the error " AttributeError:
>>> 'SourceThread' object has no attribute '_destServer'" which from what I see
>>> from my googling was supposedly fixed...  I'm running
>>> vdsm-4.17.23-0.el7.centos.noarch which I believe is the latest.  The full
>>> error is:
>>>
>>> Thread-2789::DEBUG::2016-03-18
>>> 14:47:04,200::blockSD::422::Storage.Misc.excCmd::(getReadDelay)
>>> /usr/bin/taskset --cpu-list 0-23 /usr/bin/dd
>>> if=/dev/383a1e7d-8b86-4a50-8bc3-e4374c57a3e5/metadata iflag=direct
>>> of=/dev/null bs=4096 count=1 (cwd None)
>>> Thread-2789::DEBUG::2016-03-18
>>> 14:47:04,211::blockSD::422::Storage.Misc.excCmd::(getReadDelay) SUCCESS:
>>> <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied,
>>> 0.000377289 s, 10.9 MB/s\n'; <rc> = 0
>>> mailbox.SPMMonitor::DEBUG::2016-03-18
>>> 14:47:04,999::storage_mailbox::735::Storage.Misc.excCmd::(_checkForMail)
>>> /usr/bin/taskset --cpu-list 0-23 dd
>>> if=/rhev/data-center/00000001-0001-0001-0001-0000000002e9/mastersd/dom_md/inbox
>>> iflag=direct,fullblock count=1 bs=1024000 (cwd None)
>>> mailbox.SPMMonitor::DEBUG::2016-03-18
>>> 14:47:05,015::storage_mailbox::735::Storage.Misc.excCmd::(_checkForMail)
>>> SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB)
>>> copied, 0.0057206 s, 179 MB/s\n'; <rc> = 0
>>> jsonrpc.Executor/0::DEBUG::2016-03-18
>>> 14:47:06,551::__init__::503::jsonrpc.JsonRpcServer::(_serveRequest) Calling
>>> 'VM.migrate' in bridge with {u'params': {u'tunneled': u'false',
>>> u'autoConverge': u'false', u'src': u'ovirt-01.virt.roblib.upei.ca', u'dst':
>>> u'ovirt-02.virt.roblib.upei.ca:54321', u'vmId':
>>> u'2be4938e-f4a3-4322-bae3-8a9628b81835', u'abortOnError': u'true',
>>> u'compressed': u'false', u'method': u'online'}, u'vmID':
>>> u'2be4938e-f4a3-4322-bae3-8a9628b81835'}
>>> jsonrpc.Executor/0::DEBUG::2016-03-18 14:47:06,552::API::546::vds::(migrate)
>>> {u'tunneled': u'false', u'autoConverge': u'false', u'src':
>>> u'ovirt-01.virt.roblib.upei.ca', u'dst':
>>> u'ovirt-02.virt.roblib.upei.ca:54321', u'vmId':
>>> u'2be4938e-f4a3-4322-bae3-8a9628b81835', u'abortOnError': u'true',
>>> u'compressed': u'false', u'method': u'online'}
>>> jsonrpc.Executor/0::DEBUG::2016-03-18
>>> 14:47:06,555::__init__::206::jsonrpc.Notification::(emit) Sending event
>>> {"params": {"2be4938e-f4a3-4322-bae3-8a9628b81835": {"status": "Migration
>>> Source"}, "notify_time": 4311153220}, "jsonrpc": "2.0", "method":
>>> "|virt|VM_status|2be4938e-f4a3-4322-bae3-8a9628b81835"}
>>> jsonrpc.Executor/0::DEBUG::2016-03-18
>>> 14:47:06,555::__init__::533::jsonrpc.JsonRpcServer::(_serveRequest) Return
>>> 'VM.migrate' in bridge with True
>>> Thread-41096::ERROR::2016-03-18
>>> 14:47:06,559::migration::208::virt.vm::(_recover)
>>> vmId=`2be4938e-f4a3-4322-bae3-8a9628b81835`::[Errno -2] Name or service not
>>> known
>>> Thread-41096::ERROR::2016-03-18
>>> 14:47:06,559::migration::213::virt.vm::(_recover)
>>> vmId=`2be4938e-f4a3-4322-bae3-8a9628b81835`::Failed to destroy remote VM
>>> Traceback (most recent call last):
>>>   File "/usr/share/vdsm/virt/migration.py", line 211, in _recover
>>>     self._destServer.destroy(self._vm.id)
>>> AttributeError: 'SourceThread' object has no attribute '_destServer'
>>> Thread-41096::DEBUG::2016-03-18
>>> 14:47:06,560::__init__::206::jsonrpc.Notification::(emit) Sending event
>>> {"params": {"2be4938e-f4a3-4322-bae3-8a9628b81835": {"status": "Migration
>>> Source"}, "notify_time": 4311153220}, "jsonrpc": "2.0", "method":
>>> "|virt|VM_status|2be4938e-f4a3-4322-bae3-8a9628b81835"}
>>> Thread-41096::ERROR::2016-03-18 14:47:06,560::migration::309::virt.vm::(run)
>>> vmId=`2be4938e-f4a3-4322-bae3-8a9628b81835`::Failed to migrate
>>> Traceback (most recent call last):
>>>   File "/usr/share/vdsm/virt/migration.py", line 278, in run
>>>     self._setupVdsConnection()
>>>   File "/usr/share/vdsm/virt/migration.py", line 143, in _setupVdsConnection
>>>     client = self._createClient(port)
>>>   File "/usr/share/vdsm/virt/migration.py", line 130, in _createClient
>>>     self.remoteHost, int(port), sslctx)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 1269, in
>>> create_connected_socket
>>>     sock.connect((host, port))
>>>   File "/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py", line
>>> 181, in connect
>>>     self.socket.connect(addr)
>>>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>>     return getattr(self._sock,name)(*args)
>>> gaierror: [Errno -2] Name or service not known
>>
>> This looks like a bug in the current code (looking in 4.0 development code).
>
> Correction, this is fixed in 4.0, but broken in latest 3.6 (v4.17.24).
>
> This issue was fixed in https://gerrit.ovirt.org/52897, but not backported
> to 3.6.

Update: there is patch here if you want to test it:
http://gerrit.ovirt.org/54949

It will not fix the name solution error but at least remove the noise
about the non
existing variable.

Nir

>
>>
>> 1. We try to connect to the remote host:
>> 2. Connectioin fails with DNS resolving error
>> 3. self._destServer is not initialized becuase of 2
>> 4. Code in _recover fail when access self._destServer
>>
>> We should fix the initialization to avoid this error, but it will fix
>> your issue, which is
>> DNS error.
>>
>> Maybe the address of the destination vm is wrong, or you have DNS issues in your
>> environment?
>>
>> Adding Francesco.
>>
>> Nir
>>
>>>
>>>
>>> On 16-03-18 01:22 PM, Martin Sivak wrote:
>>>>
>>>> Hi Charles,
>>>>
>>>> you can probably get much more detailed explanation from the vdsm and
>>>> libvirt logs from the two hosts. Take a look at
>>>> /var/log/vdsm/vdsm.log.
>>>>
>>>> Best regards
>>>>
>>>> Martin Sivak
>>>> SLA / oVirt
>>>>
>>>>
>>>> On Fri, Mar 18, 2016 at 3:37 PM, Charles Tassell <charles at islandadmin.ca>
>>>> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>>    I'm trying to test migrating a VM from one host to another and it's
>>>>> failing, but with no apparent error message.  I took a look at the server
>>>>> logs (which I've copied below) but there doesn't seem to be anything in
>>>>> here
>>>>> that would help track down the problem.  Should I be looking somewhere
>>>>> besides engine.log on the hosted engine? This is the latest version of
>>>>> oVirt
>>>>> (3.6.3.4-1.el7.centos) that I just upgraded this morning.  The VM has an
>>>>> ISO
>>>>> attached over NFS storage and the disk image is on fiber channel.
>>>>>
>>>>> 2016-03-18 14:31:18,913 INFO
>>>>> [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-25)
>>>>> [549545f2] Lock Acquired to object
>>>>> 'EngineLock:{exclusiveLocks='[2be4938e-f4a3-4322-bae3-8a9628b81835=<VM,
>>>>> ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED$VmName cdTest02>]',
>>>>> sharedLocks='null'}'
>>>>> 2016-03-18 14:31:19,131 INFO
>>>>> [org.ovirt.engine.core.bll.MigrateVmToServerCommand]
>>>>> (org.ovirt.thread.pool-8-thread-45) [549545f2] Running command:
>>>>> MigrateVmToServerCommand internal: false. Entities affected :  ID:
>>>>> 2be4938e-f4a3-4322-bae3-8a9628b81835 Type: VMAction group MIGRATE_VM with
>>>>> role type USER
>>>>> 2016-03-18 14:31:19,186 INFO
>>>>> [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand]
>>>>> (org.ovirt.thread.pool-8-thread-45) [549545f2] START, MigrateVDSCommand(
>>>>> MigrateVDSCommandParameters:{runAsync='true',
>>>>> hostId='cbfd733b-8ced-487d-8754-a2217ce1210f',
>>>>> vmId='2be4938e-f4a3-4322-bae3-8a9628b81835',
>>>>> srcHost='ovirt-01.virt.roblib.upei.ca',
>>>>> dstVdsId='1200a78f-6d05-4e5e-9ef7-6798cf741310',
>>>>> dstHost='ovirt-02.virt.roblib.upei.ca:54321', migrationMethod='ONLINE',
>>>>> tunnelMigration='false', migrationDowntime='0', autoConverge='false',
>>>>> migrateCompressed='false', consoleAddress='null'}), log id: 8f303ba
>>>>> 2016-03-18 14:31:19,188 INFO
>>>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]
>>>>> (org.ovirt.thread.pool-8-thread-45) [549545f2] START,
>>>>> MigrateBrokerVDSCommand(HostName = oVirt-01,
>>>>> MigrateVDSCommandParameters:{runAsync='true',
>>>>> hostId='cbfd733b-8ced-487d-8754-a2217ce1210f',
>>>>> vmId='2be4938e-f4a3-4322-bae3-8a9628b81835',
>>>>> srcHost='ovirt-01.virt.roblib.upei.ca',
>>>>> dstVdsId='1200a78f-6d05-4e5e-9ef7-6798cf741310',
>>>>> dstHost='ovirt-02.virt.roblib.upei.ca:54321', migrationMethod='ONLINE',
>>>>> tunnelMigration='false', migrationDowntime='0', autoConverge='false',
>>>>> migrateCompressed='false', consoleAddress='null'}), log id: 792d7e9e
>>>>> 2016-03-18 14:31:19,455 INFO
>>>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand]
>>>>> (org.ovirt.thread.pool-8-thread-45) [549545f2] FINISH,
>>>>> MigrateBrokerVDSCommand, log id: 792d7e9e
>>>>> 2016-03-18 14:31:19,470 INFO
>>>>> [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand]
>>>>> (org.ovirt.thread.pool-8-thread-45) [549545f2] FINISH, MigrateVDSCommand,
>>>>> return: MigratingFrom, log id: 8f303ba
>>>>> 2016-03-18 14:31:19,498 INFO
>>>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>>>> (org.ovirt.thread.pool-8-thread-45) [549545f2] Correlation ID: 549545f2,
>>>>> Job
>>>>> ID: 3f97f737-d32a-4ffa-b213-122dcd3ca048, Call Stack: null, Custom Event
>>>>> ID:
>>>>> -1, Message: Migration started (VM: cdTest02, Source: oVirt-01,
>>>>> Destination:
>>>>> oVirt-02, User: admin at internal).
>>>>> 2016-03-18 14:31:28,055 INFO [org.ovirt.engine.core.vdsbroker.VmAnalyzer]
>>>>> (DefaultQuartzScheduler_Worker-39) [] VM
>>>>> '2be4938e-f4a3-4322-bae3-8a9628b81835'(cdTest02) moved from
>>>>> 'MigratingFrom'
>>>>> --> 'Up'
>>>>> 2016-03-18 14:31:28,055 INFO [org.ovirt.engine.core.vdsbroker.VmAnalyzer]
>>>>> (DefaultQuartzScheduler_Worker-39) [] Adding VM
>>>>> '2be4938e-f4a3-4322-bae3-8a9628b81835' to re-run list
>>>>> 2016-03-18 14:31:28,114 ERROR
>>>>> [org.ovirt.engine.core.vdsbroker.VmsMonitoring]
>>>>> (DefaultQuartzScheduler_Worker-39) [] Rerun VM
>>>>> '2be4938e-f4a3-4322-bae3-8a9628b81835'. Called from VDS 'oVirt-01'
>>>>> 2016-03-18 14:31:28,202 INFO
>>>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand]
>>>>> (org.ovirt.thread.pool-8-thread-46) [] START,
>>>>> MigrateStatusVDSCommand(HostName = oVirt-01,
>>>>> MigrateStatusVDSCommandParameters:{runAsync='true',
>>>>> hostId='cbfd733b-8ced-487d-8754-a2217ce1210f',
>>>>> vmId='2be4938e-f4a3-4322-bae3-8a9628b81835'}), log id: 618baaa6
>>>>> 2016-03-18 14:31:29,209 INFO
>>>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand]
>>>>> (org.ovirt.thread.pool-8-thread-46) [] FINISH, MigrateStatusVDSCommand,
>>>>> log
>>>>> id: 618baaa6
>>>>> 2016-03-18 14:31:29,250 ERROR
>>>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>>>> (org.ovirt.thread.pool-8-thread-46) [] Correlation ID: 549545f2, Job ID:
>>>>> 3f97f737-d32a-4ffa-b213-122dcd3ca048, Call Stack: null, Custom Event ID:
>>>>> -1,
>>>>> Message: Migration failed  (VM: cdTest02, Source: oVirt-01, Destination:
>>>>> oVirt-02).
>>>>> 2016-03-18 14:31:29,262 INFO
>>>>> [org.ovirt.engine.core.bll.MigrateVmToServerCommand]
>>>>> (org.ovirt.thread.pool-8-thread-46) [] Lock freed to object
>>>>> 'EngineLock:{exclusiveLocks='[2be4938e-f4a3-4322-bae3-8a9628b81835=<VM,
>>>>> ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED$VmName cdTest02>]',
>>>>> sharedLocks='null'}'
>>>>> _______________________________________________
>>>>> Users mailing list
>>>>> Users at ovirt.org
>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users



More information about the Users mailing list