[Users] SPM host in unknown status
Shu Ming
shuming at linux.vnet.ibm.com
Mon May 28 03:09:06 UTC 2012
What does /var/log/vdsm/vdsm.log show on the two nodes? It looks like
VDSM has run into a problem.
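
To narrow a large log down quickly, something like the following Python sketch may help. It only keeps lines that carry a warning-or-worse level or start a Python traceback. The names problem_lines and summarize are made up for illustration, and the matching is deliberately loose so it can be pointed at either the vdsm log or engine.log; it is not part of vdsm or the engine.

```python
import re
from collections import Counter

# Severity words matched as whole tokens. This is an assumption about the
# log layout: it works for lines such as "... ERROR [logger] ..." and for
# "::"-separated fields, but it is not an official log parser.
LEVEL_RE = re.compile(r"\b(ERROR|WARN(?:ING)?|CRITICAL)\b")


def problem_lines(lines):
    """Yield (level, line) for lines that look like warnings, errors or tracebacks."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("Traceback (most recent call last)"):
            yield "TRACEBACK", line
            continue
        match = LEVEL_RE.search(line)
        if match:
            # Normalise WARN and WARNING to a single label.
            level = "WARNING" if match.group(1).startswith("WARN") else match.group(1)
            yield level, line


def summarize(lines):
    """Count how many problem lines of each level were seen."""
    return Counter(level for level, _ in problem_lines(lines))
```

Usage would be along the lines of opening the log file on each node, passing the file object to problem_lines(), and posting the last few hits here.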
On 2012-5-28 11:04, T-Sinjon wrote:
> 1. On node1, vdsm looks strange; it seems to be sleeping:
> [root at ovirt-node-1 ~]# systemctl status vdsmd.service
> vdsmd.service - Virtual Desktop Server Manager
> Loaded: loaded (/lib/systemd/system/vdsmd.service; enabled)
> Active: active (running) since Mon, 28 May 2012 02:43:22 +0000; 9min ago
> Process: 1157 ExecStart=/lib/systemd/systemd-vdsmd start (code=exited,
> status=0/SUCCESS)
> Main PID: 2228 (respawn)
> CGroup: name=systemd:/system/vdsmd.service
> ├ 2228 /bin/bash -e /usr/share/vdsm/respawn --minlifetime...
> └ 3573 sleep 900
> 2. No firewall is blocking traffic.
> 3. The network is OK; I can ssh into node1 from the engine.
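
A successful ssh login only shows that port 22 is reachable; it does not show that the engine can reach vdsm itself. Below is a minimal Python sketch for checking a single TCP port from the engine machine. The function name port_open is made up for illustration, and 54321 is assumed to be the vdsm management port (the usual default; adjust it if your setup differs).

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, port_open("ovirt-node-1.local", 54321) run on the engine host returning False would point at a firewall or at vdsm not listening, even though ssh works.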
>
> I have used the fence option (confirm host has been rebooted), but the
> SPM role did not move to the other node. Below is the engine.log from
> when I performed this action:
>
> 2012-05-28 10:49:51,846 INFO
> [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand]
> (pool-5-thread-49) [72d88732] Lock Acquired to object EngineLock
> [exclusiveLocks= key:
> org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value:
> ae567034-5d8e-11e1-bdc9-a7168ad4d39f
> , sharedLocks= ]
> 2012-05-28 10:49:51,847 INFO
> [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand]
> (pool-5-thread-49) [72d88732] Running command: FenceVdsManualyCommand
> internal: false. Entities affected : ID:
> ae567034-5d8e-11e1-bdc9-a7168ad4d39f Type: VDS
> 2012-05-28 10:49:51,927 INFO
> [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand]
> (pool-5-thread-49) [72d88732] Trying to fence spm ovirt-node-1.local
> via vds ovirt-node-2.local
> 2012-05-28 10:49:51,933 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand]
> (pool-5-thread-49) [72d88732] START, FenceSpmStorageVDSCommand(vdsId =
> a522a6a6-a72e-11e1-baa3-bba876a88ef4, storagePoolId =
> 524a7003-edec-4f52-a38e-b15cadfbe3ef, prevId=1, prevLVER=17), log id:
> 530cb694
> 2012-05-28 10:49:51,965 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase]
> (pool-5-thread-49) [72d88732] Command
> org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand
> return value
> Class Name:
> org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
> mStatus Class Name:
> org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
> mCode 654
> mMessage Not SPM
>
>
> 2012-05-28 10:49:51,966 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase]
> (pool-5-thread-49) [72d88732] Vds: ovirt-node-2.local
> 2012-05-28 10:49:51,966 ERROR
> [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-5-thread-49)
> [72d88732] Command FenceSpmStorageVDS execution failed. Exception:
> IRSNonOperationalException: IRSGenericException: IRSErrorException:
> IRSNonOperationalException: Not SPM
> 2012-05-28 10:49:51,966 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand]
> (pool-5-thread-49) [72d88732] FINISH, FenceSpmStorageVDSCommand, log
> id: 530cb694
> 2012-05-28 10:49:51,967 WARN
> [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand]
> (pool-5-thread-49) [72d88732] Could not fence spm on vds
> ovirt-node-2.local
> 2012-05-28 10:49:51,971 ERROR
> [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand]
> (pool-5-thread-49) [72d88732] Transaction rolled-back for command:
> org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand.
> 2012-05-28 10:49:51,971 INFO
> [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand]
> (pool-5-thread-49) [72d88732] Lock freed to object EngineLock
> [exclusiveLocks= key:
> org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value:
> ae567034-5d8e-11e1-bdc9-a7168ad4d39f
> , sharedLocks= ]
> 2012-05-28 10:49:57,457 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-79) hostFromVds::selectedVds -
> ovirt-node-2.local, spmStatus Free, storage pool BLC
> 2012-05-28 10:49:57,461 ERROR
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-79) SPM Init: could not find reported vds or
> not up - pool:BLC vds_spm_id: 1
> 2012-05-28 10:49:57,466 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-79) SPM selection - vds seems as spm
> ovirt-node-1.local
> 2012-05-28 10:49:57,466 WARN
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-79) spm vds is non responsive, stopping spm
> selection.
> 2012-05-28 10:50:00,002 INFO
> [org.ovirt.engine.core.bll.AutoRecoveryManager]
> (QuartzScheduler_Worker-87) Checking autorecoverable hosts
> 2012-05-28 10:50:00,004 INFO
> [org.ovirt.engine.core.bll.AutoRecoveryManager]
> (QuartzScheduler_Worker-87) Checking autorecoverable hosts done
> 2012-05-28 10:50:00,004 INFO
> [org.ovirt.engine.core.bll.AutoRecoveryManager]
> (QuartzScheduler_Worker-87) Checking autorecoverable storage domains
> 2012-05-28 10:50:00,006 INFO
> [org.ovirt.engine.core.bll.AutoRecoveryManager]
> (QuartzScheduler_Worker-87) Checking autorecoverable storage domains done
> 2012-05-28 10:50:07,502 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-93) hostFromVds::selectedVds -
> ovirt-node-2.local, spmStatus Free, storage pool BLC
> 2012-05-28 10:50:07,505 ERROR
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-93) SPM Init: could not find reported vds or
> not up - pool:BLC vds_spm_id: 1
> 2012-05-28 10:50:07,510 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-93) SPM selection - vds seems as spm
> ovirt-node-1.local
> 2012-05-28 10:50:07,510 WARN
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-93) spm vds is non responsive, stopping spm
> selection.
> 2012-05-28 10:50:17,551 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-34) hostFromVds::selectedVds -
> ovirt-node-2.local, spmStatus Free, storage pool BLC
> 2012-05-28 10:50:17,554 ERROR
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-34) SPM Init: could not find reported vds or
> not up - pool:BLC vds_spm_id: 1
> 2012-05-28 10:50:17,559 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-34) SPM selection - vds seems as spm
> ovirt-node-1.local
> 2012-05-28 10:50:17,559 WARN
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-34) spm vds is non responsive, stopping spm
> selection.
> 2012-05-28 10:50:27,609 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-92) hostFromVds::selectedVds -
> ovirt-node-2.local, spmStatus Free, storage pool BLC
> 2012-05-28 10:50:27,612 ERROR
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-92) SPM Init: could not find reported vds or
> not up - pool:BLC vds_spm_id: 1
> 2012-05-28 10:50:27,617 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-92) SPM selection - vds seems as spm
> ovirt-node-1.local
> 2012-05-28 10:50:27,618 WARN
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-92) spm vds is non responsive, stopping spm
> selection.
> 2012-05-28 10:50:37,652 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-67) hostFromVds::selectedVds -
> ovirt-node-2.local, spmStatus Free, storage pool BLC
> 2012-05-28 10:50:37,656 ERROR
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-67) SPM Init: could not find reported vds or
> not up - pool:BLC vds_spm_id: 1
> 2012-05-28 10:50:37,661 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-67) SPM selection - vds seems as spm
> ovirt-node-1.local
> 2012-05-28 10:50:37,662 WARN
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-67) spm vds is non responsive, stopping spm
> selection.
> 2012-05-28 10:50:47,709 INFO
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-34) hostFromVds::selectedVds -
> ovirt-node-2.local, spmStatus Free, storage pool BLC
> 2012-05-28 10:50:47,712 ERROR
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (QuartzScheduler_Worker-34) SPM Init: could not find reported vds or
> not up - pool:BLC vds_spm_id: 1
>
> On 28 May, 2012, at 12:08 AM, Haim Ateya wrote:
>
>> Hi, the first question that comes to mind is: why is the host in a
>> non-responsive state?
>> Please check the following:
>> 1. The vdsmd service is running on the host side
>> 2. No firewall is blocking communication in or out
>> 3. There is no network issue between host and manager
>>
>> Now, for your question: you can use the manual fence option (confirm
>> host has been rebooted), which will free the SPM role from the faulty
>> host, and the engine will elect a new SPM.
>>
>> Haim
>>
>> On May 27, 2012, at 18:32, T-Sinjon <tscbj1989 at gmail.com> wrote:
>>
>>> Description of problem:
>>>
>>> I have 2 nodes:
>>> ovirt-node1.local    Non Responsive    SPM
>>> ovirt-node2.local    Up                None
>>>
>>> The SPM node is stuck in Non Responsive status and can't be activated.
>>> All VMs on that node went into Unknown status, and the master VM
>>> domain became inactive.
>>>
>>> When I try the "Maintenance" action on node1, it says:
>>> Error: Cannot switch Host to Maintenance mode.
>>> Host still has running VMs on it and is in Non-Responsive state.
>>>
>>> But there are no VMs running on node1; it only has 2 VMs in Unknown
>>> status.
>>>
>>> Because I can't activate the SPM host, I can't activate the VM
>>> storage domain.
>>>
>>> 1. How can I migrate the SPM role to another host in my data center,
>>> such as node2?
>>> 2. How can I bring node1 back to Up status? (I have done the 'confirm
>>> host has been rebooted' action and rebooted node1, but it made no
>>> difference.)
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
--
Shu Ming<shuming at linux.vnet.ibm.com>
IBM China Systems and Technology Laboratory