[Users] SPM host in unknown status

T-Sinjon tscbj1989 at gmail.com
Mon May 28 03:04:06 UTC 2012


1. On node1, vdsmd is reported as active, but something looks strange: it seems to be sleeping.
[root@ovirt-node-1 ~]# systemctl status vdsmd.service
vdsmd.service - Virtual Desktop Server Manager
	  Loaded: loaded (/lib/systemd/system/vdsmd.service; enabled)
	  Active: active (running) since Mon, 28 May 2012 02:43:22 +0000; 9min ago
	 Process: 1157 ExecStart=/lib/systemd/systemd-vdsmd start (code=exited, status=0/SUCCESS)
	Main PID: 2228 (respawn)
	  CGroup: name=systemd:/system/vdsmd.service
		  ├ 2228 /bin/bash -e /usr/share/vdsm/respawn --minlifetime...
		  └ 3573 sleep 900
2. No firewall is blocking traffic.
3. The network is OK; I can ssh into node1 from the engine.
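
A successful ssh login only shows that port 22 is reachable, so it may be worth testing the VDSM port itself from the engine machine as well. A minimal sketch, assuming VDSM listens on its default TCP port 54321 (adjust if your setup differs); the helper name port_open is made up for illustration:

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example (run on the engine machine; 54321 is an assumed default port):
#   port_open("ovirt-node-1.local", 22)
#   port_open("ovirt-node-1.local", 54321)
```

If port 22 answers but the VDSM port does not, the problem is more likely the service or a firewall rule than the network itself.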

I have used the manual fence option (confirm host has been rebooted), but the SPM role did not move to the other node. Below is the engine.log from when I performed this action:

2012-05-28 10:49:51,846 INFO  [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Lock Acquired to object EngineLock [exclusiveLocks= key: org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value: ae567034-5d8e-11e1-bdc9-a7168ad4d39f
, sharedLocks= ]
2012-05-28 10:49:51,847 INFO  [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Running command: FenceVdsManualyCommand internal: false. Entities affected :  ID: ae567034-5d8e-11e1-bdc9-a7168ad4d39f Type: VDS
2012-05-28 10:49:51,927 INFO  [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Trying to fence spm ovirt-node-1.local via vds ovirt-node-2.local
2012-05-28 10:49:51,933 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand] (pool-5-thread-49) [72d88732] START, FenceSpmStorageVDSCommand(vdsId = a522a6a6-a72e-11e1-baa3-bba876a88ef4, storagePoolId = 524a7003-edec-4f52-a38e-b15cadfbe3ef, prevId=1, prevLVER=17), log id: 530cb694
2012-05-28 10:49:51,965 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-5-thread-49) [72d88732] Command org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand return value 
 Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
mStatus                       Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
mCode                         654
mMessage                      Not SPM


2012-05-28 10:49:51,966 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-5-thread-49) [72d88732] Vds: ovirt-node-2.local
2012-05-28 10:49:51,966 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-5-thread-49) [72d88732] Command FenceSpmStorageVDS execution failed. Exception: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM
2012-05-28 10:49:51,966 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand] (pool-5-thread-49) [72d88732] FINISH, FenceSpmStorageVDSCommand, log id: 530cb694
2012-05-28 10:49:51,967 WARN  [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Could not fence spm on vds ovirt-node-2.local
2012-05-28 10:49:51,971 ERROR [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Transaction rolled-back for command: org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand.
2012-05-28 10:49:51,971 INFO  [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Lock freed to object EngineLock [exclusiveLocks= key: org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value: ae567034-5d8e-11e1-bdc9-a7168ad4d39f
, sharedLocks= ]
2012-05-28 10:49:57,457 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC
2012-05-28 10:49:57,461 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1
2012-05-28 10:49:57,466 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) SPM selection - vds seems as spm ovirt-node-1.local
2012-05-28 10:49:57,466 WARN  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) spm vds is non responsive, stopping spm selection.
2012-05-28 10:50:00,002 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable hosts
2012-05-28 10:50:00,004 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable hosts done
2012-05-28 10:50:00,004 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable storage domains
2012-05-28 10:50:00,006 INFO  [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable storage domains done
2012-05-28 10:50:07,502 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-93) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC
2012-05-28 10:50:07,505 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-93) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1
2012-05-28 10:50:07,510 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-93) SPM selection - vds seems as spm ovirt-node-1.local
2012-05-28 10:50:07,510 WARN  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-93) spm vds is non responsive, stopping spm selection.
2012-05-28 10:50:17,551 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-34) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC
2012-05-28 10:50:17,554 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-34) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1
2012-05-28 10:50:17,559 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-34) SPM selection - vds seems as spm ovirt-node-1.local
2012-05-28 10:50:17,559 WARN  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-34) spm vds is non responsive, stopping spm selection.
2012-05-28 10:50:27,609 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-92) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC
2012-05-28 10:50:27,612 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-92) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1
2012-05-28 10:50:27,617 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-92) SPM selection - vds seems as spm ovirt-node-1.local
2012-05-28 10:50:27,618 WARN  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-92) spm vds is non responsive, stopping spm selection.
2012-05-28 10:50:37,652 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-67) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC
2012-05-28 10:50:37,656 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-67) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1
2012-05-28 10:50:37,661 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-67) SPM selection - vds seems as spm ovirt-node-1.local
2012-05-28 10:50:37,662 WARN  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-67) spm vds is non responsive, stopping spm selection.
2012-05-28 10:50:47,709 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-34) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC
2012-05-28 10:50:47,712 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-34) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1
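
The repeating pattern in the log above is easier to see when it is reduced to counts per level and message. A minimal sketch, assuming the engine.log line layout shown above (timestamp, level, bracketed logger class, thread in parentheses, optional correlation id, message); the function name summarize and the regular expression are illustrative only, not part of oVirt:

```python
import re
from collections import Counter

# Matches lines such as:
# 2012-05-28 10:49:57,466 WARN  [org....IrsBrokerCommand] (QuartzScheduler_Worker-79) message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"(?:\[(?P<corr>[0-9a-f]+)\]\s+)?"  # optional correlation id, e.g. [72d88732]
    r"(?P<msg>.*)$"
)


def summarize(lines, levels=("WARN", "ERROR")):
    """Count (level, short logger name, message) for the given levels."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None or m.group("level") not in levels:
            continue  # continuation lines and other levels are skipped
        short_logger = m.group("logger").rsplit(".", 1)[-1]
        counts[(m.group("level"), short_logger, m.group("msg").strip())] += 1
    return counts
```

Passing the open engine.log file to summarize() and printing counts.most_common() should show the same "spm vds is non responsive, stopping spm selection." warning once per retry, which matches the roughly ten-second cycle in the excerpt.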

On 28 May, 2012, at 12:08 AM, Haim Ateya wrote:

> Hi, the first question that comes to mind is: why is the host in a non-responsive state?
> Please check the following:
> 1. The vdsmd service is running on the host side
> 2. No firewall is blocking communication in or out
> 3. There is no network issue between host and manager
> 
> Now, for your question: you can use the manual fence option (confirm host has been rebooted), which will free the SPM role from the faulty host, and the engine will elect a new SPM.
> 
> Haim
> 
> On May 27, 2012, at 18:32, T-Sinjon <tscbj1989 at gmail.com> wrote:
> 
>> Description of problem:
>> 
>> I have 2 nodes:
>> ovirt-node1.local            Non Responsive        SPM
>> ovirt-node2.local            Up                None
>> 
>> The SPM node is stuck in Non Responsive status and can't be activated.
>> All VMs on the node went into Unknown status and the master VM domain became inactive.
>> 
>> When I try the "Maintenance" action on node1, it says:
>> Error: Cannot switch Host to Maintenance mode.
>> Host still has running VMs on it and is in Non-Responsive state.
>> 
>> But there are no VMs running on node1; it only has 2 VMs in Unknown status.
>> 
>> Because I can't activate the SPM host, I can't activate the VM storage domain.
>> 
>> 1. How can I migrate the SPM role to another host in my data center, such as node2?
>> 2. How can I bring node1 back to Up status? (I have done the 'confirm the host has been Rebooted' action and rebooted node1, but it made no difference.)
>> 
>> _______________________________________________
>> Users mailing list
>> Users at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
