<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div>1. On node1, vdsm looks strange; it appears to be sleeping:</div><div>[root@ovirt-node-1 ~]# systemctl status vdsmd.service</div><div>vdsmd.service - Virtual Desktop Server Manager</div><div><span class="Apple-tab-span" style="white-space:pre">        </span> Loaded: loaded (/lib/systemd/system/vdsmd.service; enabled)</div><div><span class="Apple-tab-span" style="white-space:pre">        </span> Active: active (running) since Mon, 28 May 2012 02:43:22 +0000; 9min ago</div><div><span class="Apple-tab-span" style="white-space:pre">        </span> Process: 1157 ExecStart=/lib/systemd/systemd-vdsmd start (code=exited, status=0/SUCCESS)</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>Main PID: 2228 (respawn)</div><div><span class="Apple-tab-span" style="white-space:pre">        </span> CGroup: name=systemd:/system/vdsmd.service</div><div><span class="Apple-tab-span" style="white-space:pre">                </span> ├ 2228 /bin/bash -e /usr/share/vdsm/respawn --minlifetime...</div><div><span class="Apple-tab-span" style="white-space:pre">                </span> └ 3573 sleep 900</div><div>2. No firewall is blocking traffic.</div><div>3. The network is OK; I can SSH into node1 from the engine.</div><div><br></div><div>I used the manual fence option (confirm host has been rebooted), but the SPM role did not move to the other node. Below is the engine.log from when I performed this action:</div><div><br></div><div>2012-05-28 10:49:51,846 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Lock Acquired to object EngineLock [exclusiveLocks= key: org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value: ae567034-5d8e-11e1-bdc9-a7168ad4d39f</div><div>, sharedLocks= ]</div><div>2012-05-28 10:49:51,847 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Running command: FenceVdsManualyCommand 
internal: false. Entities affected : ID: ae567034-5d8e-11e1-bdc9-a7168ad4d39f Type: VDS</div><div>2012-05-28 10:49:51,927 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Trying to fence spm ovirt-node-1.local via vds ovirt-node-2.local</div><div>2012-05-28 10:49:51,933 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand] (pool-5-thread-49) [72d88732] START, FenceSpmStorageVDSCommand(vdsId = a522a6a6-a72e-11e1-baa3-bba876a88ef4, storagePoolId = 524a7003-edec-4f52-a38e-b15cadfbe3ef, prevId=1, prevLVER=17), log id: 530cb694</div><div>2012-05-28 10:49:51,965 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-5-thread-49) [72d88732] Command org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand return value </div><div> Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc</div><div>mStatus Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc</div><div>mCode 654</div><div>mMessage Not SPM</div><div><br></div><div><br></div><div>2012-05-28 10:49:51,966 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-5-thread-49) [72d88732] Vds: ovirt-node-2.local</div><div>2012-05-28 10:49:51,966 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-5-thread-49) [72d88732] Command FenceSpmStorageVDS execution failed. 
Exception: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM</div><div>2012-05-28 10:49:51,966 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand] (pool-5-thread-49) [72d88732] FINISH, FenceSpmStorageVDSCommand, log id: 530cb694</div><div>2012-05-28 10:49:51,967 WARN [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Could not fence spm on vds ovirt-node-2.local</div><div>2012-05-28 10:49:51,971 ERROR [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Transaction rolled-back for command: org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand.</div><div>2012-05-28 10:49:51,971 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-49) [72d88732] Lock freed to object EngineLock [exclusiveLocks= key: org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value: ae567034-5d8e-11e1-bdc9-a7168ad4d39f</div><div>, sharedLocks= ]</div><div>2012-05-28 10:49:57,457 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC</div><div>2012-05-28 10:49:57,461 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1</div><div>2012-05-28 10:49:57,466 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) SPM selection - vds seems as spm ovirt-node-1.local</div><div>2012-05-28 10:49:57,466 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) spm vds is non responsive, stopping spm selection.</div></div><div>2012-05-28 10:50:00,002 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable hosts</div><div>2012-05-28 10:50:00,004 INFO 
[org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable hosts done</div><div>2012-05-28 10:50:00,004 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable storage domains</div><div>2012-05-28 10:50:00,006 INFO [org.ovirt.engine.core.bll.AutoRecoveryManager] (QuartzScheduler_Worker-87) Checking autorecoverable storage domains done</div><div>2012-05-28 10:50:07,502 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-93) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC</div><div>2012-05-28 10:50:07,505 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-93) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1</div><div>2012-05-28 10:50:07,510 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-93) SPM selection - vds seems as spm ovirt-node-1.local</div><div>2012-05-28 10:50:07,510 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-93) spm vds is non responsive, stopping spm selection.</div><div>2012-05-28 10:50:17,551 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-34) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC</div><div>2012-05-28 10:50:17,554 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-34) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1</div><div>2012-05-28 10:50:17,559 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-34) SPM selection - vds seems as spm ovirt-node-1.local</div><div>2012-05-28 10:50:17,559 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-34) spm vds is non responsive, stopping spm selection.</div><div>2012-05-28 
10:50:27,609 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-92) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC</div><div>2012-05-28 10:50:27,612 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-92) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1</div><div>2012-05-28 10:50:27,617 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-92) SPM selection - vds seems as spm ovirt-node-1.local</div><div>2012-05-28 10:50:27,618 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-92) spm vds is non responsive, stopping spm selection.</div><div>2012-05-28 10:50:37,652 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-67) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC</div><div>2012-05-28 10:50:37,656 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-67) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1</div><div>2012-05-28 10:50:37,661 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-67) SPM selection - vds seems as spm ovirt-node-1.local</div><div>2012-05-28 10:50:37,662 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-67) spm vds is non responsive, stopping spm selection.</div><div>2012-05-28 10:50:47,709 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-34) hostFromVds::selectedVds - ovirt-node-2.local, spmStatus Free, storage pool BLC</div><div>2012-05-28 10:50:47,712 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-34) SPM Init: could not find reported vds or not up - pool:BLC vds_spm_id: 1</div><div><br></div><div><div>On 28 May, 2012, at 12:08 AM, Haim Ateya 
wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div bgcolor="#FFFFFF"><div><div style="text-align: left;direction: ltr; ">Hi, the first question that comes to mind is why the host is in a non-responsive state. </div><div style="text-align: left;direction: ltr; ">Please check the following:</div><div style="text-align: left;direction: ltr; ">1. The vdsmd service is running on the host side</div><div style="text-align: left;direction: ltr; ">2. No firewall is blocking communication in and out</div><div style="text-align: left;direction: ltr; ">3. There is no network issue between the host and the manager</div><div style="text-align: left;direction: ltr; "><br></div><div style="text-align: left;direction: ltr; ">Now, for your question: you can use the manual fence option (confirm host has been rebooted), which will free the SPM role from the faulty host, and the engine will elect a new SPM.</div><div style="text-align: left;direction: ltr; "><br></div>Haim</div><div><br>On May 27, 2012, at 18:32, T-Sinjon <<a href="mailto:tscbj1989@gmail.com">tscbj1989@gmail.com</a>> wrote:<br><br></div><div></div><blockquote type="cite"><div><span>Description of problem:</span><br><span></span><br><span>I have 2 nodes: </span><br><span>ovirt-node1.local Non Responsive SPM</span><br><span>ovirt-node2.local Up None</span><br><span></span><br><span>The SPM node is stuck in Non Responsive status and can't be activated; </span><br><span>all VMs on the node went into Unknown status and the master VM domain became inactive.</span><br><span></span><br><span>When I try the "Maintenance" action on node1, it says:</span><br><span>Error: Cannot switch Host to Maintenance mode.</span><br><span>Host still has running VMs on it and is in Non-Responsive state.</span><br><span></span><br><span>But there is no VM running on node1; it only has 2 VMs in Unknown status.</span><br><span></span><br><span>Because I can't activate the SPM host, I can't activate the VM storage domain.</span><br><span></span><br><span>1. How can I migrate the SPM role 
to another host in my data center, such as node2?</span><br><span>2. How can I bring node1 back to Up status? (I performed the 'confirm the host has been Rebooted' action and rebooted node1, but it made no difference.)</span><br><span></span><br><span>_______________________________________________</span><br><span>Users mailing list</span><br><span><a href="mailto:Users@ovirt.org">Users@ovirt.org</a></span><br><span><a href="http://lists.ovirt.org/mailman/listinfo/users">http://lists.ovirt.org/mailman/listinfo/users</a></span><br></div></blockquote></div></blockquote></div><br></body></html>