
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,

I've rebooted a host after a kernel update (from 3.8.2 to 3.8.9), after which it refuses to go online again. Instead, it ends up as "non-operational" because the host cannot connect to one of the Storage Domains.

The Datacenter is of iSCSI type, with compatibility version 3.2. It currently contains 4 hosts plus the one running only the engine, each having a single software RAID array which is an iSCSI target. Thus there is one storage domain for each host. All hosts run Fedora 18; the 3 other hosts are running just fine.

I've tried rebooting, reinstalling, setting the host to maintenance and activating it again, etc., without any change in status. Running 'systemctl restart vdsmd' gives no output whatsoever.

Some logs/outputs I have found:

Note: The interesting time is mostly May 01 from 14:30 to 15:00, but there were a number of attempts earlier that day.

systemctl status vdsmd on the problematic host:

May 01 14:51:59 galaxy4 vdsm[2706]: vdsm Storage.HSM WARNING disconnect sp: 4b8e315f-6d4a-432e-b5b2-cba26f5987d8 failed. Known pools {}
May 01 14:51:59 galaxy4 python[2724]: vdsm SuperVdsm.ServerCallback ERROR Error in readSessionInfo
May 01 14:51:59 galaxy4 python[2724]: vdsm SuperVdsm.ServerCallback ERROR Error in readSessionInfo
May 01 14:51:59 galaxy4 python[2724]: vdsm SuperVdsm.ServerCallback ERROR Error in readSessionInfo
May 01 14:52:21 galaxy4 rpc.statd[3098]: Version 1.2.7 starting
May 01 14:52:21 galaxy4 rpc.statd[3098]: Flags: TI-RPC
May 01 14:52:23 galaxy4 vdsm[2706]: vdsm Storage.StorageDomainCache ERROR Error while looking for domain `35da5e19-3465-4518-b1dc-f9c8cba6374b`
May 01 14:52:23 galaxy4 vdsm[2706]: vdsm TaskManager.Task ERROR Task=`0e57636b-fe73-46b4-8962-dffd8255bfef`::Unexpected error
May 01 14:52:23 galaxy4 vdsm[2706]: vdsm Storage.Dispatcher.Protect ERROR {'status': {'message': "Cannot find master domain: 'spUUID=4b8e315f-...e': 304}}

vdsm.log of the problematic host and the SPM is attached.
Note: The SPM changed from one host to another during this time.

engine.log is attached as well. Also, the master domain (its UUID) can be seen in the oVirt database.

oVirt version: 3.1.0-3.26.3.el6.centos.alt

Let me know if you require any additional information.

Thanks for helping,
Victor

- -- 
Frank Victor Fischer, victor.fischer@tngtech.com, 49 176 100 80 60 7
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Gerhard Müller, Christoph Stock
Sitz: Unterföhring, Amtsgericht München, HRB 135082
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlGBFGcACgkQxkn89gnPif9ltgCfasn44+lL8HYaBmV8eUxdzPdt
eHsAnj0gG+VHSOEF7/mBtL0GfUODy2Uu
=DekX
-----END PGP SIGNATURE-----
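
For anyone who wants to narrow the logs to the window mentioned in the message (May 01, 14:30 to 15:00), here is a rough sketch in Python. It assumes the log lines look like the 'systemctl status' excerpt quoted above ("May 01 14:52:23 host proc[pid]: message"); other log files, such as vdsm.log itself, may use a different layout and would need a different pattern. The names filter_lines, LINE_RE and SAMPLE are made up for illustration and are not part of vdsm or oVirt.

```python
import re
from datetime import datetime, time

# Assumed line layout, taken from the excerpt in the message:
#   "May 01 14:52:23 galaxy4 vdsm[2706]: vdsm TaskManager.Task ERROR ..."
LINE_RE = re.compile(
    r"^(?P<date>\w{3} \d{2}) (?P<clock>\d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<proc>[^:]+): (?P<msg>.*)$"
)


def filter_lines(lines, date="May 01", start=time(14, 30), end=time(15, 0),
                 levels=("ERROR", "WARNING")):
    """Return (clock, proc, msg) tuples for lines on `date`, inside the
    [start, end] window, whose message contains one of `levels` as a word."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group("date") != date:
            continue  # not in the assumed layout, or a different day
        clock = datetime.strptime(m.group("clock"), "%H:%M:%S").time()
        if not (start <= clock <= end):
            continue  # outside the time window of interest
        padded = f" {m.group('msg')} "
        if any(f" {level} " in padded for level in levels):
            hits.append((m.group("clock"), m.group("proc"), m.group("msg")))
    return hits


# Three lines copied from the excerpt in the message.
SAMPLE = [
    "May 01 14:51:59 galaxy4 vdsm[2706]: vdsm Storage.HSM WARNING disconnect sp: "
    "4b8e315f-6d4a-432e-b5b2-cba26f5987d8 failed. Known pools {}",
    "May 01 14:52:21 galaxy4 rpc.statd[3098]: Version 1.2.7 starting",
    "May 01 14:52:23 galaxy4 vdsm[2706]: vdsm TaskManager.Task ERROR "
    "Task=`0e57636b-fe73-46b4-8962-dffd8255bfef`::Unexpected error",
]

if __name__ == "__main__":
    for clock, proc, msg in filter_lines(SAMPLE):
        print(clock, proc, msg)
```

To run it against a saved copy of the status output, pass an open file object instead of SAMPLE; the rpc.statd lines are dropped because they carry no ERROR or WARNING level.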