[Users] Host stuck in unresponsive state

René Koch r.koch at ovido.at
Sun Sep 1 17:55:42 UTC 2013


Hi Frank,

I sometimes have (or had) the same issues with all-in-one setups, so I don't use local storage in all-in-one setups anymore.
Instead I share a directory on my node via NFS, create a new NFS datacenter and mount it locally.
This might not be the best way to do it, but I have had better experience with this setup than with local storage.

Btw, when changing multipath.conf, make sure you add "RHEV PRIVATE" below "RHEV REVISION X.Y" to avoid losing your changes on the next reboot.
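
If you want to verify that the marker is in place before rebooting, something like the following rough sketch could help. It assumes both markers are written as comment lines ("# RHEV REVISION X.Y" followed directly by "# RHEV PRIVATE"); that exact format is my assumption, so please compare it with the file vdsm generated on your host. The function names are only illustrative and are not part of vdsm.

```python
import re

# Assumed marker formats (comment lines); verify against the multipath.conf
# that vdsm generated on your host before relying on this check.
REVISION_RE = re.compile(r"^#\s*RHEV REVISION\s+\d+\.\d+\s*$")
PRIVATE_RE = re.compile(r"^#\s*RHEV PRIVATE\s*$")


def is_marked_private(conf_text):
    """Return True if a 'RHEV PRIVATE' line directly follows 'RHEV REVISION X.Y'."""
    lines = conf_text.splitlines()
    # Walk over each pair of adjacent lines and look for the two markers.
    for current, following in zip(lines, lines[1:]):
        if REVISION_RE.match(current.strip()) and PRIVATE_RE.match(following.strip()):
            return True
    return False


def check_file(path="/etc/multipath.conf"):
    """Read the given config file and report whether it carries the private marker."""
    with open(path) as handle:
        return is_marked_private(handle.read())
```

Calling check_file() on the node should return True once the marker line has been added; if it returns False, the changes are probably still at risk of being overwritten.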

With iSCSI and FC backends, vdsm works fine in combination with multipath. In such setups multipath absolutely makes sense, but I also don't understand why multipathing is used for local storage: the disks are controlled by a (hardware) RAID controller and there's no alternate path oVirt could use in case of storage loss or for better throughput.


Regards,
René

 
 
-----Original message-----
> From:Frank Wall <fw at moov.de>
> Sent: Sunday 1st September 2013 16:40
> To: users at ovirt.org
> Subject: Re: [Users] Host stuck in unresponsive state
> 
> On 01.09.2013 01:28, Frank Wall wrote:
> > OK, for some reason it got stuck trying to start "iscsid" and
> > "multipathd". I was able to solve the issues with these services and
> > now the real error message is visible:
> 
> Did some more fiddling... I removed my /etc/multipath.conf and started 
> with the new file. Apparently there is a syntax error in this 
> auto-generated config:
> 
> [root@aio ~]# multipath -ll
> Sep 01 00:32:27 | multipath.conf +5, invalid keyword: getuid_callout
> Sep 01 00:32:27 | multipath.conf +18, invalid keyword: getuid_callout
> 
> OK, I removed lines 5 and 18 and now multipathd is working again. This 
> time it was possible to successfully start vdsmd afterwards:
> 
> [root@aio ~]# systemctl status vdsmd.service
> vdsmd.service - Virtual Desktop Server Manager
>     Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
>     Active: active (running) since So 2013-09-01 16:25:45 CEST; 1min 30s 
> ago
>    Process: 3138 ExecStart=/lib/systemd/systemd-vdsmd start (code=exited, 
> status=0/SUCCESS)
>   Main PID: 3285 (respawn)
>     CGroup: name=systemd:/system/vdsmd.service
>             ├─3285 /bin/bash -e /usr/share/vdsm/respawn --minlifetime 10 
> --daemon --masterpid /var/run/vdsm/respawn.pid /us...
>             └─3288 /usr/bin/python /usr/share/vdsm/vdsm
> 
> Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 client step 2
> Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 
> parse_server_challenge()
> Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 ask_user_info()
> Sep 01 16:25:45 aio.exmaple.com vdsm[3288]: vdsm vds WARNING Unable to 
> load the json rpc server module. Please make su...alled.
> Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 client step 2
> Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 ask_user_info()
> Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 
> make_client_response()
> Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 client step 3
> Sep 01 16:25:54 aio.exmaple.com vdsm[3288]: vdsm TaskManager.Task ERROR 
> Task=`7fc3840c-1518-4260-9f27-ee20434b5a7a`::U... error
> Sep 01 16:25:54 aio.exmaple.com vdsm[3288]: vdsm TaskManager.Task ERROR 
> Task=`82f757b5-a669-40fa-b09d-9cad90c971e1`::U... error
> 
> 
> Still, this doesn't feel right. I think vdsmd is just too unstable and 
> vulnerable. Why did vdsmd core dump with another multipathd config in 
> place? Why does it even have this strict dependency on multipathd?
> 
> There have been several similar reports in the last months and I wonder 
> if there is a way to make vdsmd just more stable. It would be better to 
> have vdsmd started and report an error to ovirt-engine, instead of 
> failing to start the vdsmd service all the time. The current behaviour 
> makes it hard to debug.
> 
> 
> Thanks
> - Frank
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
> 


