Hi Frank,
I sometimes had the same issues with all-in-one setups, so I don't use local
storage in an all-in-one setup anymore.
Instead I share a directory on my node via NFS, create a new NFS data center and mount it
locally (see the sketch below).
This might not be the best way to do it, but I've had better experience with this setup than
with local storage.
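A minimal sketch of what I mean, assuming a directory /srv/ovirt-data and the usual
vdsm:kvm ownership (uid/gid 36) - adjust paths and export options to your environment:

  # on the node: export a local directory via NFS
  mkdir -p /srv/ovirt-data
  chown 36:36 /srv/ovirt-data
  echo '/srv/ovirt-data *(rw,sync,no_subtree_check,anonuid=36,anongid=36)' >> /etc/exports
  exportfs -ra
  systemctl enable nfs-server.service
  systemctl start nfs-server.service

Then create an NFS data center in the engine and attach a storage domain pointing at
<node-fqdn>:/srv/ovirt-data.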
Btw, when changing multipath.conf make sure you add "# RHEV PRIVATE" below the
"# RHEV REVISION X.Y" line to avoid losing your changes during the next reboot.
With iSCSI and FC backends vdsm works fine in combination with multipath. In such
setups multipath absolutely makes sense, but I also don't understand why multipathing
is used for local storage - the disks are controlled by a (hardware) RAID controller and
there's no alternate path oVirt could use in case of a path failure or for better
throughput...
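If you do want multipathd running but don't want it to grab the local disks, a
blacklist entry in the private section of multipath.conf should do it - the wwid below
is made up, take yours from the output of multipath -ll or /lib/udev/scsi_id:

  blacklist {
      # keep multipath away from the local RAID volume
      wwid "3600508b1001c5a5c1234567890abcdef"
      # or match the device node directly
      devnode "^sda$"
  }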
Regards,
René
-----Original message-----
From: Frank Wall <fw@moov.de>
Sent: Sunday 1st September 2013 16:40
To: users@ovirt.org
Subject: Re: [Users] Host stuck in unresponsive state
On 01.09.2013 01:28, Frank Wall wrote:
> OK, for some reason it got stuck trying to start "iscsid" and
> "multipathd". I was able to solve the issues with these services and
> now the real error message is visible:
Did some more fiddling... I removed my /etc/multipath.conf and started over with the
freshly generated file. Apparently there is a syntax error in this
auto-generated config:
[root@aio ~]# multipath -ll
Sep 01 00:32:27 | multipath.conf +5, invalid keyword: getuid_callout
Sep 01 00:32:27 | multipath.conf +18, invalid keyword: getuid_callout
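Apparently getuid_callout was dropped from multipath-tools in favor of uid_attribute,
which would explain why these keywords are rejected. The offending lines presumably
looked roughly like this (a guess based on typical configs of that era, exact values
may differ):

  # old keyword, no longer recognized:
  getuid_callout "/lib/udev/scsi_id --whitelisted --replace-whitespace --device=/dev/%n"
  # current equivalent in the same defaults/device section:
  uid_attribute "ID_SERIAL"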
OK, I removed lines 5 and 18 and now multipathd is working again. This
time it was possible to successfully start vdsmd afterwards:
[root@aio ~]# systemctl status vdsmd.service
vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
   Active: active (running) since So 2013-09-01 16:25:45 CEST; 1min 30s ago
  Process: 3138 ExecStart=/lib/systemd/systemd-vdsmd start (code=exited, status=0/SUCCESS)
 Main PID: 3285 (respawn)
   CGroup: name=systemd:/system/vdsmd.service
           ├─3285 /bin/bash -e /usr/share/vdsm/respawn --minlifetime 10 --daemon --masterpid /var/run/vdsm/respawn.pid /us...
           └─3288 /usr/bin/python /usr/share/vdsm/vdsm
Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 client step 2
Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 parse_server_challenge()
Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 ask_user_info()
Sep 01 16:25:45 aio.exmaple.com vdsm[3288]: vdsm vds WARNING Unable to load the json rpc server module. Please make su...alled.
Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 client step 2
Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 ask_user_info()
Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 make_client_response()
Sep 01 16:25:45 aio.exmaple.com python[3288]: DIGEST-MD5 client step 3
Sep 01 16:25:54 aio.exmaple.com vdsm[3288]: vdsm TaskManager.Task ERROR Task=`7fc3840c-1518-4260-9f27-ee20434b5a7a`::U... error
Sep 01 16:25:54 aio.exmaple.com vdsm[3288]: vdsm TaskManager.Task ERROR Task=`82f757b5-a669-40fa-b09d-9cad90c971e1`::U... error
Still, this doesn't feel right. I think vdsmd is just too unstable and fragile. Why did
vdsmd dump core with a different multipath config in place? Why does it even have such a
strict dependency on multipathd? There have been several similar reports in the last
months, and I wonder if there is a way to make vdsmd more robust. It would be better to
have vdsmd start and report an error to ovirt-engine, instead of the vdsmd service
failing to start every time. The current behaviour makes it hard to debug.
Thanks
- Frank