Re: [Users] AcquireHostId problem

Dec 20 23:43:59 lab2 kernel: [183033.639261] softdog: Software Watchdog Timer: 0.08 initialized. soft_noboot=0 soft_margin=60 sec soft_panic=0 (nowayout=0)
Dec 20 23:44:11 lab2 systemd[1]: Starting Watchdog Multiplexing Daemon...
Dec 20 23:44:11 lab2 wdmd[25072]: wdmd started S0 H1 G179
Dec 20 23:44:11 lab2 systemd-wdmd[25066]: Starting wdmd: [ OK ]
Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog0 failed to set timeout
Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog0 disarmed
Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog1 armed with fire_timeout 60
Dec 20 23:44:11 lab2 systemd[1]: Started Watchdog Multiplexing Daemon.
Dec 20 23:45:33 lab2 rpc.mountd[2819]: authenticated mount request from 192.168.1.41:994 for /home/vdsm/data (/home/vdsm/data)
Dec 20 23:45:39 lab2 rpc.mountd[2819]: authenticated mount request from 192.168.1.41:954 for /home/vdsm/data (/home/vdsm/data)

Seems to work a bit. However I still get "unable to attach storage" when creating a domain....

2013/12/20 Federico Simoncelli <fsimonce@redhat.com>
----- Original Message -----
From: "Pascal Jakobi" <pascal.jakobi@gmail.com>
To: "Federico Simoncelli" <fsimonce@redhat.com>
Sent: Friday, December 20, 2013 4:44:56 PM
Subject: Re: [Users] AcquireHostId problem

Bug 1045512 created <https://bugzilla.redhat.com/show_bug.cgi?id=1045512>
It looks perfect. Thanks.
-- Federico
--
Pascal Jakobi
116 rue de Stalingrad
93100 Montreuil, France
+33 6 87 47 58 19
Pascal.Jakobi@gmail.com

----- Original Message -----
From: "Pascal Jakobi" <pascal.jakobi@gmail.com>
To: "Federico Simoncelli" <fsimonce@redhat.com>, users@ovirt.org
Sent: Friday, December 20, 2013 11:54:21 PM
Subject: Re: [Users] AcquireHostId problem

Dec 20 23:43:59 lab2 kernel: [183033.639261] softdog: Software Watchdog Timer: 0.08 initialized. soft_noboot=0 soft_margin=60 sec soft_panic=0 (nowayout=0)
Dec 20 23:44:11 lab2 systemd[1]: Starting Watchdog Multiplexing Daemon...
Dec 20 23:44:11 lab2 wdmd[25072]: wdmd started S0 H1 G179
Dec 20 23:44:11 lab2 systemd-wdmd[25066]: Starting wdmd: [ OK ]
Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog0 failed to set timeout
Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog0 disarmed
Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog1 armed with fire_timeout 60
Dec 20 23:44:11 lab2 systemd[1]: Started Watchdog Multiplexing Daemon.
Dec 20 23:45:33 lab2 rpc.mountd[2819]: authenticated mount request from 192.168.1.41:994 for /home/vdsm/data (/home/vdsm/data)
Dec 20 23:45:39 lab2 rpc.mountd[2819]: authenticated mount request from 192.168.1.41:954 for /home/vdsm/data (/home/vdsm/data)

Seems to work a bit. However I still get "unable to attach storage" when creating a domain....

It is probably a different error now. Anything interesting in vdsm.log?

-- Federico

Nope. Nothing more in the logs. My guess is that the timeout problem generates the error. However, if you run "mount", you can see that the target partitions are in fact mounted.

Therefore, I guess the problem is to understand why "/dev/watchdog0 failed to set timeout".

If you need any more info, just ask.

2013/12/21 Federico Simoncelli <fsimonce@redhat.com>

----- Original Message -----
From: "Pascal Jakobi" <pascal.jakobi@gmail.com>
To: "Federico Simoncelli" <fsimonce@redhat.com>
Cc: users@ovirt.org
Sent: Saturday, December 21, 2013 12:37:51 PM
Subject: Re: [Users] AcquireHostId problem
Nope. Nothing more in logs. My guess is that the timeout problem generates the error. However, in reality if you run "mount", you have the target partitions mounted....
If you still have a problem and something is not working, there is an error somewhere and we only have to find it. Look in the engine logs and in the vdsm logs for any error (not only "Traceback" but also "ERROR"). Try to describe in more detail what you are trying to do, what you expect to happen, and what is happening instead.
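The log check suggested here can be sketched as a small script. This is only an illustration: the two paths are the usual default locations for the vdsm and engine logs and are an assumption, since the thread never states them, and the helper names `find_errors` and `scan` are hypothetical.

```python
import re
from pathlib import Path

# Assumed default log locations (not stated in the thread); adjust as needed.
LOG_FILES = [
    Path("/var/log/vdsm/vdsm.log"),
    Path("/var/log/ovirt-engine/engine.log"),
]

# The two markers mentioned in the message: "ERROR" and "Traceback".
PATTERN = re.compile(r"ERROR|Traceback")


def find_errors(lines):
    """Return (line_number, text) pairs for lines containing ERROR or Traceback."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PATTERN.search(line)
    ]


def scan(path):
    """Scan one log file; return an empty list if the file does not exist."""
    if not path.is_file():
        return []
    with path.open(errors="replace") as handle:
        return find_errors(handle)


if __name__ == "__main__":
    for log in LOG_FILES:
        for number, text in scan(log):
            print(f"{log}:{number}: {text}")
```

Run on the host (vdsm.log) and on the engine machine (engine.log), this prints each matching line with its file and line number, which makes it easier to paste the relevant context into a reply.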
Therefore, I guess the problem is to understand why "/dev/watchdog0 failed to set timeout"....
The watchdog provided by your laptop is not working properly or it's not able to set the timeout we need. You inserted the softdog module and wdmd is now up and running as you reported with:
2013/12/21 Federico Simoncelli <fsimonce@redhat.com>
----- Original Message -----
From: "Pascal Jakobi" <pascal.jakobi@gmail.com>
To: "Federico Simoncelli" <fsimonce@redhat.com>, users@ovirt.org
Sent: Friday, December 20, 2013 11:54:21 PM
Subject: Re: [Users] AcquireHostId problem

Dec 20 23:43:59 lab2 kernel: [183033.639261] softdog: Software Watchdog Timer: 0.08 initialized. soft_noboot=0 soft_margin=60 sec soft_panic=0 (nowayout=0)
Dec 20 23:44:11 lab2 systemd[1]: Starting Watchdog Multiplexing Daemon...
Dec 20 23:44:11 lab2 wdmd[25072]: wdmd started S0 H1 G179
Dec 20 23:44:11 lab2 systemd-wdmd[25066]: Starting wdmd: [ OK ]
Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog0 failed to set timeout
Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog0 disarmed
Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog1 armed with fire_timeout 60
Dec 20 23:44:11 lab2 systemd[1]: Started Watchdog Multiplexing Daemon.

So as far as I can see this part is working.

-- Federico

Ok. What I am doing is just adding a new NFS domain, which fails with: "Failed to add Storage Domain DataLab2. (User: admin@internal)"

And I thought that the "/dev/watchdog0 failed to set timeout" message was signalling an error.

I will rerun the test in 5 minutes. Thanks.

----- Original Message -----
From: "Pascal Jakobi" <pascal.jakobi@gmail.com>
To: users@ovirt.org
Sent: Monday, December 23, 2013 4:33:23 PM
Subject: Re: [Users] AcquireHostId problem
Ok. What I am doing is just adding a new NFS domain that fails : "Failed to add Storage Domain DataLab2. (User: admin@internal)" And I thought that the "/dev/watchdog0 failed to set timeout" msg was signalling an error.
No, that is just the attempt to use the laptop watchdog; wdmd then falls back to the softdog one:

Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog1 armed with fire_timeout 60
Dec 20 23:44:11 lab2 systemd[1]: Started Watchdog Multiplexing Daemon.

-- Federico
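The fallback described above can be read straight out of the syslog lines. The sketch below is an illustration only: the regular expression is written against the exact wdmd messages quoted in this thread (other wdmd versions may word them differently), and the function name `armed_watchdog` is hypothetical.

```python
import re

# Pattern built from the wdmd messages quoted in this thread; this wording
# is an assumption for any other wdmd version.
WDMD_LINE = re.compile(
    r"wdmd\[\d+\]: (?P<device>/dev/watchdog\d*) "
    r"(?P<event>failed to set timeout|disarmed"
    r"|armed with fire_timeout (?P<timeout>\d+))"
)


def armed_watchdog(log_lines):
    """Return (device, fire_timeout) for the last device wdmd armed, else None."""
    armed = None
    for line in log_lines:
        match = WDMD_LINE.search(line)
        # Only "armed with fire_timeout N" lines carry a timeout value.
        if match and match.group("timeout") is not None:
            armed = (match.group("device"), int(match.group("timeout")))
    return armed
```

If this returns a device, wdmd found a usable watchdog, and a preceding "failed to set timeout" line for another device only records the attempt that was abandoned.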

Here is the message I get on the console:

Error while executing action Attach Storage Domain: AcquireHostIdFailure

The software seems to go pretty far: it reaches the locked state before failing.

In engine.log:

2013-12-23 16:56:49,497 ERROR [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (ajp--127.0.0.1-8702-2) Command org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Cannot acquire host id: ('8c626a3f-5846-434e-83d8-6238e1ff9e03', SanlockException(-203, 'Sanlock lockspace add failure', 'Sanlock exception')) (Failed with error AcquireHostIdFailure and code 661)

Can this help?

2013/12/23 Federico Simoncelli <fsimonce@redhat.com>

----- Original Message -----
From: "Pascal Jakobi" <pascal.jakobi@gmail.com>
To: users@ovirt.org
Sent: Monday, December 23, 2013 5:34:55 PM
Subject: Re: [Users] AcquireHostId problem

Here is the message I get on the console : Error while executing action Attach Storage Domain: AcquireHostIdFailure The software seems to go pretty far : it reaches the locked state before failing.

In engine.log:

2013-12-23 16:56:49,497 ERROR [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (ajp--127.0.0.1-8702-2) Command org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Cannot acquire host id: ('8c626a3f-5846-434e-83d8-6238e1ff9e03', SanlockException(-203, 'Sanlock lockspace add failure', 'Sanlock exception')) (Failed with error AcquireHostIdFailure and code 661)
Can this help ?
What do the VDSM logs say? Is this the same host where wdmd was up and running, or another one? If you restarted your laptop and did not persist the module loading (following the instructions in one of my previous emails), you will end up with the same problem every time.

-- Federico
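Both questions about the softdog module can be checked with a short script. This is a sketch under stated assumptions: the instructions from the earlier email are not part of this thread, so it assumes persistence is configured through systemd's /etc/modules-load.d directory, and the helper names `module_loaded` and `module_persisted` are hypothetical.

```python
from pathlib import Path


def module_loaded(proc_modules_text, name="softdog"):
    """True if `name` is the first field of any line of /proc/modules content."""
    return any(
        line.split()[0] == name
        for line in proc_modules_text.splitlines()
        if line.strip()
    )


def module_persisted(conf_texts, name="softdog"):
    """True if any modules-load.d style file content lists `name` on its own line."""
    for text in conf_texts:
        for line in text.splitlines():
            entry = line.strip()
            # Blank lines and comment lines ('#' or ';') do not count.
            if entry and not entry.startswith(("#", ";")) and entry == name:
                return True
    return False


if __name__ == "__main__":
    loaded = module_loaded(Path("/proc/modules").read_text())
    # Assumption: persistence is configured under /etc/modules-load.d.
    confs = [p.read_text() for p in Path("/etc/modules-load.d").glob("*.conf")]
    print(f"softdog loaded now: {loaded}")
    print(f"softdog listed for load at boot: {module_persisted(confs)}")
```

If the module is loaded now but not listed for loading at boot, the watchdog setup will be lost at the next restart, which is the situation described above.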
participants (2)
- Federico Simoncelli
- Pascal Jakobi