
Thanks! Looks very good.
Regards, Maor
----- Original Message -----
From: ml@ohnewald.net To: "Maor Lipchuk" <mlipchuk@redhat.com> Cc: users@ovirt.org Sent: Thursday, May 21, 2015 9:04:33 PM Subject: Re: [ovirt-users] vdsm storage problem - maybe cache problem?
Done: https://bugzilla.redhat.com/show_bug.cgi?id=1223925
Anything else I forgot?
Thanks, Mario
On 20.05.2015 at 15:12, Maor Lipchuk wrote:
Mario,
Can you please open a bug with all the VDSM and engine logs, so this issue can be tracked?
This is the link: https://bugzilla.redhat.com/enter_bug.cgi?product=oVirt
Regards, Maor
----- Original Message -----
From: ml@ohnewald.net Cc: users@ovirt.org Sent: Wednesday, May 20, 2015 1:28:53 PM Subject: Re: [ovirt-users] vdsm storage problem - maybe cache problem?
Hello List,
I really need some help here... Could someone please give it a try? I already tried to get help on the IRC channel, but it seems that my problem is too complicated, or maybe the information I am providing is not useful?
DB & vdsClient: http://fpaste.org/223483/14320588/ (I think this part is very interesting)
engine.log: http://paste.fedoraproject.org/223349/04494414
Node02 vdsm Log: http://paste.fedoraproject.org/223350/43204496
Node01 vdsm Log: http://paste.fedoraproject.org/223347/20448951
Why does my vdsm look for StorageDomain 036b5575-51fa-4f14-8b05-890d7807894c? => This was an NFS export which I deleted from the GUI yesterday (!!!).
From the Database Log/Dump:
=============================
USER_FORCE_REMOVE_STORAGE_DOMAIN 981 0 Storage Domain EXPORT2 was forcibly removed by admin@internal f b384b3da-02a6-44f3-a3f6-56751ce8c26d HP_Proliant_DL180G6 036b5575-51fa-4f14-8b05-890d7807894c EXPORT2 00000000-0000-0000-0000-000000000000 c1754ff 807321f6-2043-4a26-928c-0ce6b423c381
I already put one node into maintenance, rebooted it and activated it. Still the same problem.
Some Screenshots:
--------------------
http://postimg.org/image/8zo4ujgjb/
http://postimg.org/image/le918grdr/
http://postimg.org/image/wnawwhrgh/
My GlusterFS fully works and is not the problem here, I guess:
==============================================================
2015-05-19 04:09:06,292 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-43) [3df9132b] START, HSMClearTaskVDSCommand(HostName = ovirt-node02.stuttgart.imos.net, HostId = 6948da12-0b8a-4b6d-a9af-162e6c25dad3, taskId=2aeec039-5b95-40f0-8410-da62b44a28e8), log id: 19b18840
2015-05-19 04:09:06,337 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (DefaultQuartzScheduler_Worker-43) [3df9132b] FINISH, HSMClearTaskVDSCommand, log id: 19b18840
I already had a chat with Maor Lipchuk and he told me to add another host to the datacenter and then re-initialize it. I will migrate back to ESXi for now to get those two nodes free. Then we can mess with it if anyone is interested in helping me. Otherwise I will have to stay with ESXi :-(
Does anyone else have an idea until then? Why is my vdsm host all messed up with that zombie StorageDomain?
Thanks for your time. I really, really appreciate it.
Mario
On 19.05.15 at 14:57, ml@ohnewald.net wrote:
Hello List,
Okay, I really need some help now. I stopped vdsmd for a little too long, so fencing stepped in and rebooted node01.
I now can NOT start any VM because the storage is marked "Unknown".
Since node01 rebooted, I am wondering why the hell it is still looking for this StorageDomain:
StorageDomainCache::(_findDomain) domain 036b5575-51fa-4f14-8b05-890d7807894c not found
Can anyone please tell me where exactly this ID comes from?
Thanks, Mario
On 18.05.15 at 14:21, Maor Lipchuk wrote:
Hi Mario,
Can you try to mount this directly from the host? Can you please attach the VDSM and engine logs?
Thanks, Maor
----- Original Message -----
From: ml@ohnewald.net To: "Maor Lipchuk" <mlipchuk@redhat.com> Cc: users@ovirt.org Sent: Monday, May 18, 2015 2:36:38 PM Subject: Re: [ovirt-users] vdsm storage problem - maybe cache problem?
Hi Maor,
thanks for the quick reply.
On 18.05.15 at 13:25, Maor Lipchuk wrote:
>> Now my question: Why does the vdsm node not know that I deleted the
>> storage? Has vdsm cached this mount information? Why does it still
>> try to access 036b5575-51fa-4f14-8b05-890d7807894c?
>
> Yes, vdsm uses a cache for Storage Domains; you can try to restart the
> vdsmd service instead of rebooting the host.

I am still getting the same error.
[root@ovirt-node01 ~]# /etc/init.d/vdsmd stop
Shutting down vdsm daemon:
vdsm watchdog stop [ OK ]
vdsm: Running run_final_hooks [ OK ]
vdsm stop [ OK ]
[root@ovirt-node01 ~]#
[root@ovirt-node01 ~]#
[root@ovirt-node01 ~]#
[root@ovirt-node01 ~]# ps aux | grep vdsmd
root 3198 0.0 0.0 11304 740 ? S< May07 0:00 /bin/bash -e /usr/share/vdsm/respawn --minlifetime 10 --daemon --masterpid /var/run/vdsm/supervdsm_respawn.pid /usr/share/vdsm/supervdsmServer --sockfile /var/run/vdsm/svdsm.sock --pidfile /var/run/vdsm/supervdsmd.pid
root 3205 0.0 0.0 922368 26724 ? S<l May07 12:10 /usr/bin/python /usr/share/vdsm/supervdsmServer --sockfile /var/run/vdsm/svdsm.sock --pidfile /var/run/vdsm/supervdsmd.pid
root 15842 0.0 0.0 103248 900 pts/0 S+ 13:35 0:00 grep vdsmd
[root@ovirt-node01 ~]# /etc/init.d/vdsmd start
initctl: Job is already running: libvirtd
vdsm: Running mkdirs
vdsm: Running configure_coredump
vdsm: Running configure_vdsm_logs
vdsm: Running run_init_hooks
vdsm: Running gencerts
vdsm: Running check_is_configured
libvirt is already configured for vdsm
sanlock service is already configured
vdsm: Running validate_configuration
SUCCESS: ssl configured to true. No conflicts
vdsm: Running prepare_transient_repository
vdsm: Running syslog_available
vdsm: Running nwfilter
vdsm: Running dummybr
vdsm: Running load_needed_modules
vdsm: Running tune_system
vdsm: Running test_space
vdsm: Running test_lo
vdsm: Running restore_nets
vdsm: Running unified_network_persistence_upgrade
vdsm: Running upgrade_300_nets
Starting up vdsm daemon:
vdsm start [ OK ]
[root@ovirt-node01 ~]#
[root@ovirt-node01 ~]# grep ERROR /var/log/vdsm/vdsm.log | tail -n 20
Thread-13::ERROR::2015-05-18 13:35:03,631::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain abc51e26-7175-4b38-b3a8-95c6928fbc2b
Thread-13::ERROR::2015-05-18 13:35:03,632::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain abc51e26-7175-4b38-b3a8-95c6928fbc2b
Thread-36::ERROR::2015-05-18 13:35:11,607::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 036b5575-51fa-4f14-8b05-890d7807894c
Thread-36::ERROR::2015-05-18 13:35:11,621::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 036b5575-51fa-4f14-8b05-890d7807894c
Thread-36::ERROR::2015-05-18 13:35:11,960::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 036b5575-51fa-4f14-8b05-890d7807894c not found
Thread-36::ERROR::2015-05-18 13:35:11,960::domainMonitor::239::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 036b5575-51fa-4f14-8b05-890d7807894c monitoring information
Thread-36::ERROR::2015-05-18 13:35:21,962::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 036b5575-51fa-4f14-8b05-890d7807894c
Thread-36::ERROR::2015-05-18 13:35:21,965::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 036b5575-51fa-4f14-8b05-890d7807894c
Thread-36::ERROR::2015-05-18 13:35:22,068::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 036b5575-51fa-4f14-8b05-890d7807894c not found
Thread-36::ERROR::2015-05-18 13:35:22,072::domainMonitor::239::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 036b5575-51fa-4f14-8b05-890d7807894c monitoring information
Thread-15::ERROR::2015-05-18 13:35:33,821::task::866::TaskManager.Task::(_setError) Task=`54bdfc77-f63a-493b-b24e-e5a3bc4977bb`::Unexpected error
Thread-15::ERROR::2015-05-18 13:35:33,864::dispatcher::65::Storage.Dispatcher.Protect::(run) {'status': {'message': "Unknown pool id, pool not connected: ('b384b3da-02a6-44f3-a3f6-56751ce8c26d',)", 'code': 309}}
Thread-13::ERROR::2015-05-18 13:35:33,930::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain abc51e26-7175-4b38-b3a8-95c6928fbc2b
Thread-15::ERROR::2015-05-18 13:35:33,928::task::866::TaskManager.Task::(_setError) Task=`fe9bb0fa-cf1e-4b21-af00-0698c6d1718f`::Unexpected error
Thread-13::ERROR::2015-05-18 13:35:33,932::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain abc51e26-7175-4b38-b3a8-95c6928fbc2b
Thread-15::ERROR::2015-05-18 13:35:33,978::dispatcher::65::Storage.Dispatcher.Protect::(run) {'status': {'message': 'Not SPM: ()', 'code': 654}}
Thread-36::ERROR::2015-05-18 13:35:41,117::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 036b5575-51fa-4f14-8b05-890d7807894c
Thread-36::ERROR::2015-05-18 13:35:41,131::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 036b5575-51fa-4f14-8b05-890d7807894c
Thread-36::ERROR::2015-05-18 13:35:41,452::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 036b5575-51fa-4f14-8b05-890d7807894c not found
Thread-36::ERROR::2015-05-18 13:35:41,453::domainMonitor::239::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 036b5575-51fa-4f14-8b05-890d7807894c monitoring information
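
To see at a glance which storage domain IDs vdsm keeps failing to find, the "not found" errors in the log above could be tallied with a small script. This is only a rough sketch: it assumes the log line format matches the excerpt in this mail, and the function name count_missing_domains is made up for illustration, not part of vdsm.

```python
import re
from collections import Counter

# Standard 8-4-4-4-12 hex UUID, as used for the storage domain IDs in this thread.
UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"

# Assumption: "not found" errors look like the sdc (_findDomain) lines quoted above.
NOT_FOUND = re.compile(
    r"::ERROR::.*\(_findDomain\) domain (" + UUID + r") not found"
)


def count_missing_domains(lines):
    """Count how often each domain UUID is reported as 'not found'."""
    counts = Counter()
    for line in lines:
        match = NOT_FOUND.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Two lines copied from the log excerpt in this mail.
sample = [
    "Thread-36::ERROR::2015-05-18 13:35:11,607::sdc::137::Storage.StorageDomainCache::"
    "(_findDomain) looking for unfetched domain 036b5575-51fa-4f14-8b05-890d7807894c",
    "Thread-36::ERROR::2015-05-18 13:35:11,960::sdc::143::Storage.StorageDomainCache::"
    "(_findDomain) domain 036b5575-51fa-4f14-8b05-890d7807894c not found",
]

for domain_id, hits in count_missing_domains(sample).most_common():
    print(domain_id, hits)
```

To run it against a real log, pass an open file object for /var/log/vdsm/vdsm.log to count_missing_domains instead of the sample list. A domain that shows up here but no longer exists in the engine is a candidate for the stale entry discussed in this thread.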
Thanks, Mario
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users