
On the host that has the Hosted Engine VM, the sanlock.log reports:

2017-06-27 17:30:20+0100 1043742 [7307]: add_lockspace 207221b2-959b-426b-b945-18e1adfed62f:3:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0 conflicts with name of list1 s5 207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0

Again, I'm not sure what has happened here.

On Tue, Jun 27, 2017 at 5:26 PM, cmc <iucounu@gmail.com> wrote:
I see this on the host it is trying to migrate to, in /var/log/sanlock.log:
2017-06-27 17:10:40+0100 527703 [2407]: s3528 lockspace 207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0
2017-06-27 17:13:00+0100 527843 [27446]: s3528 delta_acquire host_id 1 busy1 1 2 1042692 3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03
2017-06-27 17:13:01+0100 527844 [2407]: s3528 add_lockspace fail result -262
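In the lines above, "delta_acquire host_id 1 busy1 ... kvm-ldn-03" says the delta lease for host_id 1 on the ids volume is still being actively renewed by kvm-ldn-03, so this host's attempt to acquire the same host_id fails (result -262). Together with the add_lockspace name conflict quoted earlier, this looks like two hosts claiming the same lockspace with different host IDs. A hedged diagnostic sketch, reusing the ids path from the log:

--------------------8<-------------------
# Dump the delta leases stored on the hosted-engine 'ids' volume; each
# record shows the host_id, generation, timestamp and owner host name
sanlock direct dump /dev/207221b2-959b-426b-b945-18e1adfed62f/ids
--------------------8<-------------------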
The sanlock service is running. Why would this occur?
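A complementary check is to ask the running daemon what it currently holds — a hedged sketch; the lockspace string is copied from the log above (the host_id field in it is ignored by host_status):

--------------------8<-------------------
# Lockspaces and resources held by this host's sanlock daemon
sanlock client status

# Liveness of every host_id in the hosted-engine lockspace
sanlock client host_status -s 207221b2-959b-426b-b945-18e1adfed62f:0:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0
--------------------8<-------------------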
Thanks,
C
On Tue, Jun 27, 2017 at 5:21 PM, cmc <iucounu@gmail.com> wrote:
Hi Martin,
Thanks for the reply. I have done this, and the deployment completed without error. However, it still will not allow the Hosted Engine VM to migrate to another host. The /etc/ovirt-hosted-engine/hosted-engine.conf was created OK on the host I re-installed, but the ovirt-ha-broker.service, though it starts, reports:
--------------------8<-------------------
Jun 27 14:58:26 kvm-ldn-01 systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
Jun 27 14:58:27 kvm-ldn-01 ovirt-ha-broker[6101]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker ERROR Failed to read metadata from /rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 129, in get_raw_stats_for_service_type
    f = os.open(path, direct_flag | os.O_RDONLY | os.O_SYNC)
OSError: [Errno 2] No such file or directory: '/rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata'
--------------------8<-------------------
I checked the path, and it exists. I can run 'less -f' on it fine. The perms are slightly different on the host that is running the VM vs. the one that is reporting errors (600 vs. 660); ownership is vdsm:qemu. Is this a SAN locking issue?
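One thing worth noting (hedged): the broker opens that file with O_DIRECT (the direct_flag in the traceback), while 'less' does a normal buffered read, so the two can disagree — for example if the symlink under /rhev is dangling on one host. A sketch for comparing the hosts, path taken from the traceback:

--------------------8<-------------------
M=/rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata

# Where does the link point, and what are the real perms/ownership?
readlink -f "$M"
stat -L "$M"

# Reproduce the broker's direct-I/O read; 'No such file or directory'
# here (but not with 'less') would mean the link target is missing
dd if="$M" of=/dev/null bs=4096 count=1 iflag=direct
--------------------8<-------------------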
Thanks for any help,
Cam
On Tue, Jun 27, 2017 at 1:41 PM, Martin Sivak <msivak@redhat.com> wrote:
> Should it be? It was not in the instructions for the migration from bare-metal to Hosted VM
The hosted engine will only migrate to hosts that have the services running. Please put one other host into maintenance and select the Hosted Engine action DEPLOY in the reinstall dialog.
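Once the reinstall with DEPLOY finishes, a quick hedged way to confirm every hosted-engine host is eligible (hostnames are examples; kvm-ldn-02 is a guess, only kvm-ldn-01 and kvm-ldn-03 appear in this thread):

--------------------8<-------------------
# Run from anywhere with ssh access to the hosts (names are examples)
for h in kvm-ldn-01 kvm-ldn-02 kvm-ldn-03; do
    echo "== $h =="
    ssh "$h" 'systemctl is-active ovirt-ha-agent ovirt-ha-broker'
done
--------------------8<-------------------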
Best regards
Martin Sivak
On Tue, Jun 27, 2017 at 1:23 PM, cmc <iucounu@gmail.com> wrote:
I changed the 'os.other.devices.display.protocols.value.3.6 = spice/qxl,vnc/cirrus,vnc/qxl' line to have the same display protocols as the 4.0 line, and the hosted engine now appears in the list of VMs. I am guessing the compatibility version was causing it to use the 3.6 list. However, I am still unable to migrate the engine VM to another host. When I try putting the host it is currently on into maintenance, it reports:
Error while executing action: Cannot switch the Host(s) to Maintenance mode. There are no available hosts capable of running the engine VM.
Running 'hosted-engine --vm-status' still shows 'Engine status: unknown stale-data'.
The ovirt-ha-broker service is only running on one host; it was set to 'disabled' in systemd. It won't start on the other two hosts, as there is no /etc/ovirt-hosted-engine/hosted-engine.conf there. Should that file exist on every host? It was not in the instructions for the migration from bare metal to a hosted VM.
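For what it's worth, a hedged per-host check of the pieces the HA services need; the conf file is created by 'hosted-engine --deploy', or by a host reinstall with the Hosted Engine DEPLOY action (Martin's suggestion above):

--------------------8<-------------------
# On each host that should be able to run the engine VM:
ls -l /etc/ovirt-hosted-engine/hosted-engine.conf
systemctl status ovirt-ha-agent ovirt-ha-broker
--------------------8<-------------------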
Thanks,
Cam
On Thu, Jun 22, 2017 at 1:07 PM, cmc <iucounu@gmail.com> wrote:
Hi Tomas,
So in my /usr/share/ovirt-engine/conf/osinfo-defaults.properties on my engine VM, I have:
os.other.devices.display.protocols.value = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
os.other.devices.display.protocols.value.3.6 = spice/qxl,vnc/cirrus,vnc/qxl
That seems to match. I assume that since this is 4.1, the 3.6 line should not apply.
Is there somewhere else I should be looking?
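One possibility (hedged, using the standard osinfo override mechanism): local override files under /etc/ovirt-engine/osinfo.conf.d/ take precedence over the shipped defaults (files are read in name order), so it is worth checking whether one of them reintroduces the 3.6 list:

--------------------8<-------------------
# Shipped defaults live in:
#   /usr/share/ovirt-engine/conf/osinfo-defaults.properties
# Local overrides live in:
ls /etc/ovirt-engine/osinfo.conf.d/
grep -r 'display.protocols' /etc/ovirt-engine/osinfo.conf.d/ \
    /usr/share/ovirt-engine/conf/osinfo-defaults.properties
--------------------8<-------------------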
Thanks,
Cam
On Thu, Jun 22, 2017 at 11:40 AM, Tomas Jelinek <tjelinek@redhat.com> wrote:
On Thu, Jun 22, 2017 at 12:38 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
>
> > On 22 Jun 2017, at 12:31, Martin Sivak <msivak@redhat.com> wrote:
> >
> > Tomas, what fields are needed in a VM to pass the check that causes
> > the following error?
> >
> >>>>> WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
> >>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm'
> >>>>> failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT
> >>>>> ,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>
> to match the OS and VM Display type ;-)
> Configuration is in osinfo… e.g. if that is an import from older releases on
> Linux, this is typically caused by the change of cirrus to vga for non-SPICE
> VMs
yep, the default supported combinations for 4.0+ are:

os.other.devices.display.protocols.value = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
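For reference, rather than editing osinfo-defaults.properties in place (an upgrade will overwrite it), the usual pattern is a small override file; a hedged sketch — the file name is illustrative, and the engine must be restarted afterwards:

--------------------8<-------------------
# /etc/ovirt-engine/osinfo.conf.d/99-display-protocols.properties (example name)
# Allow the same display combinations for 3.6-compatibility VMs as for 4.0+:
#   os.other.devices.display.protocols.value.3.6 = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus

# then apply it:
systemctl restart ovirt-engine
--------------------8<-------------------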
> >
> > Thanks.
> >
> > On Thu, Jun 22, 2017 at 12:19 PM, cmc <iucounu@gmail.com> wrote:
> >> Hi Martin,
> >>
> >>> just as a random comment, do you still have the database backup from
> >>> the bare metal -> VM attempt? It might be possible to just try again
> >>> using it. Or in the worst case.. update the offending value there
> >>> before restoring it to the new engine instance.
> >>
> >> I still have the backup. I'd rather do the latter, as re-running the
> >> HE deployment is quite lengthy and involved (I have to re-initialise
> >> the FC storage each time). Do you know what the offending value(s)
> >> would be? Would it be in the Postgres DB or in a config file
> >> somewhere?
> >>
> >> Cheers,
> >>
> >> Cam
> >>
> >>> Regards
> >>>
> >>> Martin Sivak
> >>>
> >>> On Thu, Jun 22, 2017 at 11:39 AM, cmc <iucounu@gmail.com> wrote:
> >>>> Hi Yanir,
> >>>>
> >>>> Thanks for the reply.
> >>>>
> >>>>> First of all, maybe a chain reaction of:
> >>>>> WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
> >>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm'
> >>>>> failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT
> >>>>> ,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
> >>>>> is causing the hosted engine vm not to be set up correctly, and further
> >>>>> actions were made when the hosted engine vm wasn't in a stable state.
> >>>>>
> >>>>> As for now, are you trying to revert back to a previous/initial state?
> >>>>
> >>>> I'm not trying to revert it to a previous state for now. This was a
> >>>> migration from a bare-metal engine, and it didn't report any error
> >>>> during the migration. I'd had some problems on my first attempts at
> >>>> this migration, whereby it never completed (due to a proxy issue), but
> >>>> I managed to resolve this. Do you know of a way to get the Hosted
> >>>> Engine VM into a stable state, without rebuilding the entire cluster
> >>>> from scratch (since I have a lot of VMs on it)?
> >>>>
> >>>> Thanks for any help.
> >>>>
> >>>> Regards,
> >>>>
> >>>> Cam
> >>>>
> >>>>> Regards,
> >>>>> Yanir
> >>>>>
> >>>>> On Wed, Jun 21, 2017 at 4:32 PM, cmc <iucounu@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi Jenny/Martin,
> >>>>>>
> >>>>>> Any idea what I can do here? The hosted engine VM has no log on any
> >>>>>> host in /var/log/libvirt/qemu, and I fear that if I need to put the
> >>>>>> host I created it on (which I think is hosting it) into maintenance,
> >>>>>> e.g. to upgrade it, or if it fails for any reason, it won't get
> >>>>>> migrated to another host, and I will not be able to manage the
> >>>>>> cluster. It seems to be a very dangerous position to be in.
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Cam
> >>>>>>
> >>>>>> On Wed, Jun 21, 2017 at 11:48 AM, cmc <iucounu@gmail.com> wrote:
> >>>>>>> Thanks Martin. The hosts are all part of the same cluster.
> >>>>>>>
> >>>>>>> I get these errors in the engine.log on the engine:
> >>>>>>>
> >>>>>>> 2017-06-19 03:28:05,030Z WARN
> >>>>>>> [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
> >>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm'
> >>>>>>> failed for user SYSTEM. Reasons:
> >>>>>>> VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
> >>>>>>> 2017-06-19 03:28:05,030Z INFO
> >>>>>>> [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
> >>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Lock freed to object
> >>>>>>> 'EngineLock:{exclusiveLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<VM,
> >>>>>>> ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>,
> >>>>>>> HostedEngine=<VM_NAME, ACTION_TYPE_FAILED_NAME_ALREADY_USED>]',
> >>>>>>> sharedLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<REMOTE_VM,
> >>>>>>> ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>]'}'
> >>>>>>> 2017-06-19 03:28:05,030Z ERROR
> >>>>>>> [org.ovirt.engine.core.bll.HostedEngineImporter]
> >>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Failed importing the Hosted
> >>>>>>> Engine VM
> >>>>>>>
> >>>>>>> The sanlock.log reports conflicts on that same host, and a different
> >>>>>>> error on the other hosts; not sure if they are related.
> >>>>>>>
> >>>>>>> And this in the /var/log/ovirt-hosted-engine-ha/agent log on the host
> >>>>>>> which I deployed the hosted engine VM on:
> >>>>>>>
> >>>>>>> MainThread::ERROR::2017-06-19
> >>>>>>> 13:09:49,743::ovf_store::124::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
> >>>>>>> Unable to extract HEVM OVF
> >>>>>>> MainThread::ERROR::2017-06-19
> >>>>>>> 13:09:49,743::config::445::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
> >>>>>>> Failed extracting VM OVF from the OVF_STORE volume, falling back to
> >>>>>>> initial vm.conf
> >>>>>>>
> >>>>>>> I've seen some of these issues reported in bugzilla, but they were for
> >>>>>>> older versions of oVirt (and appear to be resolved).
> >>>>>>>
> >>>>>>> I will install that package on the other two hosts, for which I will
> >>>>>>> put them in maintenance as vdsm is installed as an upgrade. I guess
> >>>>>>> restarting vdsm is a good idea after that?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Campbell
> >>>>>>>
> >>>>>>> On Wed, Jun 21, 2017 at 10:51 AM, Martin Sivak <msivak@redhat.com> wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> you do not have to install it on all hosts. But you should have more
> >>>>>>>> than one, and ideally all hosted engine enabled nodes should belong to
> >>>>>>>> the same engine cluster.
> >>>>>>>>
> >>>>>>>> Best regards
> >>>>>>>>
> >>>>>>>> Martin Sivak
> >>>>>>>>
> >>>>>>>> On Wed, Jun 21, 2017 at 11:29 AM, cmc <iucounu@gmail.com> wrote:
> >>>>>>>>> Hi Jenny,
> >>>>>>>>>
> >>>>>>>>> Does ovirt-hosted-engine-ha need to be installed across all hosts?
> >>>>>>>>> Could that be the reason it is failing to see it properly?
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>>
> >>>>>>>>> Cam
> >>>>>>>>>
> >>>>>>>>> On Mon, Jun 19, 2017 at 1:27 PM, cmc <iucounu@gmail.com> wrote:
> >>>>>>>>>> Hi Jenny,
> >>>>>>>>>>
> >>>>>>>>>> Logs are attached. I can see errors in there, but am unsure how they
> >>>>>>>>>> arose.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>>
> >>>>>>>>>> Campbell
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Jun 19, 2017 at 12:29 PM, Evgenia Tokar <etokar@redhat.com> wrote:
> >>>>>>>>>>> From the output it looks like the agent is down; try starting it by
> >>>>>>>>>>> running:
> >>>>>>>>>>> systemctl start ovirt-ha-agent
> >>>>>>>>>>>
> >>>>>>>>>>> The engine is supposed to see the hosted engine storage domain and
> >>>>>>>>>>> import it to the system, then it should import the hosted engine vm.
> >>>>>>>>>>>
> >>>>>>>>>>> Can you attach the agent log from the host
> >>>>>>>>>>> (/var/log/ovirt-hosted-engine-ha/agent.log)
> >>>>>>>>>>> and the engine log from the engine vm
> >>>>>>>>>>> (/var/log/ovirt-engine/engine.log)?
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Jenny
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Jun 19, 2017 at 12:41 PM, cmc <iucounu@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi Jenny,
> >>>>>>>>>>>>
> >>>>>>>>>>>>> What version are you running?
> >>>>>>>>>>>>
> >>>>>>>>>>>> 4.1.2.2-1.el7.centos
> >>>>>>>>>>>>
> >>>>>>>>>>>>> For the hosted engine vm to be imported and displayed in the
> >>>>>>>>>>>>> engine, you must first create a master storage domain.
> >>>>>>>>>>>>
> >>>>>>>>>>>> To provide a bit more detail: this was a migration of a bare-metal
> >>>>>>>>>>>> engine in an existing cluster to a hosted engine VM for that cluster.
> >>>>>>>>>>>> As part of this migration, I built an entirely new host and ran
> >>>>>>>>>>>> 'hosted-engine --deploy' (followed these instructions:
> >>>>>>>>>>>> http://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_Meta...).
> >>>>>>>>>>>> I restored the backup from the engine and it completed without any
> >>>>>>>>>>>> errors. I didn't see any instructions regarding a master storage
> >>>>>>>>>>>> domain in the page above. The cluster has two existing master storage
> >>>>>>>>>>>> domains: one is fibre channel, which is up, and one ISO domain, which
> >>>>>>>>>>>> is currently offline.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> What do you mean the hosted engine commands are failing? What happens
> >>>>>>>>>>>>> when you run hosted-engine --vm-status now?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Interestingly, whereas when I ran it before, it exited with no output
> >>>>>>>>>>>> and a return code of '1', it now reports:
> >>>>>>>>>>>>
> >>>>>>>>>>>> --== Host 1 status ==--
> >>>>>>>>>>>>
> >>>>>>>>>>>> conf_on_shared_storage             : True
> >>>>>>>>>>>> Status up-to-date                  : False
> >>>>>>>>>>>> Hostname                           : kvm-ldn-03.ldn.fscfc.co.uk
> >>>>>>>>>>>> Host ID                            : 1
> >>>>>>>>>>>> Engine status                      : unknown stale-data
> >>>>>>>>>>>> Score                              : 0
> >>>>>>>>>>>> stopped                            : True
> >>>>>>>>>>>> Local maintenance                  : False
> >>>>>>>>>>>> crc32                              : 0217f07b
> >>>>>>>>>>>> local_conf_timestamp               : 2911
> >>>>>>>>>>>> Host timestamp                     : 2897
> >>>>>>>>>>>> Extra metadata (valid at timestamp):
> >>>>>>>>>>>>     metadata_parse_version=1
> >>>>>>>>>>>>     metadata_feature_version=1
> >>>>>>>>>>>>     timestamp=2897 (Thu Jun 15 16:22:54 2017)
> >>>>>>>>>>>>     host-id=1
> >>>>>>>>>>>>     score=0
> >>>>>>>>>>>>     vm_conf_refresh_time=2911 (Thu Jun 15 16:23:08 2017)
> >>>>>>>>>>>>     conf_on_shared_storage=True
> >>>>>>>>>>>>     maintenance=False
> >>>>>>>>>>>>     state=AgentStopped
> >>>>>>>>>>>>     stopped=True
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yet I can log in to the web GUI fine. I guess it is not HA due to being
> >>>>>>>>>>>> in an unknown state currently? Does the hosted-engine-ha rpm need to
> >>>>>>>>>>>> be installed across all nodes in the cluster, btw?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks for the help,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Cam
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Jenny Tokar
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Thu, Jun 15, 2017 at 6:32 PM, cmc <iucounu@gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I've migrated from a bare-metal engine to a hosted engine. There were
> >>>>>>>>>>>>>> no errors during the install; however, the hosted engine did not get
> >>>>>>>>>>>>>> started. I tried running:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> hosted-engine --status
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> on the host I deployed it on, and it returns nothing (exit code is 1,
> >>>>>>>>>>>>>> however). I could not ping it either. So I tried starting it via
> >>>>>>>>>>>>>> 'hosted-engine --vm-start' and it returned:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Virtual machine does not exist
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> But it then became available. I logged into it successfully. It is not
> >>>>>>>>>>>>>> in the list of VMs, however.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Any ideas why the hosted-engine commands fail, and why it is not in
> >>>>>>>>>>>>>> the list of virtual machines?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks for any help,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Cam