> Date: Tue, 11 Mar 2014 15:16:36 +0100
> From: sbonazzo@redhat.com
> To: giuseppe.ragusa@hotmail.com; jbrooks@redhat.com; msivak@redhat.com
> CC: users@ovirt.org; fsimonce@redhat.com; gpadgett@redhat.com
> Subject: Re: [Users] hosted engine help
>
> On 10/03/2014 22:32, Giuseppe Ragusa wrote:
> > Hi all,
> >
> >> Date: Mon, 10 Mar 2014 12:56:19 -0400
> >> From: jbrooks@redhat.com
> >> To: msivak@redhat.com
> >> CC: users@ovirt.org
> >> Subject: Re: [Users] hosted engine help
> >>
> >> ----- Original Message -----
> >> > From: "Martin Sivak" <msivak@redhat.com>
> >> > To: "Dan Kenigsberg" <danken@redhat.com>
> >> > Cc: users@ovirt.org
> >> > Sent: Saturday, March 8, 2014 11:52:59 PM
> >> > Subject: Re: [Users] hosted engine help
> >> >
> >> > Hi Jason,
> >> >
> >> > can you please attach the full logs? We had a very similar issue before and we
> >> > need to see whether this is the same one or not.
> >>
> >> I may have to recreate it -- I switched back to an all-in-one engine after my
> >> setup started refusing to run the engine at all. It's no fun losing your engine!
> >>
> >> This was a migrated-from-standalone setup, maybe that caused additional wrinkles...
> >>
> >> Jason
> >>
> >> >
> >> > Thanks
> >
> > I experienced the exact same symptoms as Jason on a from-scratch installation on two physical nodes with CentOS 6.5 (fully up to date), using oVirt
> > 3.4.0_pre (latest test-day release) and GlusterFS 3.5.0beta3 (with the Gluster-provided NFS as storage for the self-hosted Engine VM only).
>
> Using GlusterFS as hosted-engine storage is not supported and not recommended.
> The HA daemon may not work properly there.

If it is unsupported (and in particular "not recommended") even with the interposed NFS (the native Gluster-provided NFSv3 export of a volume), then what is the recommended way to set up a fault-tolerant, load-balanced two-node oVirt cluster (without an external dedicated SAN/NAS)?
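For reference, this is roughly how the Engine volume was created and exported on the two nodes (the volume and brick names below are from memory, so please take this only as a sketch of what I did, not an exact transcript):

  gluster volume create engine replica 2 node1:/gluster/engine/brick node2:/gluster/engine/brick
  gluster volume set engine nfs.disable off
  gluster volume start engine

hosted-engine --deploy was then pointed at node1:/engine (the built-in Gluster NFSv3 export of that volume) as the storage for the Engine VM.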
> > I roughly followed the guide from Andrew Lau:
> >
> > http://www.andrewklau.com/ovirt-hosted-engine-with-3-4-0-nightly/
> >
> > with some variations due to newer packages (resolved bugs) and a different hardware setup (no VLANs in my setup: physically separated networks; a custom
> > second NIC added to the Engine VM template before deploying, etc.)
> >
> > The self-hosted installation on the first node + Engine VM (configured for managing both oVirt and the storage; Datacenter default set to NFS because no
> > GlusterFS was offered) apparently went smoothly, but the HA agent failed to start at the very end (same errors in the logs as Jason: the storage domain
> > seems "missing") and I was only able to start it all manually with:
> >
> > hosted-engine --connect-storage
> > hosted-engine --start-pool
>
> The above commands are used for development and shouldn't be used for starting the engine.

Directly starting the engine (with the command below) failed because of storage unavailability, so I used the above "trick" as a last resort, to at least prove that the engine was still able to start and had not been somehow "destroyed" or "lost" in the process (but I do understand that it is an extreme, debug-only action).

> > hosted-engine --vm-start
> >
> > Then the Engine came up and I could use it; I even registered the second node (same final error in the HA agent) and tried to add GlusterFS storage
> > domains for further VMs and ISOs (by the way: the original NFS-on-GlusterFS domain for the Engine VM only is not visible inside the Engine web UI), but it
> > always failed to activate the domains (they remain "Inactive").
> >
> > Furthermore, the engine gets killed some time after starting (from 3 up to 11 hours later) and the only way to get it back is to repeat the above commands.
>
> Need logs for this.

I will try to reproduce it all, but I recall that the libvirt logs (HostedEngine.log) always gave a clear indication of the PID that killed the Engine VM, and each time it belonged to an instance of sanlock.

> > I always managed GlusterFS "natively" (not through oVirt) from the command line and verified that the NFS-exported Engine-VM-only volume gets
> > replicated, but I obviously could not try migration, because the HA part remains inactive and oVirt refuses to migrate the Engine.
> >
> > Since I tried many times, with variations and further manual actions in between (like trying to manually mount the NFS Engine domain, restarting the
> > HA agent only, etc.), my logs are "cluttered", so I should start from scratch again and pack up all the logs in one sweep.
>
> +1

;>

> > Tell me what I should capture and at which points in the whole process, and I will try to follow up as soon as possible.
>
> What:
> hosted-engine-setup, hosted-engine-ha, vdsm, libvirt and sanlock logs from the physical hosts, and the engine and server logs from the hosted engine VM.
>
> When:
> As soon as you see an error.

If the setup design (wholly GlusterFS based) is somewhat flawed, please point me to some hint/docs/guide for the right way of setting it up on two standalone physical nodes, so as not to waste your time chasing "defects" in something that is not supposed to be working anyway.

I will follow your advice and try it accordingly.
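For the record, this is the kind of one-shot collection I plan to run on each physical host as soon as the error shows up (the paths below are my assumption of the default log locations on CentOS 6.5, so please correct me if any of them is wrong):

  # gather setup, HA, VDSM, libvirt and sanlock logs from one host into a single archive
  tar czf /tmp/hosted-engine-logs-$(hostname -s)-$(date +%Y%m%d%H%M).tar.gz \
      /var/log/ovirt-hosted-engine-setup \
      /var/log/ovirt-hosted-engine-ha \
      /var/log/vdsm \
      /var/log/libvirt/qemu/HostedEngine.log \
      /var/log/sanlock.log

plus /var/log/ovirt-engine/engine.log and /var/log/ovirt-engine/server.log from inside the Engine VM, and the output of "sanlock client status" taken right after the Engine VM gets killed.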
Many thanks again,
Giuseppe

> > Many thanks,
> > Giuseppe
> >
> >> > --
> >> > Martin Sivák
> >> > msivak@redhat.com
> >> > Red Hat Czech
> >> > RHEV-M SLA / Brno, CZ
> >> >
> >> > ----- Original Message -----
> >> > > On Fri, Mar 07, 2014 at 10:17:43AM +0100, Sandro Bonazzola wrote:
> >> > > > On 07/03/2014 01:10, Jason Brooks wrote:
> >> > > > > Hey everyone, I've been testing out oVirt 3.4 w/ hosted engine, and
> >> > > > > while I've managed to bring the engine up, I've only been able to do it
> >> > > > > manually, using "hosted-engine --vm-start".
> >> > > > >
> >> > > > > The ovirt-ha-agent service fails reliably for me, erroring out with
> >> > > > > "RequestError: Request failed: success."
> >> > > > >
> >> > > > > I've pasted error passages from the ha agent and vdsm logs below.
> >> > > > >
> >> > > > > Any pointers?
> >> > > >
> >> > > > Looks like a VDSM bug, Dan?
> >> > >
> >> > > Why? The exception is raised from deep inside the ovirt_hosted_engine_ha
> >> > > code.
>
> --
> Sandro Bonazzola
> Better technology. Faster innovation. Powered by community collaboration.
> See how it works at redhat.com