
Hi - I apologize in advance that this email is less about a specific problem and more a general inquiry about the most recommended / likely-to-succeed path. I am on my third attempt to get an oVirt system up and running (using a collection of spare servers that meet the requirements set out in the user guide). I'm looking to implement a viable evolution of our unscalable, stove-piped ESXi servers (which are at least free). And while I'm happy to learn more about the underpinnings, I recognize that to really replace these VMware setups, this has to mostly "Just Work" -- and that's ultimately why I've given up on previous occasions (after a couple of days of work) and decided to revisit six months later.

My basic server setup is:

- oVirt Mgmt (engine)
- oVirt HV1 (hypervisor node)
- oVirt HV2 (hypervisor node)
- oVirt Disk (NFS share)

1st attempt: I installed the latest stable Node image (2.5.1) on the HV1 and HV2 machines and re-installed the Mgmt server with Fedora 17 (64-bit) and all the latest stable engine packages. For the first time, Node installation and Engine setup went flawlessly. But I could not mount the NFS shares. Upon deeper research, this appeared to be the previously mentioned NFS bug; I was *sure* the official stable Node image would ship with a downgraded kernel, but apparently not :) I have no idea if there is an officially supported way to downgrade the kernel on the Node images; the warnings say that any changes will not persist, so I assume there is not. (I am frankly a little surprised that the official stable packages & ISO won't actually mount NFS shares, which is the recommended storage strategy and kinda critical to this thing!?) FWIW, the oVirt Disk system is a CentOS 6.2 box.

2nd attempt: I re-installed the nodes as Fedora 17 boxes and downgraded the kernels to 3.4.6-2. Then I connected them from the Engine (specifying the root password) and watched the logs while things installed. After reboot, neither server was reachable. Sitting in front of the console, I realized that networking was refusing to start; several errors printed to the console looked like:

    device-mapper: table: 253:??: multipath: error getting device

(I don't remember exactly what came after the "253:".) Calling "multipath -ll" yielded no output; calling "multipath -r" re-issued the above errors. Obviously the Engine did a lot of work there, setting up the bridge, etc. I did not spend long trying to untangle this. (In retrospect, I will probably go back and spend more time tracking it down, but it's difficult since I lose the network & have to stand at the console in the server room :))

3rd attempt: I re-installed the nodes with Fedora 17 and attempted to install VDSM manually by RPM. Despite following the instructions to turn off SSL (ssl=false in /etc/vdsm/vdsm.conf), I am seeing SSL "unknown cert" errors from the Python socket server every time the engine tries to talk to the node. I added the engine's CA to /etc/pki/vdsm (since that was the commented-out trust-store path in the config file) and added the server's cert there too, but I have no idea what form these files should take to be respected by the Python server -- or if they are respected at all. I couldn't find this documented anywhere, so I left the servers spewing logs for the weekend, figuring I'll either give up or try another strategy on Monday.

In case someone can spot an obvious mistake, here are the configs and commands from each attempt, reconstructed from memory (so details may be slightly off).
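First, the NFS side of attempt #1. On the CentOS 6.2 storage box, the export looks roughly like this ("/exports/data" is a stand-in for my real path; the 36:36 ownership is the vdsm:kvm uid/gid the docs call for):

    # /etc/exports on the oVirt Disk (CentOS 6.2) box
    /exports/data  *(rw,sync,no_subtree_check,anonuid=36,anongid=36)

    # ownership/permissions required by VDSM (vdsm:kvm = 36:36)
    chown 36:36 /exports/data
    chmod 0755 /exports/data
    exportfs -r

    # sanity check from a hypervisor, approximating what VDSM does (NFSv3 over TCP)
    mount -t nfs -o vers=3,tcp ovirt-disk:/exports/data /mnt/test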
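For attempt #2, this is approximately how I downgraded the node kernels to 3.4.6-2 on Fedora 17 (I pulled the build from Koji; the exact grub menu title on another system may differ):

    # install the older kernel alongside the current one (install, not upgrade)
    rpm -ivh kernel-3.4.6-2.fc17.x86_64.rpm

    # make it the default boot entry; check /boot/grub2/grub.cfg for the exact title
    grub2-set-default "Fedora (3.4.6-2.fc17.x86_64)"

    reboot
    uname -r    # should report 3.4.6-2.fc17.x86_64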
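On the multipath errors: my working theory is that multipathd is trying to claim the plain local SATA disks on these Dell boxes. Before I retry attempt #2, I plan to blacklist the local devices -- something like the following (untested, and the devnode pattern is a guess for my hardware):

    # /etc/multipath.conf -- keep multipathd away from the local disks
    blacklist {
        devnode "^sd[ab]"
    }

    # flush stale maps and restart the daemon
    multipath -F
    systemctl restart multipathd.service

If someone knows whether the engine's host deployment is supposed to handle this on plain SATA/RAID hardware, I'd love to hear it.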
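And for the SSL errors in attempt #3, this is what I currently have in place. My understanding is that both sides have to agree about SSL, but I'm not sure I have the engine-side key right -- confirmation of that alone would help:

    # /etc/vdsm/vdsm.conf on each node
    [vars]
    ssl = false

    # then restart vdsm
    systemctl restart vdsmd.service

    # on the engine -- I *think* this is the matching setting, but I may have the key wrong
    engine-config -s SSLEnabled=false
    systemctl restart ovirt-engine.service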
So: is there a general strategy that should get me to a working system here? I suspect the Node image is not a good path, since it appears to be incompatible with NFS mounting. The Fedora-17-deployed-by-the-engine route sounds good, but there's a lot of magic there & it obviously completely broke my systems. Is that where I should focus my efforts? Should I ditch NFS storage and just try to get something working with local-only storage on the nodes? (Shared storage is a primary motivation for moving to oVirt, though.)

I am very excited for this to work for me someday, but it has been frustrating to have such sparse (or outdated?) documentation and such fundamental problems/bugs/configuration challenges. I'm using pretty standard Dell commodity servers (SATA drives, simple RAID setups, etc.).

Sorry for the lack of log output; I can provide more of that when I'm back at work on Monday, but this was more of a general inquiry about where I should take this. Thanks in advance!

Hans