
I was able to get my hosts active. During the upgrade, my master data domain's metadata was corrupted -- I had duplicates of some of the dom_md files, and the metadata file itself was corrupt. Vdsm was looking at that metadata file and throwing up its hands. I added a new data domain, but it couldn't take over as master because my old data domain was messed up. I ended up creating a new metadata file in that domain, and my hosts came up. It might be nice to have some way of resetting corrupt metadata, or at least of making the error clearer. (I've pasted a rough sketch of the sort of metadata sanity check I mean at the bottom of this mail.)

I did have a gluster hiccup during the upgrade -- it brought my gluster version from 3.8 to 3.12, and the other peers in the cluster refused connections from my first upgraded host. I upgraded all the others and got them talking to each other again, but it may have been during that window that my master data domain's metadata became corrupted.

I haven't noticed any issues with my VMs yet, and all through the migration travail I was able to keep 5 important VMs running. They kept chugging away, even though their host and the surrounding hosts were unhealthy.

Anyway, I'm back ;)

Jason

On Thu, Dec 21, 2017 at 9:42 AM, Jason Brooks <jbrooks@redhat.com> wrote:
On Wed, Dec 20, 2017 at 11:47 PM, Sandro Bonazzola <sbonazzo@redhat.com> wrote:
2017-12-21 4:26 GMT+01:00 Jason Brooks <jbrooks@redhat.com>:
Hi all, I upgraded my 4-host converged gluster/ovirt lab setup to 4.2 yesterday, and now 3 of my hosts won't connect to my main data domain, so they're non-operational when I try to activate them.
Here's what seems like a relevant passage of vdsm.log: https://paste.fedoraproject.org/paste/JZuxul6-HZjjl8uHzgqL-w
Adding some relevant developers. Jason, do you mind opening a bug on https://bugzilla.redhat.com/enter_bug.cgi?product=vdsm to track this?
I filed an issue here: https://bugzilla.redhat.com/show_bug.cgi?id=1528391
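
PS: In case it helps anyone hit by the same thing, here's a rough sketch of the kind of sanity check I ran against the domain's metadata file before recreating it. The dom_md/metadata path and the key names below are just what I saw on my own domain -- I'm not claiming this is the canonical vdsm format -- but it was enough to spot duplicated and garbled entries:

#!/usr/bin/env python
# Rough sanity check for a storage domain metadata file (KEY=VALUE lines).
# Assumptions (guesses from my own domain, not the official vdsm format):
# the file lives under <domain-uuid>/dom_md/metadata and holds one
# KEY=VALUE pair per line, with keys like SDUUID and MASTER_VERSION.

import sys
from collections import Counter

# Keys I expected to see in my own file; adjust for your setup.
EXPECTED_KEYS = {"SDUUID", "ROLE", "POOL_UUID", "MASTER_VERSION"}

def check(path):
    keys = []
    bad_lines = []
    with open(path) as f:
        for n, line in enumerate(f, 1):
            line = line.rstrip("\n")
            if not line or line.startswith("#"):
                continue
            if "=" not in line:
                # Anything that isn't KEY=VALUE is suspicious.
                bad_lines.append((n, line))
                continue
            keys.append(line.split("=", 1)[0])

    dupes = [k for k, c in Counter(keys).items() if c > 1]
    missing = EXPECTED_KEYS - set(keys)

    if bad_lines:
        print("lines that are not KEY=VALUE:", bad_lines)
    if dupes:
        print("duplicate keys:", dupes)
    if missing:
        print("expected keys not found:", sorted(missing))
    if not (bad_lines or dupes or missing):
        print("metadata file looks structurally sane")

if __name__ == "__main__":
    check(sys.argv[1])

Run it with the path to the domain's metadata file as the only argument. It only flags obvious structural problems (duplicate keys, non KEY=VALUE lines, missing keys); it doesn't validate the values themselves.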