I was able to get my hosts active. During the upgrade, my master data
domain's metadata was corrupted -- I had duplicates of some of the
dom_md files, and my metadata file was corrupt. Vdsm was looking at
that metadata file and throwing up its hands. I added a new data
domain but it couldn't take over as master because my old data domain
was messed up. I ended up creating a new metadata file in that domain,
and my hosts came up. It might be nice to have some way of resetting
corrupt metadata or at least of making the error clearer.
I did have a gluster hiccup during the upgrade -- the upgrade brought
my gluster version from 3.8 to 3.12, and the other peers in the
cluster refused connections from my first upgraded host. I upgraded
all the others, and got them all talking to each other again, but it
may have been during that time that my master data domain metadata
became corrupted. I haven't noticed any issues with my VMs yet, and all
through the migration travail, I was able to keep 5 important VMs
running. They kept chugging away, even though their host and
surrounding hosts were unhealthy.
Anyway, I'm back ;)
Jason
On Thu, Dec 21, 2017 at 9:42 AM, Jason Brooks <jbrooks(a)redhat.com> wrote:
On Wed, Dec 20, 2017 at 11:47 PM, Sandro Bonazzola
<sbonazzo(a)redhat.com> wrote:
>
>
> 2017-12-21 4:26 GMT+01:00 Jason Brooks <jbrooks(a)redhat.com>:
>>
>> Hi all, I upgraded my 4 host converged gluster/ovirt lab setup to 4.2
>> yesterday, and now 3 of my hosts won't connect to my main data domain,
>> so they're non-operational when I try to activate them.
>>
>> Here's what seems like a relevant passage of vdsm.log:
>>
>> https://paste.fedoraproject.org/paste/JZuxul6-HZjjl8uHzgqL-w
>
>
>
> Adding some relevant developers.
> Jason, do you mind opening a bug on
> https://bugzilla.redhat.com/enter_bug.cgi?product=vdsm to track this?
I filed an issue here:
https://bugzilla.redhat.com/show_bug.cgi?id=1528391