Dear all,
We're facing an issue while trying to join a Fedora 18 server running the
latest stable VDSM to an ovirt-engine 3.2.2 instance.
Let's put things in context: we have a Fujitsu RX200 server with a
dual-port Emulex FC HBA connected with two fibres that go through two
Brocade switches, which in turn have four paths each to a VNX 5300
storage array - that's a grand total of eight paths.
Now, this server was previously employed as an oVirt node and was
working correctly with the very same configuration; we simply
reinstalled it with the latest Fedora 18 stable release. The server
boots fine and multipathd is already up and running. All the paths are
valid and the block devices are visible. Please note that those block
devices contain our oVirt data center's storage domains. We set up and
enabled oVirt's stable repo, installed vdsmd, and then added the node
from the webadmin interface. Bear in mind that we face the same issue
when we let the engine install the vdsm daemon on its own. The node gets
installed successfully and reboots.
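For reference, here's the kind of quick sanity check we run after boot to
confirm the path count (a sketch, not our exact script; the `count_paths`
helper and the sample usage are ours, not part of any tool):

```shell
#!/bin/sh
# Count path entries in `multipath -ll` output. Each path line carries
# an H:C:T:L tuple such as "1:0:0:1", so we just count lines matching it.
count_paths() {
  grep -cE '[0-9]+:[0-9]+:[0-9]+:[0-9]+'
}

# On this box we expect eight paths per LUN, e.g.:
#   multipath -ll <wwid> | count_paths
```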
When the OS tries to initialize the storage, we get the kernel's
hung-task warning about a random core being stuck for xx seconds. It
then fails to mount /home and finally drops to the emergency shell (see
attachment).
If we remove the host from the storage group and reboot it, it doesn't
hang.
A few things we tried to fix this:
* restored multipath.conf from a working node
* put each mountpoint's UUID in fstab
* installed VDSM from the Fedora updates repo instead of the official
oVirt stable repo
* swapped two different Fujitsu models (we had the same issue with
another server: we swapped them and had the first one join a RHEV
infrastructure running on the same network without shedding a single tear)
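For completeness, the fstab entries we ended up with look roughly like
this (UUIDs are placeholders, not our real ones). We also considered
adding nofail and x-systemd.device-timeout on /home as a workaround so a
slow device wouldn't drop us into the emergency shell, though that would
only mask the underlying problem:

```
# /etc/fstab - sketch with placeholder UUIDs
UUID=aaaaaaaa-0000-0000-0000-000000000001  /      ext4  defaults  1 1
UUID=aaaaaaaa-0000-0000-0000-000000000002  /home  ext4  defaults,nofail,x-systemd.device-timeout=30  1 2
```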
We consciously avoided installing oVirt Node, since we need more control
over our servers than such a minimal distribution would ever let us
have; however, we're willing to deploy it for troubleshooting purposes.
Any feedback will be more than welcome.
Cheers,
Daniele Pavia