Dear all,
We're facing an issue while trying to join a Fedora 18 server running the
latest stable VDSM to an ovirt-engine 3.2.2 instance.
Let's put things in context: we have a Fujitsu RX200 server with a
dual-port Emulex FC HBA connected with two fibres that go through two
Brocade switches, which in turn have four paths each to a VNX 5300
storage array - that's a grand total of eight paths.
Now, this server was previously employed as an oVirt node and was
working correctly with the very same configuration; we simply
reinstalled it with the latest Fedora 18 stable release. The server
boots fine and multipathd is already up and running. All the paths are
valid and the block devices are visible. Please note that those block
devices contain our oVirt data center's storage domains. We set up and
enabled oVirt's stable repo, installed vdsmd, and then added the node
from the webadmin interface. Bear in mind that we face the same issue
when we let the engine install the vdsm daemon on its own. The node gets
installed successfully and reboots.
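For reference, here's the kind of quick sanity check we run after boot to
confirm the path count (a sketch, not our exact script; the `count_paths`
helper and the sample usage are ours, not part of any tool):

```shell
#!/bin/sh
# Count path entries in `multipath -ll` output. Each path line carries
# an H:C:T:L tuple such as "1:0:0:1", so we just count lines matching it.
count_paths() {
  grep -cE '[0-9]+:[0-9]+:[0-9]+:[0-9]+'
}

# On this box we expect eight paths per LUN, e.g.:
#   multipath -ll <wwid> | count_paths
```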
When the OS tries to initialize the storage, we get the kernel's
hung-task warning about a random core being stuck for xx seconds. It
then fails to mount /home and finally drops to the emergency shell (see
attachment).
If we remove the host from the storage group and reboot it, it doesn't
hang.
A few things we tried to fix this:
* restored multipath.conf from a working node
* put each mountpoint's UUID in fstab
* installed VDSM from the Fedora updates repo instead of the official
oVirt stable repo
* swapped two different Fujitsu models (we had the same issue with
another server: we swapped them and had the first one join a RHEV
infrastructure running on the same network without shedding a single tear)
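For completeness, the fstab entries we ended up with look roughly like
this (UUIDs are placeholders, not our real ones). We also considered
adding nofail and x-systemd.device-timeout on /home as a workaround so a
slow device wouldn't drop us into the emergency shell, though that would
only mask the underlying problem:

```
# /etc/fstab - sketch with placeholder UUIDs
UUID=aaaaaaaa-0000-0000-0000-000000000001  /      ext4  defaults  1 1
UUID=aaaaaaaa-0000-0000-0000-000000000002  /home  ext4  defaults,nofail,x-systemd.device-timeout=30  1 2
```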
We consciously avoided installing oVirt Node, since we need more control
over our servers than such a minimal distribution would ever let us
have; however, we're willing to deploy it for troubleshooting purposes.
Any feedback will be more than welcome.
Cheers,
Daniele Pavia