[Users] Clarify message: "Failed to connect Host to Storage Pool Default"

Omer Frenkel ofrenkel at redhat.com
Wed Aug 28 07:51:02 UTC 2013



----- Original Message -----
> From: "Gianluca Cecchi" <gianluca.cecchi at gmail.com>
> To: "Omer Frenkel" <ofrenkel at redhat.com>
> Cc: "users" <users at ovirt.org>
> Sent: Tuesday, August 27, 2013 4:57:45 PM
> Subject: Re: [Users] Clarify message: "Failed to connect Host to Storage Pool Default"
> 
> On Tue, Aug 27, 2013 at 3:14 PM, Omer Frenkel wrote:
> >
> >
> > ----- Original Message -----
> >> From: "Gianluca Cecchi" <gianluca.cecchi at gmail.com>
> >> To: "users" <users at ovirt.org>
> >> Sent: Monday, August 26, 2013 6:44:39 PM
> >> Subject: [Users] Clarify message: "Failed to connect Host to Storage Pool
> >> Default"
> >>
> >> Hello,
> >> after an induced failure of a whole site to test reaction and
> >> restart, what would be the sequence of actions to take, from a physical
> >> point of view and from a gui point of view, after powering on the hw
> >> components?
> 
> [snip]
> 
> >> I'm going to eventually send full logs, but I would like to ask if it
> >> is possible to send clearer messages inside the gui, for example what
> >> are the SDs that the host cannot access in case there are many of
> >> them.
> >
> > afair, the SDs are not listed in the audit log since you might have 20
> > domains or more,
> > and it would not look good, so the full information is in the log,
> > and the audit log just gives you general information about what is wrong.
> 
> OK. What about recording the first one (say SD1) and putting in the "audit
> log" (does this term mean what is displayed in the web admin gui?) something
> like
> 
> "Host XXX cannot access at least storage domain SD1 attached to the
> Data Center Default. See logfile (which one? put the path in message)
> for full log. Setting Host state to Non-Operational."
> 
> Does this mean that if only one out of 20 SDs cannot be
> reconnected, the whole DC is automatically impacted?
> 
> Questions:
> 1) suppose one out of 20 SDs cannot be reconnected (hw failure
> caused by power fault)
> what are the steps to correct/acknowledge the failure and at least
> start the VMs that do not depend on it, while one analyzes the
> problem and resolves it?
> 

if only one (or a few) domains are problematic then the dc should be able to recover to up state,
and only these domains will be in 'inactive' status.
vms that do not depend on these domains should work ok.
this should happen automatically, no manual steps needed.
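
to make the expected outcome concrete, here is a toy model of the behaviour
described above. it is only a sketch and not engine code: the class names,
the "not up" label and the rule "dc is up if at least one domain is reachable"
are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StorageDomain:
    name: str
    reachable: bool = True

    @property
    def status(self) -> str:
        # Unreachable domains are reported as 'inactive', the rest as 'active'.
        return "active" if self.reachable else "inactive"


@dataclass
class VM:
    name: str
    domains: list  # names of the storage domains holding this VM's disks


def dc_status(domains):
    # Assumption: the DC can come up as long as some domain is still reachable.
    return "up" if any(d.reachable for d in domains) else "not up"


def runnable_vms(vms, domains):
    # A VM is considered runnable only if every domain it uses is active.
    active = {d.name for d in domains if d.reachable}
    return [vm.name for vm in vms if set(vm.domains) <= active]
```

with SD1 unreachable and SD2, SD3 fine, dc_status() gives "up", SD1 shows
'inactive', and only the vms whose disks live entirely on SD2/SD3 are listed
as runnable.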

> 2) suppose that the particular faulty SD is the one that was the SD
> Master before crash, does this mean I am forced to use some db
> commands to switch it to an available SD or can I follow steps in 1)
> (if there are...) and another SD will be automatically "elected" as
> the new Master?
> 

no, there is a mechanism to change the master domain to another available domain,
assuming one exists.
this is the "reconstruct master" operation that you see in the logs.
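
the idea can be sketched roughly as below. this is a hypothetical
illustration, not the actual reconstruct master code: the function name is
made up, and picking the first available domain by name is an arbitrary
choice, since the real selection criteria are not described in this thread.

```python
def elect_master(domains, current_master):
    """Return the name of the domain that should be master, or None.

    domains: dict mapping storage domain name -> True if it is available.
    current_master: name of the domain that was master before the failure.
    """
    # Keep the current master if it is still available.
    if domains.get(current_master, False):
        return current_master
    # Otherwise fall back to another available domain (assumed rule:
    # first one in alphabetical order).
    candidates = sorted(name for name, ok in domains.items() if ok)
    return candidates[0] if candidates else None
```

for example, elect_master({"SD1": False, "SD2": True, "SD3": True}, "SD1")
returns "SD2"; if no domain is available it returns None and no new master
can be chosen.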

> >
> > sounds like error connecting to your storage
> 
> Yes, in my simulation I have an IBM DS6800 where I can formally reach
> the SAN disks from the hosts but the TUR command configured in multipath
> fails (and for example the command "fdisk -l /dev/sdb", where sdb is one
> disk on the san, exits with error "invalid parameter" due to incorrect
> DS6800 configuration)
> 
> Thanks in advance.
> 
> Gianluca
> 


