Dedicated NICs for gluster network

Hello,

[Here: oVirt 3.5.3, 3 x CentOS 7.0 hosts with a replica-3 gluster SD on the hosts.]

On the switches, I have created a dedicated VLAN to isolate the glusterFS traffic, but I'm not using it yet. I was thinking of creating a dedicated IP for each node's gluster NIC, along with a DNS record ("my_nodes_name_GL"), but I fear that using this hostname or this IP in the oVirt GUI host network interface tab would lead oVirt to think this is a different host.

In case this fear is not clearly described, let's say:
- On each node, I create a second IP (+ DNS record in the SOA) used by gluster, plugged into the correct VLAN.
- In the oVirt GUI, in the host network settings tab, the interface will be seen with its IP, but reverse-DNS-related to a different hostname. Here, I fear oVirt might check this reverse DNS and declare that this NIC belongs to another host.

I would also prefer not to use a reverse pointing to the name of the host management IP, as this is evil and I'm a good guy.

On your side, how do you cope with a dedicated storage network in the case of storage+compute mixed hosts?

-- Nicolas ECARNOT
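As a minimal sketch of the addressing scheme being described here (the zone name, hostnames, and addresses below are purely hypothetical), the records for one node might look like:

; forward zone db.example.lan
serv-al01       IN A      10.10.1.11    ; management NIC
serv-al01-gl    IN A      10.10.2.11    ; gluster NIC, on the dedicated VLAN

; reverse zone for the gluster VLAN (10.10.2.0/24)
11              IN PTR    serv-al01-gl.example.lan.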

Hello Nicolas,

Did you have a look at this: http://www.ovirt.org/Features/Select_Network_For_Gluster ? But it is only available from >=3.6...
--
Nathanaël Blanchet
Supervision réseau
Pôle Infrastructures Informatiques
227 avenue Professeur-Jean-Louis-Viala
34193 MONTPELLIER CEDEX 5
Tél. 33 (0)4 67 54 84 55
Fax 33 (0)4 67 54 84 14
blanchet@abes.fr

Hi Nicolas,

What works for me in 3.6 is creating a new network for gluster within oVirt, marking it for gluster use only, optionally setting a bonded interface upon NICs that are dedicated to gluster traffic, providing it with an IP address without configuring a gateway, and then modifying /etc/hosts so that hostnames are resolvable between nodes. Every node should have two hostnames: one for the ovirtmgmt network that is resolvable via DNS (or via /etc/hosts), and the other for the gluster network that is resolvable purely via /etc/hosts (every node should contain entries for itself and for each gluster node).

Peers should be probed via their gluster hostnames, while ensuring that gluster peer status contains only addresses and hostnames that are dedicated to gluster on each node. The same goes for adding bricks, creating a volume, etc. This way, no communication (except the gluster one) should be allowed through the gluster-dedicated VLAN. To be on the safe side, we can also force gluster to listen only on dedicated interfaces via the transport.socket.bind-address option (haven't tried this one, will do).

Separation of gluster (or, in the future, any storage network), the live migration network, the VM networks and the management network is always a good thing. Perhaps we could manage failover of those networks within oVirt, i.e. in case the lm network is down, use the gluster network for lm and vice versa. A cool candidate for an RFE, but first we need this supported within gluster itself. This may prove useful when there are not enough NICs available to do a bond beneath every defined network. But we can still separate traffic and provide failover by selecting multiple networks, without actually doing any load balancing between the two.

As Nathanaël mentioned, marking a network for gluster use is only available in 3.6. I'm also interested if there is a better way around this procedure, or perhaps a way of enhancing it.

Kind regards,
Ivan
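A minimal sketch of the two-hostname layout Ivan describes, assuming a hypothetical 10.10.2.0/24 gluster VLAN and hypothetical node names — every node would carry /etc/hosts entries like:

10.10.2.11   node1-gl.example.lan   node1-gl
10.10.2.12   node2-gl.example.lan   node2-gl
10.10.2.13   node3-gl.example.lan   node3-gl

and the trusted pool would then be built from one node using only those names:

gluster peer probe node2-gl.example.lan
gluster peer probe node3-gl.example.lan
gluster peer status    # should report only the -gl names/addresses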

Hi Ivan, Nathanaël and everyone,

Today, I finally upgraded this oVirt 3.5.2 into 3.6.1, and I saw the gluster network appear in the network settings. I'm not using it yet because I still have questions. I think I perfectly understood your answer, but there are points I'd like to know better:

- I'm not sure that using DNS or /etc/hosts makes any difference. Once every dedicated NIC is using a dedicated IP address, with the dedicated DNS record + PTR, I see no benefit in using /etc/hosts...?

- The oVirt setup I'm working on in this discussion is a storage+compute setup (oVirt+gluster on the same hosts). That means that my oVirt nodes are both clients and servers of themselves, data-storage speaking. I'm worrying about not specifying any gateway for the gluster network:
  * How will all my hosts - as gluster clients - be able to find a route towards the gluster network (gluster servers)?
  * How will my guests - my VMs as gluster clients - be able to find a route towards the gluster network (indeed, sometimes also provided that way!)?

- Obviously, all this wouldn't be that fun if it wasn't already in production (well OK, QA), and thus I have to change the IP of my gluster nodes (https://www.gluster.org/pipermail/gluster-users/2014-May/017328.html).

-- Nicolas ECARNOT
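On the gateway question, for what it's worth: hosts whose gluster NIC sits directly on the gluster VLAN reach each other through the kernel's connected route, so no gateway is required for traffic that stays inside the subnet. A hypothetical routing table on such a host (addresses and device names are illustrative only):

# ip route show
default via 10.10.1.1 dev em1
10.10.1.0/24 dev em1  proto kernel  scope link  src 10.10.1.11   # mgmt, routed
10.10.2.0/24 dev em2  proto kernel  scope link  src 10.10.2.11   # gluster, connected route only

Clients outside that VLAN (the engine, or VMs on another subnet) would have no route to it unless one is explicitly added.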

Hello,

I'm digging out this thread because I now had the time to work on this subject, and I'm stuck.

This oVirt setup has a standalone engine and 3 hosts. These 3 hosts are hypervisors and gluster nodes, each using one NIC for all the traffic, which is a very bad idea. (Well, it's working, but not recommended.)

I added 3 OTHER nodes, and so far, I have only created the gluster setup and a replica-3 volume. Each of these new nodes now has one NIC for management, one NIC for gluster, and other NICs for other things. Each NIC has an IP + DNS name in its dedicated VLAN: one for mgmt and one for gluster. The mgmt subnet is routed, though the gluster subnet is not. Every node can ping each other, either using the mgmt or the gluster subnet.

The creation of the gluster subnet and volume went very well and seems to be perfect.

Now, in the oVirt web GUI, I'm trying to add these nodes as oVirt hosts. I'm using their mgmt DNS names, and I'm getting: "Error while executing action: Server xxxxxxxx is already part of another cluster."

I found no idea when googling, except something related to gluster (you bet!), telling me this may be related to the fact that there is already a volume, managed with a different name. Obviously, using a different name and IP is what I needed! I used "transport.socket.bind-address" to make sure the gluster traffic would only use the dedicated NICs.

Well, I also tried to create a storage domain relying on the freshly created gluster volume, but as this subnet is not routed, it is not reachable from the manager nor from the existing SPM.

I'm feeling I'm missing something here, so your help is warmly welcome.

Nicolas ECARNOT

PS: CentOS 7.2 everywhere, oVirt 3.6.7
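For reference, transport.socket.bind-address is a glusterd setting configured in /etc/glusterfs/glusterd.vol rather than a volume option; a sketch of what that might look like on one node (the address is hypothetical, and glusterd needs a restart afterwards):

volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    # bind glusterd to the gluster-dedicated NIC only
    option transport.socket.bind-address 10.10.2.11
end-volume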

On Fri, Aug 19, 2016 at 12:29 PM, Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
Hello,
I'm digging out this thread because I now had the time to work on this subject, and I'm stuck.
This oVirt setup has a standalone engine and 3 hosts. These 3 hosts are hypervisors and gluster nodes, each using one NIC for all the traffic, which is a very bad idea. (Well, it's working, but not recommended.)
I added 3 OTHER nodes, and so far, I have only created the gluster setup and a replica-3 volume. Each of these new nodes now has one NIC for management, one NIC for gluster, and other NICs for other things. Each NIC has an IP + DNS name in its dedicated VLAN: one for mgmt and one for gluster. The mgmt subnet is routed, though the gluster subnet is not. Every node can ping each other, either using the mgmt or the gluster subnet.
The creation of the gluster subnet and volume went very well and seems to be perfect.
Now, in the oVirt web gui, I'm trying to add these nodes as oVirt hosts. I'm using their mgmt DNS names, and I'm getting : "Error while executing action: Server xxxxxxxx is already part of another cluster."
Did you peer probe the gluster cluster prior to adding the nodes to oVirt? What's the output of "gluster peer status"?

If I understand correctly:
node1 - mgmt.ip.1 & gluster.ip.1
node2 - mgmt.ip.2 & gluster.ip.2
node3 - mgmt.ip.3 & gluster.ip.3

Did you create a network and assign the "gluster" role to it in the cluster?

Were you able to add the first node to the cluster, and got this error on the second node addition?
From the error, it looks like oVirt does not understand that the peer list returned from gluster is a match with the node being added. Please provide the log snippet of the failure (from engine.log as well as vdsm.log on the node).
I found no idea when googling, except something related to gluster (you bet!), telling me this may be related to the fact that there is already a volume, managed with a different name.
Obviously, using a different name and IP is what I needed! I used "transport.socket.bind-address" to make sure the gluster traffic will only use the dedicated NICs.
Well, I also tried to create a storage domain relying on the freshly created gluster volume, but as this subnet is not routed, it is not reachable from the manager nor from the existing SPM.
The existing SPM - isn't it one of the 3 new nodes being added? Or are you adding the 3 nodes to your existing cluster? If so, I suggest you try adding them to a new cluster.

On 19/08/2016 09:55, Sahina Bose wrote:
On Fri, Aug 19, 2016 at 12:29 PM, Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
Now, in the oVirt web gui, I'm trying to add these nodes as oVirt hosts. I'm using their mgmt DNS names, and I'm getting : "Error while executing action: Server xxxxxxxx is already part of another cluster."
Did you peer probe the gluster cluster prior to adding the nodes to oVirt?
Yes, and using their "gluster subnet" names. It went fine.
What's the output of "gluster peer status"
[root@serv-vm-al04 log]# gluster peer status
Number of Peers: 2

Hostname: serv-vm-al05-data.sdis.isere.fr
Uuid: eddb3c6d-2e98-45ca-bd1f-6d2153bbb60e
State: Peer in Cluster (Connected)

Hostname: serv-vm-al06-data.sdis.isere.fr
Uuid: cafefdf3-ffc3-4589-abf6-6ca76905593b
State: Peer in Cluster (Connected)

On the two other nodes, the same command output is OK.
If I understand correctly:
node1 - mgmt.ip.1 & gluster.ip.1
node2 - mgmt.ip.2 & gluster.ip.2
node3 - mgmt.ip.3 & gluster.ip.3
Right
Did you create a network and assign the "gluster" role to it in the cluster?

I created a gluster network, but did not assign the gluster role so far, as my former 3 hosts had no dedicated NIC nor IP for that. I planned to assign this role once my 3 new hosts were part of the game.

Were you able to add the first node to the cluster, and got this error on the second node addition?

No - I had the error when trying to add the first node.

From the error, it looks like oVirt does not understand that the peer list returned from gluster is a match with the node being added.

Sounds correct.

Please provide the log snippet of the failure (from engine.log as well as vdsm.log on the node)

See attached file.

I found no idea when googling, except something related to gluster (you bet!), telling me this may be related to the fact that there is already a volume, managed with a different name.

Obviously, using a different name and IP is what I needed! I used "transport.socket.bind-address" to make sure the gluster traffic would only use the dedicated NICs.

Well, I also tried to create a storage domain relying on the freshly created gluster volume, but as this subnet is not routed, it is not reachable from the manager nor from the existing SPM.

The existing SPM - isn't it one of the 3 new nodes being added?

No, the SPM is one of the 3 former and still existing hosts.

Or are you adding the 3 nodes to your existing cluster? If so, I suggest you try adding them to a new cluster.

OK, I tried and succeeded in creating a new cluster. In this new cluster, I was ABLE to add the first new host, using its mgmt DNS name. This first host still has to have its NICs configured, and (using Chrome or FF) the access to the network settings window is stalling the browser (I even tried to restart the engine, to no avail). Thus, I can not set up this first node's NICs.

Thus, I can not add any further host, because oVirt relies on a first host to validate the further ones.

-- Nicolas ECARNOT

On Fri, Aug 19, 2016 at 2:33 PM, Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
On 19/08/2016 09:55, Sahina Bose wrote:
Were you able to add the first node to the cluster, and got this error on the second node addition?

No - I had the error when trying to add the first node.
From the error, it looks like oVirt does not understand that the peer list returned from gluster is a match with the node being added.
Sounds correct
Please provide the log snippet of the failure (from engine.log as well as vdsm.log on node)
See attached file
I couldn't view the attached log files for some reason, but the issue is that you're adding a node which is already part of one cluster to a different, existing cluster. That will not work, even from the gluster CLI.
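For illustration, probing a node that already belongs to another trusted pool typically fails from the CLI with a message along these lines (hostname hypothetical):

# gluster peer probe serv-new.example.lan
peer probe: failed: serv-new.example.lan is either already part of another cluster or having volumes configured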
I found no idea when googling, except something related to gluster (you bet!), telling me this may be related to the fact that there is already a volume, managed with a different name.
Obviously, using a different name and IP is what I needed! I used "transport.socket.bind-address" to make sure the gluster traffic will only use the dedicated NICs.
Well, I also tried to create a storage domain relying on the freshly created gluster volume, but as this subnet is not routed, it is not reachable from the manager nor from the existing SPM.
The existing SPM - isn't it one of the 3 new nodes being added?
No, the SPM is one of the 3 former and still existing hosts.
Or are you adding the 3 nodes to your existing cluster? If so, I suggest you try adding this to a new cluster
OK, I tried and succeeded in creating a new cluster. In this new cluster, I was ABLE to add the first new host, using its mgmt DNS name. This first host still has to have its NICs configured, and (using Chrome or FF) the access to the network settings window is stalling the browser (I even tried to restart the engine, to no avail). Thus, I can not set up this first node's NICs.
Thus, I can not add any further host because oVirt relies on a first host to validate the further ones.
Network team should be able to help you here.

On 19/08/2016 13:43, Sahina Bose wrote:
Or are you adding the 3 nodes to your existing cluster? If so, I suggest you try adding this to a new cluster
OK, I tried and succeeded in creating a new cluster. In this new cluster, I was ABLE to add the first new host, using its mgmt DNS name. This first host still has to have its NICs configured, and (using Chrome or FF) the access to the network settings window is stalling the browser (I even tried to restart the engine, to no avail). Thus, I can not set up this first node's NICs.
Thus, I can not add any further host because oVirt relies on a first host to validate the further ones.
Network team should be able to help you here.
OK, there was no way I could continue this way (browser crash), so I tried and succeeded in doing this:
- remove the newly created host and cluster
- create a new DATACENTER
- create a new cluster in this DC
- add the first new host: OK
- add the 2 other new hosts: OK

Now, I can smoothly configure their NICs.

Doing all this, I saw that oVirt detected there already was an existing gluster cluster and volume, and integrated it in oVirt.

Then, I was able to create a new storage domain in this new DC and cluster, using one of the hosts' *gluster* FQDNs. It went nicely.

BUT, when viewing the volume tab and brick details, the displayed brick names are the host DNS names, and NOT the host GLUSTER DNS names.

I'm worrying about this, confirmed by what I read in the logs:

2016-08-19 14:46:30,484 WARN [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler_Worker-100) [107dc2e3] Could not associate brick 'serv-vm-al04-data.sdis.isere.fr:/gluster/data/brick04' of volume '35026521-e76e-4774-8ddf-0a701b9eb40c' with correct network as no gluster network found in cluster '1c8e75a0-af3f-4e97-a8fb-2f7ef3ed9f30'
2016-08-19 14:46:30,492 WARN [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler_Worker-100) [107dc2e3] Could not associate brick 'serv-vm-al05-data.sdis.isere.fr:/gluster/data/brick04' of volume '35026521-e76e-4774-8ddf-0a701b9eb40c' with correct network as no gluster network found in cluster '1c8e75a0-af3f-4e97-a8fb-2f7ef3ed9f30'
2016-08-19 14:46:30,500 WARN [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler_Worker-100) [107dc2e3] Could not associate brick 'serv-vm-al06-data.sdis.isere.fr:/gluster/data/brick04' of volume '35026521-e76e-4774-8ddf-0a701b9eb40c' with correct network as no gluster network found in cluster '1c8e75a0-af3f-4e97-a8fb-2f7ef3ed9f30'

[oVirt shell (connected)]# list clusters

id         : 00000001-0001-0001-0001-000000000045
name       : cluster51
description: Cluster d'alerte de test

id         : 1c8e75a0-af3f-4e97-a8fb-2f7ef3ed9f30
name       : cluster52
description: Cluster d'alerte de test

[oVirt shell (connected)]#

"cluster52" is the recent cluster, and I do have a dedicated gluster network, marked as a gluster network, in the correct DC and cluster. The only points are:
- Each host has its name ("serv-vm-al04") and a second name for gluster ("serv-vm-al04-data").
- Using blahblahblah-data is correct from a gluster point of view.
- Maybe oVirt is disturbed by not being able to ping the gluster FQDN (not routed) and is then throwing this error?

-- Nicolas ECARNOT
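One way to cross-check which names the bricks are actually registered under is to ask gluster directly; for a volume hypothetically named "data":

# gluster volume info data | grep -i brick
Number of Bricks: 1 x 3 = 3
Brick1: serv-vm-al04-data.sdis.isere.fr:/gluster/data/brick04
Brick2: serv-vm-al05-data.sdis.isere.fr:/gluster/data/brick04
Brick3: serv-vm-al06-data.sdis.isere.fr:/gluster/data/brick04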

On Fri, Aug 19, 2016 at 6:20 PM, Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
OK, there was no way I could continue this way (browser crash), so I tried and succeeded in doing this:
- remove the newly created host and cluster
- create a new DATACENTER
- create a new cluster in this DC
- add the first new host: OK
- add the 2 other new hosts: OK
Now, I can smoothly configure their NICs.
Doing all this, I saw that oVirt detected there already was an existing gluster cluster and volume, and integrated it in oVirt.
Then, I was able to create a new storage domain in this new DC and cluster, using one of the hosts' *gluster* FQDNs. It went nicely.
BUT, when viewing the volume tab and brick details, the displayed brick names are the host DNS names, and NOT the host GLUSTER DNS names.
I'm worrying about this, confirmed by what I read in the logs :
2016-08-19 14:46:30,484 WARN [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler_Worker-100) [107dc2e3] Could not associate brick 'serv-vm-al04-data.sdis.isere.fr:/gluster/data/brick04' of volume '35026521-e76e-4774-8ddf-0a701b9eb40c' with correct network as no gluster network found in cluster '1c8e75a0-af3f-4e97-a8fb-2f7ef3ed9f30'
2016-08-19 14:46:30,492 WARN [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler_Worker-100) [107dc2e3] Could not associate brick 'serv-vm-al05-data.sdis.isere.fr:/gluster/data/brick04' of volume '35026521-e76e-4774-8ddf-0a701b9eb40c' with correct network as no gluster network found in cluster '1c8e75a0-af3f-4e97-a8fb-2f7ef3ed9f30'
2016-08-19 14:46:30,500 WARN [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler_Worker-100) [107dc2e3] Could not associate brick 'serv-vm-al06-data.sdis.isere.fr:/gluster/data/brick04' of volume '35026521-e76e-4774-8ddf-0a701b9eb40c' with correct network as no gluster network found in cluster '1c8e75a0-af3f-4e97-a8fb-2f7ef3ed9f30'
[oVirt shell (connected)]# list clusters
id         : 00000001-0001-0001-0001-000000000045
name       : cluster51
description: Cluster d'alerte de test

id         : 1c8e75a0-af3f-4e97-a8fb-2f7ef3ed9f30
name       : cluster52
description: Cluster d'alerte de test
[oVirt shell (connected)]#
"cluster52" is the recent cluster, and I do have a dedicated gluster network, marked as gluster network, in the correct DC and cluster. The only point is that : - Each host has its name ("serv-vm-al04") and a second name for gluster ("serv-vm-al04-data"). - Using blahblahblah-data is correct on a gluster point of view - Maybe oVirt is disturb not to be able to ping the gluster FQDN (not routed) and then throwing this error?
We do have a limitation currently: if you use multiple FQDNs, oVirt cannot associate them to the gluster bricks correctly. This will be a problem only when you try brick management from oVirt - i.e. try to remove or replace a brick from oVirt. For monitoring brick status and detecting bricks, this is not an issue, and you can ignore the error in the logs.

Adding Ramesh, who has a patch to fix this.

On 08/22/2016 11:24 AM, Sahina Bose wrote:
On Fri, Aug 19, 2016 at 6:20 PM, Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
"cluster52" is the recent cluster, and I do have a dedicated gluster network, marked as gluster network, in the correct DC and cluster. The only point is that : - Each host has its name ("serv-vm-al04") and a second name for gluster ("serv-vm-al04-data"). - Using blahblahblah-data is correct on a gluster point of view - Maybe oVirt is disturb not to be able to ping the gluster FQDN (not routed) and then throwing this error?
We do have a limitation currently: if you use multiple FQDNs, oVirt cannot associate them to the gluster bricks correctly. This will be a problem only when you try brick management from oVirt - i.e. try to remove or replace a brick from oVirt. For monitoring brick status and detecting bricks, this is not an issue, and you can ignore the error in the logs.
Adding Ramesh, who has a patch to fix this.
Patch https://gerrit.ovirt.org/#/c/60083/ is posted to address this issue. But it will work only if the oVirt Engine can resolve the FQDN 'serv-vm-al04-data.xx' to an IP address which is mapped to the gluster NIC (the NIC with the gluster network) on the host.

Sahina: Can you review the patch :-)

Regards,
Ramesh
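Under that assumption, the engine machine itself (not only the hosts) must be able to resolve the gluster names; since the gluster subnet only needs to be resolvable from the engine, not routed, hypothetical /etc/hosts entries on the engine would do:

# on the oVirt engine machine - addresses hypothetical
10.10.2.14   serv-vm-al04-data.sdis.isere.fr
10.10.2.15   serv-vm-al05-data.sdis.isere.fr
10.10.2.16   serv-vm-al06-data.sdis.isere.fr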

On 22/08/2016 08:10, Ramesh Nachimuthu wrote:
We do have a limitation currently: if you use multiple FQDNs, oVirt cannot associate them to the gluster bricks correctly. This will be a problem only when you try brick management from oVirt - i.e. try to remove or replace a brick from oVirt. For monitoring brick status and detecting bricks, this is not an issue, and you can ignore the error in the logs.
Hi Sahina and Ramesh,

What you wrote looks a lot like what I witnessed ("oVirt cannot associate it to the gluster brick correctly"): oVirt is trying to associate, and succeeds, but using the host FQDN, and not the host gluster FQDN.
That leads to a situation where oVirt is seeing the volume correctly (name, number of bricks), but:
- I can not add nor manage the bricks, as you wrote;
- the size is not reported;
- the brick FQDNs are not correct, as we just wrote.

At present, this is not very disturbing, but one major issue I witnessed twice was this: I tried to roughly reboot a host, which at this time was only used as a gluster node and was not running any VM. I saw my complete oVirt DC crash in flames, maybe because of a STONITH storm (some hosts were power-managed the hard way). I still have to reproduce this issue and provide you the log files, but before going further, please tell me if it's worth it on this 3.6.7 setup, or must I first upgrade to 4.x?
Adding Ramesh, who has a patch to fix this.
Patch https://gerrit.ovirt.org/#/c/60083/ is posted to address this issue. But it will work only if the oVirt Engine can resolve the FQDN 'serv-vm-al04-data.xx' to an IP address which is mapped to the gluster NIC (the NIC with the gluster network) on the host.

-- Nicolas ECARNOT
participants (5)
- Ivan Bulatovic
- Nathanaël Blanchet
- Nicolas Ecarnot
- Ramesh Nachimuthu
- Sahina Bose