[Users] Will this two node concept scale and work?

Hello List,

my aim is to host multiple VMs which are redundant and highly available. It should also scale well.

I think usually people just buy a fat iSCSI storage and attach it. In my case it should scale well from very small nodes to big ones. An iSCSI target will bring a lot of overhead (10 GBit links and two paths, and really I should have a second hot-standby SAN, too). This makes scalability very hard. This post is also not meant to be an iSCSI discussion.

Since oVirt does not support DRBD out of the box, I came up with my own concept: http://oi62.tinypic.com/2550xg5.jpg

As far as I can tell, I have the following advantages:
- I can start with two simple, cheap nodes.
- I could add more disks to my nodes, maybe even an SSD as a dedicated DRBD resource.
- I can connect the two nodes directly to each other with bonding or InfiniBand; I don't need a switch between them.

Downside:
- I always need two nodes (as a couple).

Will this setup work for me? So far I think I will be quite happy with it. Since the DRBD resources are shared in dual-primary mode, I am not sure if oVirt can handle it. It is not allowed to write to a VM disk at the same time.

The concept of Linbit (http://www.linbit.com/en/company/news/333-high-available-virtualization-at-a...) seems too much of an overhead with the iSCSI layer and Pacemaker setup. It's just too much for such a simple task.

Please tell me that this concept is great and will work and scale well. Otherwise I am also thankful for any hints or critical ideas.

Thanks a lot,
Mario

See inline. On 02/05/2014 10:45 AM, ml ml wrote:
Hello List,
my aim is to host multiple VMs which are redundant and highly available. It should also scale well.
I'm assuming you are talking about an HA cluster, since redundant VMs and HA VMs are a contradiction :)
I think usually people just buy a fat iSCSI storage and attach it. In my case it should scale well from very small nodes to big ones. An iSCSI target will bring a lot of overhead (10 GBit links and two paths, and really I should have a second hot-standby SAN, too). This makes scalability very hard.
This post is also not meant to be an iSCSI discussion.
Since oVirt does not support DRBD out of the box, I came up with my own concept:
Check out the POSIX storage domain. If it supports gluster, you might be able to use it for DRBD.
http://oi62.tinypic.com/2550xg5.jpg
As far as I can tell, I have the following advantages:
- I can start with two simple, cheap nodes.
- I could add more disks to my nodes, maybe even an SSD as a dedicated DRBD resource.
- I can connect the two nodes directly to each other with bonding or InfiniBand; I don't need a switch between them.
Downside:
- I always need two nodes (as a couple).
Will this setup work for me? So far I think I will be quite happy with it. Since the DRBD resources are shared in dual-primary mode, I am not sure if oVirt can handle it. It is not allowed to write to a VM disk at the same time.
It's not true that you cannot write to the same VM disk at the same time: there is a shared disk option.
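For context, the dual-primary mode under discussion is enabled in the DRBD resource configuration, alongside policies for handling split brain. A minimal sketch in DRBD 8.x syntax; the resource name, hostnames, devices, and addresses are made up, and this does not replace proper fencing:

```
resource r0 {
    net {
        protocol C;                  # synchronous replication
        allow-two-primaries yes;     # both nodes may hold the resource primary
        # what DRBD should attempt when it detects a split brain:
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    on node1 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   192.168.10.1:7789;
        meta-disk internal;
    }
    on node2 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   192.168.10.2:7789;
        meta-disk internal;
    }
}
```

The after-sb policies only cover the trivial recovery cases; as discussed elsewhere in this thread, dual primary without fencing is still risky.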
The concept of Linbit (http://www.linbit.com/en/company/news/333-high-available-virtualization-at-a...) seems too much of an overhead with the iSCSI layer and Pacemaker setup. It's just too much for such a simple task.
Please tell me that this concept is great and will work and scale well. Otherwise I am also thankful for any hints or critical ideas.
Thanks a lot, Mario
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
-- Dafna Ron

Hello Ron, thanks for your reply.
This post is also not meant to be an iSCSI discussion.
Since oVirt does not support DRBD out of the box, I came up with my own concept:
Check out the POSIX storage domain. If it supports gluster, you might be able to use it for DRBD.
Sorry, I don't quite understand how POSIX will work with gluster here. What would the architecture look like, and on what layer would it replicate? With "gluster", GlusterFS comes to mind. How does DRBD come into place here? Thanks a lot! Mario

From: "ml ml" <mliebherr99@googlemail.com> To: Users@ovirt.org Sent: Wednesday, February 5, 2014 12:45:55 PM Subject: [Users] Will this two node concept scale and work?
Hello List,
my aim is to host multiple VMs which are redundant and highly available. It should also scale well.
I think usually people just buy a fat iSCSI storage and attach it. In my case it should scale well from very small nodes to big ones. An iSCSI target will bring a lot of overhead (10 GBit links and two paths, and really I should have a second hot-standby SAN, too). This makes scalability very hard.
This post is also not meant to be an iSCSI discussion.
Since oVirt does not support DRBD out of the box i came up with my own concept:
http://oi62.tinypic.com/2550xg5.jpg
As far as I can tell, I have the following advantages:
- I can start with two simple, cheap nodes.
- I could add more disks to my nodes, maybe even an SSD as a dedicated DRBD resource.
- I can connect the two nodes directly to each other with bonding or InfiniBand; I don't need a switch between them.
Downside:
- I always need two nodes (as a couple).
Will this setup work for me? So far I think I will be quite happy with it. Since the DRBD resources are shared in dual-primary mode, I am not sure if oVirt can handle it. It is not allowed to write to a VM disk at the same time.
I don't know ovirt enough to comment on that.

I did play in the past with DRBD and libvirt (virsh). Note that having both nodes primary all the time for all resources is calling for a disaster. In any case of split brain, for any reason, DRBD will not know what to do.

What I did was to allow both to be primary, but had only one primary most of the time (per resource). I wrote a script to do migration, which made both primary for the duration of the migration (required by qemu) and then moved the source to secondary when the migration finished. This way you still have a chance for a disaster if there is a problem (split brain, node failure) during a migration. So if you decide to go this way, carefully plan and test to see that it works well for you. One source of a split brain, for me, at the time, was buggy NIC drivers and bad bonding configuration. So test that well too, if applicable.

The approach I took seems similar to "DRBD on LV level" in [1], but with custom scripts and without oVirt. You might be able to make oVirt do this for you with hooks. Didn't try that.

An obvious downside to this approach is that if one node in a pair is down, the other has no backup now. If you have multiple nodes and external shared storage, multiple nodes can be down with no disruption to service if the remaining nodes are capable enough.
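The promote/migrate/demote sequence Didi describes could be sketched as a small wrapper like the one below. This is a hypothetical sketch, not Didi's actual script: the resource name, domain name, and peer host are placeholders, and with DRY_RUN=1 (the default here) the drbdadm/virsh commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of a DRBD-aware live-migration wrapper (hypothetical names).
# With DRY_RUN=1 (the default) the drbdadm/virsh commands are only
# printed, so the sequence can be inspected without touching anything.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# migrate_vm RESOURCE DOMAIN PEER
migrate_vm() {
    res=$1 dom=$2 peer=$3
    # 1. Promote the destination too: dual primary only for the
    #    duration of the migration, as qemu requires.
    run ssh "$peer" drbdadm primary "$res"
    # 2. Live-migrate the guest.
    run virsh migrate --live "$dom" "qemu+ssh://$peer/system"
    # 3. Demote the source so only one primary remains.
    run drbdadm secondary "$res"
}

migrate_vm vm01-disk vm01 node2
```

Error handling is deliberately omitted; a real script would check each step and demote again on failure, since a failed migration can leave both sides primary, which is exactly the window for disaster mentioned above.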
[1] http://www.ovirt.org/Features/DRBD

Best regards,
--
Didi

On Wed, 2014-02-05 at 08:49 -0500, Yedidyah Bar David wrote:
From: "ml ml" <mliebherr99@googlemail.com> To: Users@ovirt.org Sent: Wednesday, February 5, 2014 12:45:55 PM Subject: [Users] Will this two node concept scale and work?
Hello List,
my aim is to host multiple VMs which are redundant and highly available. It should also scale well.
I think usually people just buy a fat iSCSI storage and attach it. In my case it should scale well from very small nodes to big ones.
An iSCSI target will bring a lot of overhead (10 GBit links and two paths, and really I should have a second hot-standby SAN, too). This makes scalability very hard.
This post is also not meant to be an iSCSI discussion.
Since oVirt does not support DRBD out of the box, I came up with my own concept:
http://oi62.tinypic.com/2550xg5.jpg
As far as I can tell, I have the following advantages:
- I can start with two simple, cheap nodes.
- I could add more disks to my nodes, maybe even an SSD as a dedicated DRBD resource.
- I can connect the two nodes directly to each other with bonding or InfiniBand; I don't need a switch between them.
Downside:
- I always need two nodes (as a couple).
Will this setup work for me? So far I think I will be quite happy with it.
Since the DRBD resources are shared in dual-primary mode, I am not sure if oVirt can handle it. It is not allowed to write to a VM disk at the same time.
I don't know ovirt enough to comment on that.
I did play in the past with DRBD and libvirt (virsh). Note that having both nodes primary all the time for all resources is calling for a disaster. In any case of split brain, for any reason, DRBD will not know what to do.
I second that; I had many problems without proper fencing, and even with fencing.
What I did was to allow both to be primary, but had only one primary most of the time (per resource). I wrote a script to do migration, which made both primary for the duration of the migration (required by qemu) and then moved the source to secondary when the migration finished. This way you still have a chance for a disaster if there is a problem (split brain, node failure) during a migration.

So if you decide to go this way, carefully plan and test to see that it works well for you. One source of a split brain, for me, at the time, was buggy NIC drivers and bad bonding configuration. So test that well too, if applicable.
The approach I took seems similar to "DRBD on LV level" in [1], but with custom scripts and without ovirt.
You might be able to make oVirt do this for you with hooks. Didn't try that.
You could use DRBD 9, but I haven't tested it extensively yet. DRBD 9 has primary-on-write, so you have both sides passive until one of the nodes wants to write; it should automatically become primary then. This has been done by Linbit to decrease split brain and to expand to more than two nodes: http://www.drbd.org/users-guide-9.0/s-automatic-promotion.html

But I don't know why it shouldn't work; maybe not with the node image, but you can make a node of a normal RHEL/CentOS/Fedora install. One problem I always have with DRBD and RHEL/CentOS is that when you don't pay for the Linbit support, you don't get access to the repo, and DRBD is an additional option on RHEL. On CentOS and Fedora the version is always lagging behind, so I have to compile the kernel module every time for a new version or kernel update.
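For reference, the automatic promotion Jorick describes corresponds to the auto-promote option in the DRBD 9 resource configuration. A sketch; the resource name is hypothetical, and per the linked guide auto-promote is already on by default in DRBD 9:

```
resource r0 {
    options {
        auto-promote yes;  # become primary automatically when the
                           # device is opened for writing
    }
    # connection/volume sections as usual ...
}
```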
An obvious downside to this approach is that if one node in a pair is down, the other has no backup now. If you have multiple nodes and external shared storage, multiple nodes can be down with no disruption to service if the remaining nodes are capable enough.
[1] http://www.ovirt.org/Features/DRBD
Best regards, --
Didi

Hi, this one led me to the question of which DRBD version is or will be available in EL 6/7 (upcoming). My search so far just revealed there is no officially supported version for EL6 and, maybe even worse, as far as I looked, it will not even be supported in EL7: https://access.redhat.com/site/discussions/669243 But I didn't check if the kernel modules are disabled or just unsupported. On 05.02.2014 15:13, Jorick Astrego wrote:
You could use DRBD 9, but I haven't tested it extensively yet.
--
Mit freundlichen Grüßen / Regards

Sven Kieske
Systemadministrator
Mittwald CM Service GmbH & Co. KG
Königsberger Straße 6
32339 Espelkamp
T: +49-5772-293-100
F: +49-5772-293-333
https://www.mittwald.de
Managing Director (Geschäftsführer): Robert Meyer
St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen

On Thu, Feb 6, 2014 at 9:05 AM, Sven Kieske wrote:
Hi,
this one led me to the question of which DRBD version is or will be available in EL 6/7 (upcoming).
My search so far just revealed there is no officially supported version for EL6 and, maybe even worse, as far as I looked, it will not even be supported in EL7:
https://access.redhat.com/site/discussions/669243
But I didn't check if the kernel modules are disabled or just unsupported.
If you want complete support for DRBD on RHEL (5.x and 6.x at the moment, as 7 is still in beta), see here (the main part is accessible also without a Red Hat portal login): https://access.redhat.com/site/solutions/32085

On that link there are also elrepo links to DRBD 8.3 and 8.4 packages that could be used to begin using and evaluating it for your needs (so not supported).

You can also download from Linbit after a free login registration, or otherwise download the source code. The partnership should be there since 2011: http://www.linbit.com/en/company/news/12-linbit-enhances-red-hat-enterprise-...

HIH, Gianluca
participants (6)
-
Dafna Ron
-
Gianluca Cecchi
-
Jorick Astrego
-
ml ml
-
Sven Kieske
-
Yedidyah Bar David