
I have for many years used Gluster because... well, 3 nodes, and as long as I can pull a drive out I can get my data, and with three copies I have a much higher chance of getting it. Downsides to Gluster: it's slower (it's my home... meh... and I have SSDs to avoid MTBF issues), and with VDO and thin provisioning I have not had issues. BUT... Gluster seems to be falling out of favor, especially as I move towards OCP. So... Ceph. I have one SSD in each of the three servers, so I have some space to play. I googled around and found no clean deployment notes or guides on Ceph + oVirt. Comments or ideas? -- penguinpages <jeremey.wise@gmail.com>

If you can't go direct, how about going roundabout, with an iSCSI gateway?

CEPH requires at least 4 nodes to be "good". I know that Gluster is not the "favourite child" for most vendors, yet it is still optimal for HCI. You can check https://www.ovirt.org/develop/release-management/features/storage/cinder-int... for Cinder integration. Best Regards, Strahil Nikolov

Thanks for the response. That seems a bit too far into "bleeding edge" territory, such that I should kick the tires virtually rather than commit those plugins to my oVirt + Gluster environment, where upgrades and other issues may happen. It seems to be at an alpha stage (no thin provisioning, issues with deleting volumes, no export/import, which is a big one for me). Do we have a direction on where / whether it will become more of a first-class citizen in oVirt? 4.?? Maybe others in the community have it and it is working for them.

Yes, we manage a number of distributed storage systems including MooseFS, Ceph, DRBD and of course Gluster (since 3.3). Each has a specific use. For small customer-specific VM host clusters, which is the majority of what we do, Gluster is by far the safest and easiest to deploy/understand for the more junior members of the team. We have never lost a VM image on Gluster, which can't be said about the others (including Ceph, but that disaster was years ago and somewhat self-inflicted). The point is that it is hard to shoot yourself in the foot with Gluster. The newer innovations in Gluster, such as sharding and the arbiter node, have allowed it to be competitive on the performance/hassle factor.

Our Ceph cluster is on one of the few larger host installations we have and is mostly handled by a more senior tech who has lots of experience with it. He clearly loves it and doesn't understand why we aren't fans, but it just seems to be overkill for the typical 3-host VM cluster. The rest of us worry about him getting hit by a bus.

For the record, I really like MooseFS, but not for live VMs; we use it for archiving, and it is the easiest to maintain as long as you are paranoid with the "master" server, which provides the metadata index for the chunkserver nodes.

My hope for Gluster is that it is able to continue to improve with some of the new ideas such as the thin arbiter and keep that performance/hassle ratio high. My worry is that IBM/Red Hat makes more money on Ceph consulting than on Gluster, and thus contributes to the idea that Gluster is a deprecated technology.
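For anyone unfamiliar with the sharding and arbiter features mentioned above, here is a minimal sketch (in Python, wrapping the standard gluster CLI) of creating a replica-3/arbiter-1 volume with sharding enabled. The node names, brick paths, volume name and shard size are placeholders, not settings taken from this thread.

```python
import subprocess

# Placeholder nodes and brick paths -- adjust for your own cluster.
BRICKS = [
    "node1:/gluster_bricks/vmstore/brick",
    "node2:/gluster_bricks/vmstore/brick",
    "node3:/gluster_bricks/vmstore/brick",  # third brick acts as the arbiter (metadata only)
]

def gluster(*args):
    """Run a gluster CLI command and stop on the first error."""
    subprocess.run(["gluster", *args], check=True)

# Two full data copies plus one arbiter brick: the quorum behaviour of replica 3
# at roughly the capacity cost of replica 2.
gluster("volume", "create", "vmstore", "replica", "3", "arbiter", "1", *BRICKS)

# Sharding splits large VM images into fixed-size pieces, so self-heal traffic is
# proportional to the changed shards rather than to whole disk images.
gluster("volume", "set", "vmstore", "features.shard", "on")
gluster("volume", "set", "vmstore", "features.shard-block-size", "64MB")

gluster("volume", "start", "vmstore")
```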

Thanks a lot for this feedback! I've never had any practical experience with Ceph, MooseFS, BeeGFS or Lustre. GlusterFS to me mostly had the charm of running on 1/2/3 nodes, with anything beyond that offering a balanced trade-off between resilience and performance... in theory, of course. And on top, the fact that (without sharding), if hell broke loose, you'd always have access to the files on the file system below was a great help in building up enough confidence to go and try it. Politics and the real world came much later, and from my experience with TSO, 370s, Lotus Notes and QuickTransit, I can appreciate the destructive power of IBM: let's just hope they don't give in to the temptation of "streamlining their offerings" in the wrong direction.

Greetings. First, the legalese: the below is all personal view/experience. I am not speaking on my employer's behalf on any of it; I just happen to have most of the experience from my work for them... yadda yadda yadda... blah blah blah... and so on and so forth. This is all just me and my thoughts/opinions. :-D *sigh* It sucks we are in a world where that garbage has to be said when talking about work-related experiences... Anyway...

We've been running CephFS since Firefly in 2014. Yeah, I know. We were crazy, but the risk of data loss vs. speed was within the threshold of what we were trying to do. Fast-forward six years and we've got two CephFS clusters as primary storage for high-performance clusters where we very much care about performance AND the risk of data loss. We've also got two deployments of oVirt with CephFS as the filesystem. In other words, I've got some experience with this setup and we are /very/ happy with it. :-) I'm so happy with it that it is easier/faster for me to list the bad than to list the good.

1. Red Hat (our OS, to satisfy the "enterprise" check-box for the audit-heads) and I have gone round and round multiple times over the years. In short, don't expect excellent support out of oVirt for Ceph. Want to use Ceph via iSCSI or Cinder? Whooo boy, do I have some horror stories for you! One of the many reasons we prefer CephFS. But say that to them and you get blank looks until they've escalated the ticket sufficiently high up the chain, and even then it's not reassuring... However, if you pass CephFS to oVirt as NFS it works... but you don't get the high-availability or high-performance aspect of scaling your metadata nodes when coming from oVirt. You _SHOULD_ scale your metadata nodes (as with everything in Ceph, scaling in threes is best), but oVirt won't let you mount "cephmds01,cephmds02,cephmds03". It will gladly tell you that it works, but the moment you start a VM on it oVirt freaks out, and it has since I reported it years ago (I recently confirmed this again on 4.4 with CentOS 8). But if you just mount "cephmds01" and then hack around on your IP routes in your switch to handle the distribution of the data, it's fine. Honestly, even if you just mount a single host, as long as you /know/ that and you _plan_ upgrades/failures/etc. around it, it's still fine. It just really sucks that RH pushes Ceph and claims it's a valued FS, but then doesn't really support anything but their cloud variations of Ceph, and if you step out of their very narrow definitions you get a *shrug*. Sigh... anyway... digressing from that, as this isn't the time/place for my rants. :-D Point being, if you are going RH, don't expect to use any of their helper scripts or minimal install builds or anything like that. Do a minimal OS install, add the CephFS drivers, then install oVirt (or... I forget what they call it...) and configure Ceph like you would NFS. It should be fine afterwards. I've rarely found significant differences between the community version of oVirt and the RH version (when comparing same/similar versions), including the support for Ceph.

2. We get incredible performance out of Ceph, but it does require tuning. Ceph crushes the pre-packaged vendors we ran tests against, but part of the reason is that it is flexible enough that we can swap out the bits that we need to scale - and we can do that FAR cheaper than the pre-packaged solutions allow. Yes, in threes for the servers: three metadata nodes, three monitors (we double those two services on the same servers), and storage in blocks of three. If your SSDs are fast enough, 1 SSD per every two spinning disks is a great ratio. And rebuild times across the cluster are only as fast as your back-plane, so you should have a dedicated back-plane network in addition to your primary network. Everyone wants their primary network fast, but your back-plane should be equally fast if not faster (and no, don't add just one "fast" network - it should be two). So you are going to need to plan and tweak your install. Just throwing parts at Ceph and expecting it to work will get you mixed results at best.

3. I'd never run it at my home. My home oVirt system mounts NFS to a ZFS filesystem. Nothing fancy either: striped mirrors ensure good read/write speed with good fault tolerance. I threw in two cheap SSDs as a log drive and a cache drive (these two SSDs made HUGE performance gains for oVirt VMs) and it's been smooth sailing since. It's trivial to manage/upgrade and FAR less overhead than Ceph.

That's really just the warnings I've got for you. I'm a HUGE fan of oVirt; we've done some pretty nutty stuff with it in testing, and I trust it for multiple environments where we throw some pretty heavy loads at it. I've got TONS of praise for oVirt and the whole team that backs it. It's fantastic. And I do love Ceph (and specifically CephFS), and we get incredible performance that I could gush over all day long.

If you are planning on building Ceph on the cheap, plan replication in sets of three and prepare for lots of tweaking and tuning. If you are in a position to buy, I *HIGHLY* recommend at least talking to https://softiron.com (I do not work for them, I do not get any kick-back from them, I'm just very pleased with their product). They focus on Ceph and they do it well, but they still let you tweak as needed. And since they build off of Arm processors, all the power and heat come from the drives; these things run super-cool. Loads more efficient than the home-built stuff we ran for years.

I'm even a huge fan of running oVirt with CephFS storage! I _REALLY_ wish the combo would be treated better, but most of my frustrations are many years old at this point, and we've figured out workarounds in the meantime. It's too much for me to want to mess with at home, but as long as you plan out your Ceph install and you are prepared to be the odd-ball using CephFS + oVirt, including the workarounds, it's a great setup. I absolutely believe that we've gotten a HUGE return on our investment in Ceph... but I'm also using it for high-speed data computations in a big cluster; the oVirt + CephFS deployment is an add-on to the HPC + CephFS one. The ROI on oVirt is also huge, because we were never satisfied with other virtualization solutions, and while OpenStack worked for us it was FAR more overhead than we needed or could support with a team as small as ours. So I'm a big believer that our specific use case for both is a massive ROI win.

Should you decide to move forward with CephFS + oVirt and you have questions, feel free to reach out to me. No promises that your problems will be the same as mine, but I can at least share some experiences/config settings with you. Good luck! ~Stack~
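As a concrete illustration of the "configure CephFS like you would NFS" workaround described above, here is a minimal sketch using the oVirt Python SDK (ovirtsdk4) to attach CephFS as a POSIX-compliant FS data domain through a single monitor/MDS front end. The engine URL, credentials, host name, monitor address ("cephmds01"), path and mount options are all placeholders/assumptions rather than settings from this thread; the point is only the shape of the call, with a single monitor in the path as discussed above.

```python
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

# Placeholder engine credentials -- adjust for your environment.
connection = sdk.Connection(
    url="https://engine.example.com/ovirt-engine/api",
    username="admin@internal",
    password="secret",
    ca_file="ca.pem",
)

sds_service = connection.system_service().storage_domains_service()

# CephFS presented as a POSIX-compliant FS domain, mounted from a single
# monitor/MDS front end (the "just mount cephmds01" workaround above).
sds_service.add(
    types.StorageDomain(
        name="cephfs_data",
        type=types.StorageDomainType.DATA,
        host=types.Host(name="ovirt-host1"),  # any active host in the data center
        storage=types.HostStorage(
            type=types.StorageType.POSIXFS,
            # Equivalent to the "Path" field of a POSIX-compliant FS domain in the UI.
            path="cephmds01:/",
            vfs_type="ceph",
            mount_options="name=admin,secretfile=/etc/ceph/admin.secret",
        ),
    ),
)

connection.close()
```

The same domain can of course be created by hand in the Administration Portal by choosing the "POSIX compliant FS" storage type, if you prefer clicking to scripting.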

Thank you very much for your story! It has very much confirmed a few suspicions that have been gathering over the last... oh my God, has it been two years already?

1. Don't expect plug-and-play unless you're on SAN or NFS (even HCI doesn't seem to be close to the heart of the oVirt team).
2. Don't expect RHV to be anything but more expensive than oVirt.

I am running oVirt at home on silent/passive Atoms, because that represents an edge use case for me, where the oVirt HCI variant potentially has its best value proposition. Unfortunately, that is insignificant in terms of revenue... I am also running it in a corporate R&D lab on old recycled servers, where there is no other storage either, and I simply don't want to invest in a new Ceph skill when Gluster-based HCI should do it out of the box.

IMHO Red Hat can't afford to continue treating Gluster the way they seem to: without HCI, oVirt is dead for me, and Gluster on its own is the superior concept to quite a few alternatives. If anything, I'd want Gluster on hardware like Fungible DPUs for mind-boggling HPC throughput. As far as I understand, they have just cancelled a major Gluster refactoring, but if that is what it takes, they may just have to start a little smaller and do it anyway. And of course I want Gluster to switch between single node, replication and dispersion seamlessly and on the fly, as well as much better diagnostic tools.
participants (7)
- Jeremey Wise
- Matthew.Stier@fujitsu.com
- penguin pages
- Stack Korora
- Strahil Nikolov
- thomas@hoberg.net
- WK