(storage) ceph is amazing

If you haven’t tried out ceph yet and are not completely satisfied with your on-prem storage system, I recommend giving it a try. (Heads up: it does want a lot of CPU.)

** I acknowledge that I am currently excited and completely captivated by ceph.  I’m still fairly new to using ceph, so you might want to check my facts, a.k.a. I’m tempting you to start investigating. 😉

Ceph is able to use multiple disks on multiple servers and spread out the load, maintaining two or three copies of data to avoid data loss.  Just look at my humble 4 disk system with the data spread out nearly perfectly across the disks.  (2 nodes with 2 disks each)
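To get a feel for why the data ends up so evenly spread, here is a toy sketch in Python. It is not Ceph's actual placement algorithm (Ceph uses CRUSH); it only illustrates the general idea of hashing each object to pick one disk on each of two different nodes. The node and disk names are made up to mirror my 2 nodes with 2 disks each.

```python
import hashlib
from collections import Counter

# Hypothetical layout mirroring the post: 2 nodes with 2 disks each.
DISKS = {
    "node-a": ["node-a/disk0", "node-a/disk1"],
    "node-b": ["node-b/disk0", "node-b/disk1"],
}


def _score(object_name, target):
    """Deterministic pseudo-random score for an (object, target) pair."""
    digest = hashlib.sha256(f"{object_name}|{target}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name, replicas=2):
    """Pick one disk on each of `replicas` distinct nodes.

    Toy hash-based placement (highest score wins), NOT Ceph's CRUSH.
    """
    nodes = sorted(DISKS, key=lambda n: _score(object_name, n), reverse=True)
    return [
        max(DISKS[node], key=lambda d: _score(object_name, d))
        for node in nodes[:replicas]
    ]


def usage(num_objects=10_000, replicas=2):
    """Count how many object copies land on each disk."""
    counts = Counter()
    for i in range(num_objects):
        counts.update(place(f"obj-{i}", replicas))
    return counts


if __name__ == "__main__":
    for disk, n in sorted(usage().items()):
        print(f"{disk}: {n} object copies")
```

Because every copy of an object lands on a different node, losing a single disk or a whole node still leaves one copy available, which is the point of keeping two or three replicas.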

Ceph provides cephfs (the ceph file system) and cephrbd (ceph block storage).

Block storage gives you something like a disk. It’s useful for, say, creating a vm that you later want to hot-swap between servers.

Otherwise, the ceph file system is what you want to use, though not directly.

You’ll end up creating a ceph filesystem (cephfs), which is backed by its own pools, then create pvcs that are provisioned from it. You can also use iscsi (which uses cephrbd) or nfs (which uses cephfs) if you have a consumer that is not able to connect to cephfs or cephrbd directly.

In my experience working with storage systems and kubernetes persistent volumes, I was not having luck with RWX (ReadWriteMany), even when the providers claimed it worked (nfs from my own linux server, and longhorn, which uses nfs for rwx). Two apps, ‘plex’ and ‘nextcloud’, would consistently experience database corruption after only a few minutes.

People continually told me “NFS supports RWX” and “Just use iSCSI, that’s perfect for apps that use SQLite”. I tested these claims and did not get the same result using a TrueNAS Core server.

However, with ceph you can allocate a pvc using cephfs, and this works perfectly with RWX! Awesome! And super fast! All my jenkins builds, which run in kubernetes, sped up by 15 seconds vs TrueNAS iSCSI. Of course, this could just mean the physical disks behind ceph are faster than the physical disks in my truenas server; I can’t be sure.
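For anyone wondering what such a claim looks like, here is a minimal sketch that builds an RWX PersistentVolumeClaim manifest in Python and prints it as JSON (which kubectl accepts as well as YAML). The storage class name `cephfs` and the claim name are assumptions for illustration; use whatever storage class your own ceph install created.

```python
import json


def cephfs_rwx_claim(name, size="10Gi", storage_class="cephfs"):
    """Build a PersistentVolumeClaim manifest requesting ReadWriteMany.

    `storage_class` is an assumed name; check `kubectl get storageclass`
    for the cephfs-backed class in your cluster.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteMany lets pods on different nodes mount the volume at once.
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # "shared-data" is a hypothetical claim name.
    print(json.dumps(cephfs_rwx_claim("shared-data"), indent=2))
```

Saved as a script, the output can be piped straight into `kubectl apply -f -`.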

Now I suspect that if I created pvcs using NFS and iSCSI, which work on top of ceph, they might also support RWX. I’m very curious, but given how fast cephfs is and that everything is working perfectly, I can’t see any reason to use NFS or iSCSI (except for vm disks).

Using the helm install of ceph, you end up with a tools pod that you can exec into to run the ‘ceph’ cli. The dashboard gui is pretty great, but the real interaction with ceph happens at the command line. I’ve been in there breaking and fixing things, and the experience feels like a fully fledged product ready for production. There is so much there that I can see someone managing ceph as a career, with a deep dive available for as far as you are interested in going. If you break it enough and then fix it, you get to watch it move data around and recover things, which is as cool as can be to watch, from the most geeky perspective.
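As a rough sketch of what poking around from the tools pod looks like, the Python below wraps a few read-only `ceph` commands in `kubectl exec`. The namespace `rook-ceph` and the deployment name `rook-ceph-tools` are assumptions (they are the usual defaults if your helm install is Rook-based); adjust them to match your cluster.

```python
import shutil
import subprocess

# Assumed names; change these to match your own install.
NAMESPACE = "rook-ceph"
TOOLS_TARGET = "deploy/rook-ceph-tools"


def ceph_cmd(*args):
    """Build the kubectl argv that runs `ceph <args>` inside the tools pod."""
    return ["kubectl", "-n", NAMESPACE, "exec", TOOLS_TARGET, "--", "ceph", *args]


if __name__ == "__main__":
    # Read-only commands: overall health, per-disk usage, pool usage.
    for sub in (["status"], ["osd", "df"], ["df"]):
        cmd = ceph_cmd(*sub)
        print("$", " ".join(cmd))
        # Only try to run the command when kubectl is actually installed.
        if shutil.which("kubectl"):
            subprocess.run(cmd, check=False)
```

`ceph status` is the one to keep an eye on while the cluster is recovering, since it reports health and recovery progress.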

In any case, given how long cephfs has been around and how popular it is in production environments, I think I’ve found my storage solution for the foreseeable future.


Posted in Infrastructure, Kubernetes.
