Re: Who's misleading?
Seems like you're trying to have it both ways on performance, Jim. On the one hand, you claim that Ceph must be faster because it uses more spindles, but then you say it's unfair when a test makes it . . . use more spindles. Wait, what? Let's talk a bit about striping and performance testing to see how absurd that is.
Striping is a tradeoff. On one hand, it can improve single-file performance through greater parallelism. On the other, it can make performance worse: each request to the drives is smaller, and there's overhead from splitting and recombining requests, managing more active connections, and so on. We've supported striping just about forever because of the cases where it helps, but those cases turn out to be few indeed, which is why we don't turn it on by default. That might actually hurt us in a few single-stream tests, but single-stream is the wrong way to measure the performance of a distributed filesystem anyway. Is there something bad or unfair about having defaults that reflect knowledge learned from running more realistic tests? Whose fault is it that Ceph's out-of-the-box configuration makes a tradeoff that's bad in a realistic test? Not the testers'. Ceph's.
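To make the tradeoff concrete, here's a toy sketch (not real GlusterFS or Ceph code; all numbers made up for illustration) of what striping does to a single large I/O: the work spreads across more drives, but each individual request shrinks and the request count multiplies.

```python
# Toy model of the striping tradeoff: one large I/O either goes to a
# single drive as one big request, or gets chopped into stripe-unit-sized
# pieces spread round-robin across several drives.

def stripe_requests(io_size, stripe_unit, stripe_count):
    """Split one I/O of io_size bytes into per-drive requests.

    Returns (per_drive_request_counts, request_size). Assumes io_size is
    a multiple of stripe_unit, purely to keep the illustration simple.
    """
    chunks = io_size // stripe_unit
    per_drive = {}
    for i in range(chunks):
        drive = i % stripe_count
        per_drive[drive] = per_drive.get(drive, 0) + 1
    return per_drive, stripe_unit

# One 4 MiB read, unstriped: a single large request to one drive.
unstriped, req_size = stripe_requests(4 << 20, 4 << 20, 1)
assert unstriped == {0: 1} and req_size == 4 << 20

# The same read striped across 4 drives in 128 KiB units: 32 small
# requests instead of 1 large one. More parallelism, but each request is
# 1/32 the size, plus the split/recombine and connection overhead.
striped, req_size = stripe_requests(4 << 20, 128 << 10, 4)
assert sum(striped.values()) == 32 and req_size == 128 << 10
```

Whether the extra parallelism pays for the smaller requests depends entirely on the workload, which is exactly why it's a tunable and not a default.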
By the way, it should also be clear that the "narrower set of drives" claim is bogus. Between the fact that a real test or a real deployment involves concurrent I/O across many replica/stripe sets, and the fact that we *can* do striping as well, GlusterFS is just as capable of using every single drive in a cluster simultaneously as Ceph is. It's irresponsible to make assumptions about which "should" perform better without considering the workload, or without directly addressing (ideally measuring) the effects of greater parallelism vs. smaller requests and so on. I stand by my assertion that even flawed data is better than no data. There is *no evidence* that Ceph is or ever will be faster. There's only speculation, which is not only unsupported by empirical data but doesn't even stand up to a theoretical analysis.
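The "concurrent I/O across many replica/stripe sets" point is easy to demonstrate. Here's a hedged sketch (using a generic hash as a stand-in for a real distribution function like GlusterFS's DHT; brick count and file names are invented for illustration) showing that with enough concurrent streams, a hash-distributed layout touches every drive even without striping:

```python
# Illustration: many concurrent streams, hash-distributed across bricks,
# end up exercising every brick -- no striping required. The md5-based
# placement below is a stand-in, not GlusterFS's actual DHT algorithm.

import hashlib

def brick_for(path, brick_count):
    # Deterministic placement: hash the path, map it onto a brick.
    digest = hashlib.md5(path.encode()).hexdigest()
    return int(digest, 16) % brick_count

BRICKS = 12
# 500 concurrent client streams, each working on its own file.
active = {brick_for(f"/data/file{i}", BRICKS) for i in range(500)}

# Every one of the 12 bricks sees I/O from this workload.
assert active == set(range(BRICKS))
```

A single-stream test hides exactly this effect, which is why it's the wrong yardstick for a distributed filesystem.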
Awarding Ceph two points on performance based on *nothing at all* is egregious enough, but giving them a point for management is even worse. First, good examples and tutorials on a website are no substitute for a real, authoritatively documented single-point-of-control CLI. Second, neither of those things can make the software magically capable of online upgrades. You either have it or you don't. Third, the examples and tutorials *aren't actually that good*. They contradict each other all over the place, even (last time I had to set up Ceph for testing myself) on something as basic as whether to use mkcephfs or ceph-deploy. Yes, use the new hotness, says one document. That's not quite ready, says another. Basic options, such as the one to use an existing filesystem instead of ruining a well-tuned system by building new ones, are barely documented at all and only in the most obscure places. If I hadn't known about them from having set up Ceph multiple times over the years, I wouldn't even have found anything suggesting I should look for them. I'm certainly not going to say the GlusterFS management layer is perfect, but it set the bar that Ceph has been trying to reach, and they're just not there yet.
You can keep awarding points to Ceph all you like, Jim. It doesn't mean anything. As I've said many times, Ceph is a fine project. I have the utmost respect for everyone involved. Nonetheless, anyone who looks at the facts can see that the list of areas where GlusterFS has managed to pull ahead is much longer than the list of areas where the converse is true. Go ahead and ask your Ceph friends for some real performance numbers. Ask them about their roadmap for filesystem-independent snapshots, or geo-replication, or encryption, or for that matter ask whether the filesystem metadata part is ready for production yet. If you look at what they can each do in the real world today, not what they can do theoretically or in the future, I don't think the result would look at all like the picture you've painted.