I love the whole “2.3x less CPU overhead per transaction” where Postgres scales ...

acjohnson55 · on July 15, 2023

You generally want to keep your CPU fully utilized. It looks like Oriole is doing significantly more transactions and is CPU-bound, due to much lower IO requirements. The good news is that it implies you could get even more performance out of Oriole by vertically scaling to a more powerful CPU, whereas Postgres would not continue to increase in performance this way.

Those idle times on the Postgres server could be used for something else, if you're thinking in a desktop OS mindset. But for servers, you tend to want machines that are doing one thing and are optimized for that thing.

tanelpoder · on July 15, 2023

> You generally want to keep your CPU fully utilized.

Not in real life concurrent systems where latency matters. In addition to the queuing/random request arrival rate reasons, all kinds of funky latency hiccups start happening both at the DB and OS level when you run your CPU average utilization near 100%. Spinlocks, priority inversion, etc. Some bugs show up that don’t manifest when running with lower CPU utilization etc.

newaccount74 · on July 16, 2023

This is a benchmark that tries to execute as many queries as possible, so the interesting stat is transactions per second, not CPU usage. This benchmark is testing top speed, not real world behaviour.

If you tested both systems with the same workload (eg. a specific number of queries per second), then the average CPU usage would be much lower for the more efficient engine.

The low CPU usage in this benchmark is just a sign that the performance is not CPU bound, but limited by other factors like locking or IO.

mlyle · on July 16, 2023

Here the system is doing a pretty consistent 750k TPS instead of oscillating between 0 and 225k-- often sitting near 0TPS for tens of seconds. Which system do you think will have better latency for any given loading?

deadbeeves · on July 16, 2023

So what you're saying is that an acceptable way to compensate the system's bugginess is by making it more inefficient? I'd rather use a system that's stable under load.

ignoramous · on July 16, 2023

OP has a point [0], though I'm unsure if Kernels under load are really that unstable / untested.

[0] https://brooker.co.za/blog/2021/05/24/metastable.html / https://archive.is/6Qtet

arghwhat · on July 16, 2023

A service should use 100% CPU when loaded fully, anything else means you are suffering from bottlenecks that are actively limiting your throughput. Having lower CPU load because your disks are barely hanging in there is certainly not better.

If you want lower max CPU load, just limit its resources (e.g., CPU quota, cpuset limitation) or load it less.

thayne · on July 16, 2023

> You generally want to keep your CPU fully utilized

Only if your load is very predictable. If there is a chance of a spike, you often want enough headroom to handle it. Even if you have some kind of automated scaling, that can take time, and you probably want a buffer until your new capacity is available.

mlyle · on July 16, 2023

I think many here is misunderstanding what was likely meant: postgresql was not able to use all the available CPU under this situation, in that it was oscillating from 10% to 70% CPU use. That 40% average cpu use isn't an asset on a dedicated database server: it just means that the other 60% of available cycles are a perishable resource that are immediately spoiling.

In that sense, you want to be able to have your database be able to use all the resources available: all the IOPS, all the CPU cycles, etc.

And, of course, the real thing is the amount of work you get done: this thing does more work-- partially by using more CPU cycles, and partially by doing more work per CPU cycle.

raggi · on July 15, 2023

It’s hard to generalize on these points. In a situation where the throughput was inverted but the proportional system usage was the same, you would instead say “you can still vertically scale by adding more disks”, rather than saying adding bigger cpu. It’s not meaningful in isolation.

It may be reasonable to suggest that for a new code base that is cpu bound there’s a good chance there is low hanging fruit for cpu optimizations that may further increase the throughput gap. It’s also the case however that the prior engines tuning starting life on much older computer architectures, drastically different proportional syscall costs and so on, it very often means that there’s low hanging fruit in configuration to improve baseline benchmarks such as these. High io time suggests poor caching which in many scenarios you’d consider a suboptimal deployed configuration.

It’s not just the devil that’s in the details, it’s everything.

raggi · on July 15, 2023

To be a little more clear on what the detail of the benchmark in question is: it’s a benchmark that explicitly exercises a pathological use case for postgresqls current design, one that nonetheless functions, and demonstrates that the advertised engine does not have that pathology. A key takeaway should probably be, if you’re a Postgres user: if your workload looks exactly like this (sparse upserts into a large data set at a high rate) then you might want to evaluate your the runway of your architecture before the geometric costs or latency stalls become relevant - just as for cost analysis of any other system. What is somewhat interesting in this article, and not super clearly presented, is that this workload is actually fairly pathological for most existing engines offering this set of structural and query facilities, and that’s interesting, if this is the niche you need. Most people do some amount of this, but not always at a super high rate, and there are ways to get the same effective writable/readable data using a different schema, while avoiding it. Nice thing here is you can do the one-liner version.

gary_0 · on July 15, 2023

> you can still vertically scale by adding more disks

Parallelizing IO is a lot different from scaling up CPU power, though. I'd imagine DB server IO performance has a lot less lower-hanging fruit than CPU/software performance.

raggi · on July 16, 2023

That depends, the ratio of free bus capacity for data fetch, and free capacity for inter-CPU synchronization is skewed _massively_ in favor of capacity for data fetch. An x86 system is already under-capacity at the cpu/bus interface, which is why we keep throwing more and more cache at the problem and it works.

Similarly in the cloud on AWS fro example, you have publicly available scalability options starting from 5k IOPS up to 2M IOPS, >400x or 3 orders of magnitude. By contrast you're going from 1vcpu to 192 cores, about half the raise, and a lower performance scaling due to the increased cost of cross-package shootdowns.

Yup, they're different, for sure, but the implication that CPU is easier is not all that clear. In either case, with a database style workload, and with either of these engines in practice you're going to hit a limit at the bus in practice long before you hit a limit on compute or disk io, for any sustained workload - bursts are different.

acjohnson55 · on July 15, 2023

That makes sense. I'm mostly just trying to explain the counterintuitive reason that the high CPU usage shouldn't be interpreted as a flaw.

jklehm · on July 15, 2023

My read is that it's at 90% because they are saturating the CPU to that point with the TPS threshold they use for comparison, the TPS of Oriole is constant and way higher than pg in these charts at least.

I'd think the CPU will drop proportionally to the TPS, they just want to show how high it can go here.

aseipp · on July 15, 2023

Yes, but now that your CPU utilization is uncapped, you can more easily scale the utilization down and retain some form of proportional performance, so it doesn't matter. If you capped the system to 60% of your CPU, it might change the overall numbers, but say you're doing 1.8x more TPS at the same usage, it's a win either way. It's not a marketing trick; those numbers come across as "Very good", to me.

If Expensive Server CPU = X dollars per unit, and it's only used at 60% capacity and can realistically only be used at that capacity, then you have effectively just set .4*X amount of dollars on fire, per unit. If you can vertically take a workload and scale it to saturate 90% of a machine, it's generally easy to apply QOS and other isolation techniques to achieve lower saturation and retain some proportional level of performance. The reverse is not true: if you can only hit 60% of your total machine saturation before you need to scale out, then the only way to get to 90% or higher saturation is through a redesign. Which is exactly what has happened here.

pella · on July 15, 2023

With the same equipment, your performance is now five times better. (5X higher TPS) We need to test again with more hardware, but if you can maintain 3 times the performance at the lower end, it could be a good alternative for some users.

"As the cumulative result of the improvements discussed above, OrioleDB provides:

- 5X higher TPS,

- 2.3X less CPU load per transaction,

- 22X less IOPS per transaction,

- No table and index bloat."

avianlyric · on July 16, 2023

It was a performance test, where presumably the objective was to apply the maximum possible load each DB engine could handle, and apply that load continuous for a long period of time.

The CPU load jumping up and down isn’t Postgres “scaling” it Postgres hitting performance bottlenecks on a regular basis, presumably driven by the need to perform vacuums which are very IO insensitive. So instead of using IO to serve queries, Postgres is using IO for janitorial work, and TPS (and thus CPU usage) crater.

Oriole on the other hand manages much higher throughput, and much more consistently than Postgres.

What would you prefer a car that does a constant 100mph when your foot’s down. Or one that wildly oscillates between 40mph and 70mph, despite you trying to put the pedal through the floor?

waterproof · on July 15, 2023

Eyeballing the tps graph, OrioleDB is doing 5x tps while using 2x the CPU. So about 5/2=2.5x the CPU per transaction.

Checks out.

gary_0 · on July 15, 2023

5x tps with 2x CPU is 2/5 = 0.4x the CPU (ie. it's more efficient per transaction).

adsharma · on July 15, 2023

It's not clear if the CPU cost per tx is any worse. Was OrioleDB doing 5x the transactions at this CPU usage?

colanderman · on July 16, 2023

Yes, because they are performing more transactions per second, by virtue of performing less I/O per transaction. This is a good thing.