More

ajb · 2026-04-05T18:38:15 1775414295

Depends on your jurisdiction, of course. (I am not a lawyer and this is not legal advice, merely my impressions). In the UK this would likely be worth it if the injury is a specified financial amount. So for people who have paid for something and simply not got it, a small claims court is a good bet for getting a refund. A lot of the time however, the injury is in the consequences of relying on one of these companies services, and having it withdrawn without notice, as in the OP. Usually, you want service restored, as that is in fact the least costly action for both sides. But small claims courts (in the UK) do not make that kind of order. In theory you could sue for the financial consequences of the abrupt withdrawal, but I'd guess that's too complicated for a small claim.

ajb · 2026-04-04T15:19:12 1775315952

Check your local law. In some jurisdictions, you can charge interest, or penalties. You can be gentle about it - give fair warning, reminder that intrest is starting to accrue, etc. But customers don't generally want liabilities to increase, so will prefer to pay before extra costs are incurred.

Here is one company's experience (UK focused): https://www.revk.uk/2026/03/late-payments.html

It depends a lot on your relationship with the customer as well, I guess. Some may get butthurt about it, for others, your relationship is with a person in a different dept to the people organising the payments, you can send interest notices to the finance dept without worrying the person who wants your services.

ajb · 2026-04-03T04:11:24 1775189484

Suppose you are a building contractor. You have given start dates for future jobs, but your current job is going to run over the expected time. You can choose between:

1 slip every job, annoying all of the customers whose jobs are queued up. You get a bad reputation.

2 Move onto the next job on time, and gradually complete the stalled job in the background by sending workers back to it when you have spare (which you should have, because in general you must overestimate or things will go badly wrong). That customer will now suffer because their job is going to take a multiple of the expected time, but all of the other customers are happy, so your reputation is good.

mhawthorne · 2026-04-03T14:16:51 1775225811

Author here. It's a great analogy.

I had a section in the post I cut out about how optimizing queue selection started out as a technical problem, but transformed into more of a business and ethical problem the more I pondered it.

You're effectively deciding how to distribute suffering across a large group of people.

Comes up in any situation where large metric gains can be accomplished by optimizing for specific groups - recommender and personalization systems are another example.

tatersolid · 2026-04-04T10:00:17 1775296817

I’ve observed airlines will do this as well if they have maintenance or gate queues. They will sacrifice 1-2 flights (hours late or even cancelled) to keep many other flights near on-time. Fewer angry customers, better reported average “on-time” metrics.

ajb · 2026-04-02T00:41:21 1775090481

Not the OP, but Fyi you know that to some extent anyway, because the termination condition is that confidence is above a specified value. This is one of the advantages over just doing git bisect with some finger-in-air test repeat factor. But yeah it can print that too.

ajb · 2026-04-01T22:05:45 1775081145

It's worth noting that the analysis (although not this specific algorithm) applies in cases where there is a deterministic approach, but a nondeterministic algorithm is faster.

For example, suppose you have some piece of hardware which you can interrogate, but not after it crashes. It crashes at a deterministic point. You can step it forward by any amount of steps, but only examine it's state if it did not crash. If it crashed, you have to go back to the start. (I call this situation "Finnegan Search", after the nursery rhyme which prominently features the line "poor old Finnegan had to begin again").

The deterministic algorithm has you do an examination after every step. The nondeterministic algorithm has you choose some number of steps, accepting the risk that you have to go back to the start. The optimal number of steps (and thus the choice of algorithm) depends on the ratio of the cost of examination to the cost of a step. It can be found analytically as the expected information gain per unit time.

(Either way the process is pretty annoying and considerable effort in hardware and software design has gone into providing ways to render it unnecessary, but it still crops up sometimes in embedded systems).

ajb · 2026-04-01T21:41:35 1775079695

At a guess, you can reuse the entropy part, but you'd need to plug in a new probability distribution.

ajb · 2026-04-01T21:30:22 1775079022

In theory, the algorithm could deal with that by choosing the commit at each step, which gives the best expected information gain; divided by expected test time. In most cases it would be more efficient just to cache the compiled output though.

sfink · 2026-04-01T22:11:34 1775081494

This doesn't sound quite right, but I'm not sure why.

Perhaps: a reasonable objective would be to say that for N bits of information, I would like to pick the test schedule that requires the least total elapsed time. If you have two candidate commits and a slow recompile time, it seems like your algorithm would do many repeats of commit A until the gain in information per run drops below the expected gain from B divided by the recompile time, then it would do many repeats of B, then go back to A, etc. So there are long runs, but you're still switching back and forth. You would get the same number of bits by doing the same number of test runs for each commit, but batching all of the A runs before all of the B runs.

Then again: you wouldn't know how many times to run each in advance, and "run A an infinite number of times, then run B an infinite number of times" is clearly not a winning strategy. Even with a fixed N, I don't think you could figure it out without knowing the results of the runs in advance. So perhaps your algorithm is optimal?

It still feels off. You're normalizing everything to bits/sec and choosing the maximum. But comparing an initial test run divided by the rebuild time vs a subsequent test run divided by a much faster time seems like you're pretending a discrete thing is continuous.

I wish I could math good.

ajb · 2026-04-01T22:42:09 1775083329

The general requirement for this approach to be optimal, is called "dynamical consistency". A good description is in [1]. It is the situation where, suppose you have a budget B , and you search until your budget is exhausted. Then you are informed that there is an additional budget, B2, and you can continue searching until that is exhausted. A situation is dynamically consistent if, for any B,B2, the optimal strategy is such that you would make the same choices whether you know that you will get B2 or not.

So you are correct that discreteness is a problem, because if you are nearing the end of the budget you may optimally prefer to get more dice rolls than take bigger bets. But the optimal solution is then often analytically intractable (or at least it was - I last read about this a while back), and the entropy approach is often reasonable anyway. (For cases where search effort is significant, a good search plan can be found by simulation).

[1] https://bayes.wustl.edu/etj/articles/search.pdf

hauntsaninja · 2026-04-02T04:25:37 1775103937

Note that "pick the commit with best expected information gain" in git_bayesect isn't optimal even in the no overhead regime. I provide a counterexample in the writeup, which implies ajb's heuristic is also not optimal. I don't see a tractable way to compute the optimal policy.

One idea: if you always spend time testing equal to your constant overhead, I think you're guaranteed to be not more than 2x off optimal.

(and agreed with ajb on "just use ccache" in practice!)

ajb · 2026-04-01T21:22:35 1775078555

Nice! I implemented a similar thing a while back: https://github.com/Ealdwulf/BBChop

I'm going to have to check out how you got linear time with Shannon entropy, because I used Renyi entropy to do that, to make the algebra easier.

It's also possible to do it over the DAG, rather than a linear history - although that makes the code a lot more complicated. Unfortunately there doesn't seem to be a linear time cumulative sum algorithm over dags, so it's super linear in some cases.

ajb · 2026-04-01T13:19:43 1775049583

The UK regs update[1] mentions a battery, but you have to pay for it so I don't have the details.

It appears your could legally install one of these panels on the 15th of this month, but there presumably won't be any certified to comply with the regs on sale yet.

[1] https://electrical.theiet.org/amendment-4-updates-to-18th-ed...

ajb · 2026-03-29T23:06:29 1774825589

Not the OP, but probably this: https://tonyg.github.io/revctrl.org/GenerationCounting.html

(That seems to be an archive of the old revctrl.org pages from a while back; most likely Bram Cohen has a blog somewhere explaining it in his own words - probably about 2003, at a guess)