Thanks, calculating subtrees only once makes sense. But can't you go further (as in TeX and in the comment you're replying to) — that is, if there are two "paths" of breaking that eventually end with the same break at the same indentations (plural because of the matching-siblings issue), then you should have to recalculate the rest only once?
Another question: I must confess not understanding everything clearly, but it sounds like you pop off the lowest-cost breaks until finding a solution that fits the column limit. Couldn't there be cases where avoiding the "best" split in the first step could ultimately result in a lower overall cost? Optimizing over such solutions is the main thing that makes TeX's paragraph-breaking better than the usual greedy algorithms…
> that is, if there are two "paths" of breaking that eventually end with the same break at the same indentations (plural because of the matching-siblings issue), then you should have to recalculate the rest only once?
In theory, yes. But there are other constraints across a statement that the formatter tries to preserve, so the memoization needs to take those into account too. Basically, the key used in the memo table is more complex than just "where in the code" and "indentation level". It has to roll in some extra contextual information about other line breaking choices that were made if those choices force this subtree to be formatted in certain ways.
> Couldn't there be cases where avoiding the "best" split in the first step could ultimately result in a lower overall cost?
Yes, there exactly are, which is why it does indeed find the overall lowest cost solution.
Another question: I must confess not understanding everything clearly, but it sounds like you pop off the lowest-cost breaks until finding a solution that fits the column limit. Couldn't there be cases where avoiding the "best" split in the first step could ultimately result in a lower overall cost? Optimizing over such solutions is the main thing that makes TeX's paragraph-breaking better than the usual greedy algorithms…