Making the case for fewer assertions in tests

dpc_pw · on Sept 1, 2020

A whole domain for a rather minor point like this, which is not even all that correct in the first place?

All the arguments in this article promise very minor benefits.

> When your test breaks, by fail or error, further assertions are never executed, and test coverage is reduced.

Oh wow. I wish I had your problems. :D

Sure, a failing assertion can hide a bigger problem being there. But most of the time if one assertion fails, rest of the code is useless anyway, and will just generate noise. You could design assertions in a very sophisticated way to minimize that problem, and optimize amount of information... but that's complex and brittle and time consuming.

The whole idea seems like marginal return optimization that takes way too much effort to be worth it. If you have a decent test coverage with well written tests ... you're golden and you probably have more important things to do than trying to tweak your tests to optimize your assertions just in case something sometimes fails during development work.

AlexanderNull · on Sept 2, 2020

> most of the time if one assertion fails, rest of the code is useless anyway

You either don't write tests or you're already writing them in the right way (sounds like the later). I've seen my fair share of what I would consider compound tests that have multiple asserts in tests that would crash execution of that test even though 3 lines down in that same test is a completely different bit of state being tested. This is hopefully less of an issue in unit tests but my gosh I've seen it way too much in integration tests.

It can get worse still when one of these initial assertions starts failing, a lazy dev goes in to address the problem, finds that one assertion isn't an issue worth addressing for now, labels the whole test as a KnownIssue and moves on leaving us at risk for the other issues covered in the later asserts to break without warning at a later point in time! (only seen this twice luckily)

r0s · on Sept 1, 2020

> You could design assertions in a very sophisticated way to minimize that problem, and optimize amount of information... but that's complex and brittle and time consuming.

My point here is this optimization is really pretty simple. One target state per test, and soft assertions for the rest. It's actually a simplification rather than adding complexity.

Soft asserts make your tests more robust by definition, lots of halting assertions break more often, they are brittle by design.

> If you have a decent test coverage with well written tests ... you're golden

Hey if your tests catch all bugs, it ain't broke, don't fix it!

Totally understandable if you aren't looking for optimizations like this. For the massive test suites I work on, it's necessary to at least consider these concepts.

contravariant · on Sept 1, 2020

>You could design assertions in a very sophisticated way to minimize that problem, [...]

This sounds like the exact opposite way you need to design assertions. If code can continue as normal when an assertion fails then why put the assertion in there anyway?

danpalmer · on Sept 1, 2020

Maybe I'm misunderstanding this, but assuming you require all tests to pass there should be no more code coverage by using "soft assertions", because no assertions have thrown in your tests. The code re-use benefits don't seem that substantial.

I can see this being useful in the development cycle to see more failures that are happening, but there's an overhead to structuring a test that can fail at multiple points – early exits are easy to write.

Maybe I'm missing the point, but I'm not sure I agree that this is something we should be striving for. It doesn't seem to solve problems I have, and seems to introduce additional complexity.

trumpeta · on Sept 1, 2020

I’d say when you run a big suite in CI and it takes an hour and your first assertion of many in a test fails, you don’t know if the others also fail or not. You fix it only to wait another hour for the tests to fail again.

rightbyte · on Sept 1, 2020

I mean you have to run the tests again anyway. If the test is in a invalid state then you can't trust soft asserts that pass. It might be meaningfull but fast aborts is easier to reason about.

Even CUnit has specific test invocations via eg. an interactive ncurses session, so any fancy pancy Jenkins monster surely can retest failing tests seperately?

r0s · on Sept 1, 2020

> If the test is in a invalid state then you can't trust soft asserts that pass.

Agreed, which is why I suggest designing each test for a single target state.

That target may then legitimately benefit from multiple assertions, and you have the same problem.

rightbyte · on Sept 1, 2020

Ye at-least for projects with many developers involved. My experience is that tests that tests too much makes QA collapse when the original authors of the tests are gone and no-one know what tests are actually testing implementation or spec.

I use #ifdef:s to enable "intermediate state" and implementation asserts usually to mark them as development tests.

danpalmer · on Sept 1, 2020

I can imagine a test _suite_ taking hours, but each test should be a small fraction of that surely? And each assertion fails just the test that it's in as tests need to be isolated to be effective?

r0s · on Sept 1, 2020

If you architect each test to have a single assertion, yes, I agree it's essentially the same effect.

ufmace · on Sept 1, 2020

Thought about it, and I don't agree that this is a good idea, outside of a few special cases.

Test coverage from failing tests - I can't see being concerned about the coverage of a suite that isn't completely passing. If you have failing tests, you have bigger problems than your test coverage figures.

As for the test continuing, if an early assertion failed, then your state is unknown and can't be trusted. To do this, you'd need to spend a lot of time for every non-terminating assertion reasoning about all of the possible states your system could have that would have caused that assertion to fail, and come up with reasonable things to do in the rest of the test for all of them. I can't see much value in this effort specifically. Better to let it blow up and try to fix whatever caused the first bad assertion to fail.

I think it might be useful to strive for one assertion per test though. Then, you can get a better idea of what's failing and where by how many tests fail. Sounds a lot like the benefit the author cited. Key difference though - you aren't reasoning about the state of your application after a test failure. The other assertions set up their own state from scratch after the other test's state setup is torn down, so it's either as they expect, or they can blow up too.

tracnar · on Sept 1, 2020

We have a test environment with this taken to the extreme at work, i.e. integration tests where the whole codebase softly fails and continues whatever happens. IMO the drawbacks are worse than the upside of seeing more errors. Now one errors ends up giving you a lot of failing soft assertions after the first failure which are not useful at all. There is certainly benefit to being able to go through a failure, but it seems better handled at the framework level rather than writing the tests with this in mind. There's already enough unknown in the tests to add possible earlier failing assertions.

r0s · on Sept 1, 2020

Interesting, it sounds like you would benefit from breaking each test out into one that focuses on a single target state.

Another layer would be conditional test execution hierarchy where one early failure would skip a set of tests that depend on that state.

tracnar · on Sept 1, 2020

I agree, but the tests are in the order of minutes (sometimes hours), not your typical unit tests. That's basically why it uses soft assertions, it's costly to run. Ideally you'd still want tests with a single assertion/target state, but it's hard to write it that way.

We also do have this conditional test hierarchy, but again it's hard to properly define when the system under test behaves unexpectedly...

r0s · on Sept 1, 2020

> it's hard to properly define when the system under test behaves unexpectedly...

I feel your pain. I think at that point it's good to look deeper and ask if exceptions are being properly thrown, as I try to call for in the final section on Exceptions.

You can't plan for everything.

OnlyOneCannolo · on Sept 1, 2020

Seems like the obvious solution is to use hard assertions unless you have a good reason not to.

tracnar · on Sept 1, 2020

Indeed that's what I try to advocate. Of course it's not always easy to know if something should be a hard assertion or not until you hit a failure, which is where I think some framework support to run it in a 'soft-assertion-by-default' way would be handy.

chrchang523 · on Sept 1, 2020

This has been encouraged by test frameworks for more than a decade; see e.g. https://testing.googleblog.com/2008/07/tott-expect-vs-assert... .

marcosdumay · on Sept 1, 2020

I'm not sure how that would work. After an expect fails, why would I trust the results of any other that is passing of failing?

It is true that the tests should be independent, so that one of them failing wouldn't imply on the other being useless. But that requires much more than simply continuing after an error.

chrchang523 · on Sept 1, 2020

That's why both assert and expect are in that framework. You still use assert when that particular test failure may prevent other tests from working as designed, but you try to use expect for all independent "leaf" tests.

r0s · on Sept 1, 2020

Yes! I'm stoking the flames for these ideas because they were quite beneficial in my experience.

In particular, the code reuse aspect, which I haven't seen any advocacy for previously.

r0s · on Sept 1, 2020

Hey Hacker News, I'm Ross Radford from Austin Texas.

I've been a Senior Engineer in Test for long enough to care about this subject, and I think we'd all be better off using soft assertions, and less assertion in tests generally.

twic · on Sept 1, 2020

The JUnit reference should probably be to ErrorCollector rather than Verifier, since that's what you'd actually use.

ErrorCollector isn't in JUnit 5, and i believe the maintainers think that Assertions::assertAll is a sufficient replacement, which it isn't. I wrote an ErrorCollector for JUnit 5:

https://gist.github.com/tomwhoiscontrary/4fbf20350700d7e1c35...

But i'm sure someone could do a better job of it.

r0s · on Sept 1, 2020

Thanks, I'll check it out.

rgoulter · on Sept 1, 2020

What's the value in code coverage from failing tests?

OnlyOneCannolo · on Sept 1, 2020

The first failure might not be the best indicator of why it failed.

Or you may have much bigger problems but you won't know until you've fixed the first problem, which might not even matter anymore once you fix the big problem.

r0s · on Sept 1, 2020

If one assertion fails early in a test, you lose any subsequent coverage from further assertions.

Maybe you can build your tests such that there's only one assert for each, and if that meets your needs, fine.

If not, or if you're in the process of refactoring your tests into single assertions, this way we can reclaim that coverage.

rgoulter · on Sept 1, 2020

I can appreciate that an automated test that runs to completion will cover more code that a unit test which aborts early. And that using 'soft' asserts lets the test run further than a hard assert.

My question is more basic than that, though.

Perhaps to clarify my understanding: code coverage indicates what code was 'covered' by executed tests. This can be useful to show: what parts work as expected in at-least one case, and what parts of code haven't been covered by tests at all. -- But with code coverage from failing tests, you can't get either of "covered code will work as expected in at-least one case" or "this code isn't covered by tests" (since the code that fails the tests is shown as 'covered'). -- What value do you get from code coverage from failed tests?

r0s · on Sept 1, 2020

Short answer: Not all failed test coverage is invalid.

Granted, unit tests benefit less from soft assertions, but I'll take a shot at an example anyway:

Consider a complex object returned from a unit test result (this would be a single target state). That structure could have a missing field that should be reported, but the rest of the fields are evaluated fine, and so that information would be lost if the first assert halted the test.

For functional tests, the value is usually easier to discern, so for another contrived example:

A login form has a failed assert on it's UI structure, say a button is the wrong color, but the login is successful, and the following assertion post-login is also successful.

Ideally, you would assert post-login tests in separate target state, perhaps mocking the login as a pre-condition to another test. For functional tests that could dramatically increase the testing time and complexity, and so this makes a non-halting assertion valuable.

rgoulter · on Sept 1, 2020

Ah, I misread "test coverage" as "code coverage". Whoops.

> Consider a complex object returned from a unit test result (this would be a single target state). That structure could have a missing field that should be reported, but the rest of the fields are evaluated fine, and so that information would be lost if the first assert halted the test.

Right, I'd understand this as 'soft' assertions provide an easier way of making a composite assertion. e.g. I think with RSpec, its matches can be composed; its matchers can nicely return whether a field is missing, or a custom matcher can be written. -- Either way, the point is that a test failure results in the signal that's wholly useful.

> [...say, for cases where an early part of the test fails in an unimportant way...]

Hmm. I think I'd take this suggestion as: test for 'insignificant' things as low as possible; it's expensive to halt a slow functional test over something that doesn't matter. -- Once something that matters fails, that's the information which is useful for the test to signal.

And maybe it can be useful to warn about insignificant things that do get picked up during the functional test.

r0s · on Sept 1, 2020

I appreciate the conversation! That's interesting about RSpec, I'll read more about it, thanks.