Danger: truncation shouldn't be done by just chopping off the end of a hash; they should be different hash functions with entirely different images.
Compare, for instance, multihash's treatment of SHA512 truncated to 256-bits, and the standardised hash function SHA512/256. The paper which introduced SHA512/256 even gives a generic way to safely truncate SHA512: https://eprint.iacr.org/2010/548.pdf section 5.
Multihash's design begs implementations to validate these things by allowing the sender to arbitrarily truncate them. That's bad.
Yeah this is a good point, which we discussed somewhere (not finding the issue atm). The resolution was to treat those as different hash function codes, because the normal usage of truncating hashes by chopping off bits is extremely common. We had direct use cases from past experience trying to help old systems that did things like take a sha2-512, trunc to 256bits and use instead of sha2-256 (for the speedup on some archs). So we saw the need for _literally_ a different size of the exact same function.
When the functions encourage it, we support the addition of the specific different constructions (if named): https://github.com/multiformats/multihash/blob/master/hashta... -- we did a silly thing with Blake2 where we imported all the valid numbers. (this is suboptimal in table space, but super explicit).
Are there other functions you think we should add for the sha2-256/512 set?
> Multihash's design begs implementations to validate these things by allowing the sender to arbitrarily truncate them
If the sender is manipulating the hash you get (i.e. changing the length prefix counts), you're already in huge trouble. They could change the code and the value too. The threat model here is that the hash you have cannot be altered by the attacker. If the attacker manages to truncate a stream to get you to think it's a shorter hash, the attack fails as you have the length to tell you what you should be expecting. (again, the crux here is that if the attacker can change the length bits they can probably also change the function and own you anyway)
Also-- as noted in https://github.com/multiformats/multihash/issues/70 -- we should make implementations allow clients to lock the hash function and length combinations they want to use, so that attackers cannot manipulate those parameters.
You could be a little bit opinionated and limit the choice of formats to a small set of ones that are a good idea. That would encourage good design and it would allow you to shrink the metadata. Perhaps one byte would suffice. It's not necessary to support every algorithm/length under the sun.
Compare, for instance, multihash's treatment of SHA512 truncated to 256-bits, and the standardised hash function SHA512/256. The paper which introduced SHA512/256 even gives a generic way to safely truncate SHA512: https://eprint.iacr.org/2010/548.pdf section 5.
Multihash's design begs implementations to validate these things by allowing the sender to arbitrarily truncate them. That's bad.