For symmetric encryption, if you’re recommending Salsa20/ChaCha20, it is absolutely necessary to discuss nonce management, since this is a major footgun people coming from AES may not be familiar with. You should always use the extended nonce variants of these algorithms (XSalsa20/XChaCha20) if possible, with a random nonce for every message. If not, you will have to be certain that nonces are never reused with the same key, possibly through some counter construction. The real solution to symmetric encryption for most people is to use something like Sodium’s `crypto_secretstream`, which smooths out all the rough edges.
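To make the two nonce strategies concrete, here is a minimal sketch using only the Python standard library. It produces nonces only and does no encryption; `CounterNonce` is a hypothetical helper name, and the 24-byte and 12-byte sizes are the XChaCha20 and IETF ChaCha20-Poly1305 nonce lengths.

```python
import secrets

XCHACHA_NONCE_BYTES = 24  # extended nonce: large enough to choose at random
CHACHA_NONCE_BYTES = 12   # short nonce: use a counter instead of randomness


def random_extended_nonce() -> bytes:
    """Return a fresh random 24-byte nonce; call once per message."""
    return secrets.token_bytes(XCHACHA_NONCE_BYTES)


class CounterNonce:
    """Hypothetical helper: strictly increasing 96-bit nonces for ONE key.

    The counter has to survive restarts (persist it, or rotate the key),
    otherwise a reboot silently starts reusing nonces.
    """

    def __init__(self, start: int = 0) -> None:
        self._next = start

    def take(self) -> bytes:
        if self._next >= 1 << (8 * CHACHA_NONCE_BYTES):
            raise OverflowError("nonce space exhausted; rotate the key")
        nonce = self._next.to_bytes(CHACHA_NONCE_BYTES, "little")
        self._next += 1
        return nonce
```

The counter variant is only safe if exactly one party increments it per key, which is part of why the random extended nonce is the easier default.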
“Use ECC” is too generic as advice for asymmetric encryption. Use Elliptic-Curve Diffie-Hellman for key exchange (X25519 ideally), and then use a symmetric AEAD construction (XChaCha20-Poly1305 or AES-GCM) to actually encrypt messages. For people familiar with RSA, in which the asymmetric construction is actually used to encrypt messages, this is unfamiliar, so explanation is necessary.
I would not recommend just “SHA-2” as the first choice for generic hash algorithms anymore, due to length extension attacks. Use BLAKE2b, SHA-3, or one of the well-studied truncated variants of SHA-2.
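As a rough illustration of those alternatives, all three are available in Python's standard `hashlib`. The function names below are made up for this sketch, and SHA-384 stands in for "a truncated variant of SHA-2" (it is SHA-512 with different initial values and a truncated output).

```python
import hashlib


def digest_blake2b(data: bytes) -> bytes:
    """BLAKE2b with a 32-byte output."""
    return hashlib.blake2b(data, digest_size=32).digest()


def digest_sha3(data: bytes) -> bytes:
    """SHA3-256; the sponge construction is not length-extendable."""
    return hashlib.sha3_256(data).digest()


def digest_sha384(data: bytes) -> bytes:
    """SHA-384: the output hides part of the internal state, so an attacker
    cannot resume hashing from a published digest."""
    return hashlib.sha384(data).digest()


def keyed_blake2b(key: bytes, data: bytes) -> bytes:
    """BLAKE2b has a built-in keyed mode (key of at most 64 bytes), so there
    is no need to hand-roll hash(key || message)."""
    return hashlib.blake2b(data, key=key, digest_size=32).digest()
```

Length extension mostly bites when someone builds a MAC as `SHA-256(key || message)`; the keyed BLAKE2b call (or plain HMAC) sidesteps that pattern entirely.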
Also, I think monocypher was written independently of NaCl, it’s not a fork.
This is a fork of tptacek's 2015 Cryptographic Right Answers gist [1]. I think the original 2015 file is somewhat better than this fork. The fork is more up to date, but it offers too many options and is probably too confusing for a beginner. As far as I know, the latest "official" update to Cryptographic Right Answers is the Latacora blog post from 2018 [2].
Both the 2015 version of Right Answers and the OP best practice guide mention (non-extended) ChaCha20-Poly1305, but if you look at their order of priorities, using NaCl/libsodium/monocypher is always mentioned first. That gives you XSalsa20-Poly1305 (NaCl, libsodium default) or XChaCha20-Poly1305 (monocypher, optional for libsodium). The non-extended ChaCha20-Poly1305 is mentioned as lower priority than the extended versions, but higher priority than AES-GCM, which also features short nonces.
The same argument goes for ECC. The actual "Use" line in the document mentions NaCl, libsodium and monocypher, all of which use X25519 by default, although monocypher does not seem to offer an asymmetric encryption primitive. The main issue is the confusing language talking about ECC, when we know that some ECC (yes, I'm looking at you, ECDSA) is not strictly better than RSA [3].
None of the answers in the OP guide seems wrong per se (I didn't review this thoroughly FWIW and I'm not an expert). But I'm still recommending this one, since it's simpler, and "simpler" is the entire point of this kind of guide. You want to avoid programmers shooting themselves in the foot - and shooting yourself in the foot is really easy when you're implementing cryptography.
> monocypher does not seem to offer an asymmetric encryption primitive.
Neither do NaCl and Libsodium. Their `crypto_box()` is a construction that does key exchange, derives a key from the result of that exchange, and finally uses that key to perform symmetric authenticated encryption. I simply omitted that particular construction for Monocypher.
I've often been asked why. My reason is that the NaCl libraries (all 3 of them) are low-level, and a straightforward application of `crypto_box()` lacks the security properties we've now come to expect of modern key exchanges, most notably forward secrecy. To get up to that level would require implementing Noise, and I personally feel that's a tad out of scope. I reckon, however, that higher-level libraries that implement full protocols are sorely needed.
> For symmetric encryption, if you’re recommending Salsa20/ChaCha20, it is absolutely necessary to discuss nonce management, since this is a major footgun people coming from AES may not be familiar with.
The most popular AES mode here is AES-GCM, where nonce reuse is even more catastrophic than it is with ChaPoly. People coming from AES thus are familiar with nonce-reuse problems, unless they’re (i) incompetent or (ii) only ever used a nonce-misuse resistant construction, and those are still pretty niche.
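For anyone who hasn't seen why nonce reuse hurts, here is a toy demonstration. It uses a made-up stream cipher (SHAKE-128 as the keystream generator), not ChaCha20 or AES-GCM, but the confidentiality failure has the same shape: two ciphertexts under the same key and nonce XOR to the XOR of the plaintexts. With the real AEADs, authentication suffers as well, and in GCM's case the damage extends beyond the affected messages.

```python
import hashlib
import secrets


def toy_stream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy cipher for illustration only: XOR data with SHAKE-128(key || nonce)."""
    keystream = hashlib.shake_128(key + nonce).digest(len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


key = secrets.token_bytes(32)
nonce = secrets.token_bytes(12)

p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"

c1 = toy_stream_xor(key, nonce, p1)
c2 = toy_stream_xor(key, nonce, p2)  # deliberate mistake: same nonce reused

# The keystream cancels out: an eavesdropper learns p1 XOR p2 without the key.
leaked = xor(c1, c2)
print(leaked == xor(p1, p2))
```

Knowing or guessing one plaintext then hands over the other one directly.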
Agreed that extended nonces are the safest default.
> “Use ECC” is too generic as advice for asymmetric encryption.
Which is why that’s not the actual advice? There’s a discussion about why ECC is better, but when it comes to the actual recommendations, I read:
> Asymmetric Encryption
> Use: NaCL, libsodium, or monocypher
> Asymmetric Signatures
> Use, in order of preference:
> 1. NaCL, libsodium, or monocypher
> 2. Ed25519
> 3. RFC6979 (deterministic DSA/ECDSA)
> Also, I think monocypher was written independently of NaCl, it’s not a fork.
As the author of Monocypher I can answer that one: though I did independently implement it, my choice of primitives puts it squarely in the NaCl family, and closest to libsodium. I even use libsodium to generate most of my test vectors. And with the exception of Elligator and streaming encryption, we use compatible wire formats. And finally, I did shamelessly steal some code from the ref10 implementation of Curve25519, and the bignum arithmetic is still there.
Under my definition of a fork (clone repo then branch off), it’s not. But under a looser definition… I’m not too mad about it.
(Now I’m wondering how much of a fork of NaCl libsodium actually is…)
> For people familiar with RSA, in which the asymmetric construction is actually used to encrypt messages, this is unfamiliar, so explanation is necessary.
Are there really people using RSA for message encryption? That sounds very wasteful as it's going to spend a huge amount of CPU cycles for no good reason.
I probably should have said “can be”. The way RSA is often explained just details the mathematics, and explains how a message could be encrypted directly using RSA. It’s then not an unreasonable assumption for a complete beginner that this is how RSA works in practice, even though this is not the case.
Let's just put it like this: Most "simple" explanations of RSA are wrong.
The "advantage" of ECC is that there are no "simple" explanations of ECC, because there's no comic version of ECC that is insecure and easy to explain. For RSA, such an insecure comic version exists. However, I never found that a very convincing argument against RSA.
The Discrete Log Problem is relatively simple to explain in the context of a generic group. It's sort of intuitive that elliptic curve groups are a pretty good instance of a generic group.
So I don't think it is simpler to explain the security of RSA than the security of ECC.
Additionally, the best attack on ECDLP (Pollard's rho) is much easier to understand than the best attack on RSA (the number field sieve).
Can someone help me understand this recommendation:
Under symmetric encryption, the authors write:
> If you are in a position to use a key management system (KMS), then you should use KMS. If you are not in a position to use KMS, then you should use authenticated encryption with associated data (AEAD).
These seem orthogonal to me. KMS := how keys are generated and distributed to communication partners. AEAD := how data is encrypted between communication partners using those keys.
How can it be “use a KMS if you can _or else_ use AEAD”? Shouldn’t it be “and”? What am I missing?
I think this was copied from Latacora’s cryptographic right answers without some of the necessary context. It’s specifically talking about the KMS offerings from AWS/Google Cloud, which provide trusted hardware implementations of not just key management, but also symmetric/asymmetric encryption, HMAC, etc. All the symmetric constructions provided by these platforms are AEADs, so the point is, if you’re using AWS’s KMS, don’t think about it, just use the default. Which is fairly sensible advice.
It depends. These days KMS also extends to solutions that provide full-on encryption as a service, such as Vault. If your design allows for a trusted and well-vetted EaaS solution, that should be the first thing you go for.
The Stack Exchange article seems to ascribe the risk to using MD5. While adding a (global or appended data) hash as you suggest cannot hurt, I wonder if the suggested weakness exists for SHA-512.
Do people realize that Keccak replaces a lot of complexity and already offers a powerful composable primitive? Instead of saying "blake is faster, but sha3 is standard", pick SHAKE-128 and build everything with it: hash, MAC, symmetric encryption, Fiat-Shamir transcripts, etc.
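A small sketch of what "build everything with SHAKE-128" could look like for the first two items, using the standard library's `hashlib.shake_128`. The one-byte domain-separation prefixes and the function names are assumptions of this sketch; the standardized keyed construction on top of Keccak is KMAC (NIST SP 800-185), which this does not implement.

```python
import hashlib
import hmac

KEY_BYTES = 32  # fixed key length keeps key || message unambiguous


def shake_hash(data: bytes, out_len: int = 32) -> bytes:
    """Plain hash; the 0x00 prefix separates it from the MAC below."""
    return hashlib.shake_128(b"\x00" + data).digest(out_len)


def shake_mac(key: bytes, data: bytes, out_len: int = 32) -> bytes:
    """Prefix-keyed MAC. Sponges have no length-extension problem, so
    hashing key || message is acceptable here (unlike with SHA-256)."""
    if len(key) != KEY_BYTES:
        raise ValueError("key must be exactly 32 bytes")
    return hashlib.shake_128(b"\x01" + key + data).digest(out_len)


def shake_mac_verify(key: bytes, data: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(shake_mac(key, data, len(tag)), tag)
```

Note that SHAKE-128 tops out at a 128-bit security level no matter how much output is requested.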
> Do people realize that Keccak replaces a lot of complexity and already offers a powerful composable primitive?
It’s a bit more complicated than that. If you look at the various constructions proposed by the Keccak team, most notably Farfalle and KangarooTwelve I believe, you’ll notice they adjust the number of rounds given the use case. That is, they’re not just proposing a secure permutation that lets us compose any construction for which we could devise a security reduction. I mean you could do that, but the results would likely be slower than ideal. https://eprint.iacr.org/2019/1492.pdf
Instead they are tailoring their constructions to lower the requirements on the underlying permutation, in some cases allowing reduced rounds. The price they pay for that is re-doing the entire cryptanalysis: their constructions are effectively new primitives.
The actual simplification is more for implementers: with one permutation to rule them all our source code becomes quite a bit smaller.
I'd love to hear the author's thoughts on Tink. It seems to have similar properties to NaCl, so I'm wondering why it isn't mentioned. I'm guessing low adoption?
I laughed when I read that GCM is hard for library authors. I remember trying to implement GCM but failing, so I decided to transcribe a "simple implementation" but failed at that also.
Then I decided to try OCB mode and it worked on the first try.
I think it would have been far more popular if Rogaway hadn't kept it patented well beyond the point where it was clear patented crypto had no future. It's a pretty neat solution to the problem and the continued insistence on restricting its usage with a patent is one of the more baffling things I've seen in the crypto world.
I have no mathematical background at all (I am an orchestra musician), and I did get it to mostly work. It was just that it didn't work for some inputs, and I could never figure out why despite having access to a proper debugger and a good repl (in scheme).
OCB was a breeze in comparison.
Caveat: I never understood the birthday attack from Ferguson on OCB and why it doesn't work on GCM, so I am really not the right person to make any recommendations
This guide seems to be full of close-enough-but-not-quite-right recommendations, which in this specific field can be very dangerous, especially as it seems to be recommending specific implementations rather than just giving some generic advice on what to look for.
I noticed there's no mention of quantum-resistant crypto. Looking around, it looks like this is the rationale [1]. This sort of feels like a hand-wave.
> Quantum computing does give us some far more efficient algorithms that classical computing cannot achieve, but even then, 256-bits still remains outside of the practical realm of mythical quantum computing when brute force searching.
For symmetric encryption, there is no point looking at "quantum resistance" because even if somebody has a non-toy quantum computer they get a speed-up equivalent to halving key length, so, just use longer (256 bit) keys and stop worrying about it.
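The "halving key length" rule of thumb comes from Grover's algorithm, which searches N candidates in roughly sqrt(N) steps. As back-of-the-envelope arithmetic (the function name is just for this sketch, and it ignores the large constant factors of real quantum hardware):

```python
def grover_effective_bits(key_bits: int) -> int:
    """Brute force costs ~2**key_bits classically and ~2**(key_bits / 2)
    Grover iterations, so the effective security level is halved."""
    return key_bits // 2


for bits in (128, 256):
    print(f"{bits}-bit key: ~2^{grover_effective_bits(bits)} quantum search steps")
```

So a 256-bit key still leaves about a 128-bit margin, which is where the "just use 256-bit keys" advice comes from.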
For asymmetric encryption, nobody knows for sure, and the best available guesses are expensive with as-yet unknown reward. If you are vulnerable, it's likely to be because you ignored the advice this guide gave you and insisted on Rolling Your Own. Too bad.
My guess is that in five years we'll be in the same situation. Like Fusion power generation, Quantum Computers might work "soon" or they might not and that'll remain true until they actually do work, which might never happen. They'll make great MacGuffins for Hollywood meanwhile and shouldn't affect what you, a non-expert, do all day.
This is highly dependent on how you look at the thing. Two possible outlooks are attacks you can do right now on stock hardware (typically with GPUs), and what attacks you could do if you had the right kind of custom hardware. As far as I can tell Argon2, being explicitly designed to be memory-hard, is very much tailored to the second one.
I’m also sceptical about the timing-based threshold (one second). Computers are a bit faster now than they were in 2015, so there’s a chance that threshold is now lower. I understand why they used that easy-to-understand shorthand, but it’s unlikely to age well.
One thing everyone does seem to agree on though: Argon2 is stronger with long enough run times, if you can tolerate them. Maybe you can’t if you’re running a server under fairly high login load, but maybe you can if you use some augmented PAKE (say OPAQUE) to offload the expensive computation to the client.
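Python's standard library has no Argon2, so the sketch below swaps in scrypt (`hashlib.scrypt`), another memory-hard function, purely to show the shape of tunable password hashing. The cost parameters are illustrative assumptions, not recommendations; the point of the discussion above is precisely that they should be tuned to the run time you can tolerate.

```python
import hashlib
import hmac
import secrets

# Illustrative cost parameters (about 16 MiB of memory); tune for your hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for storage."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, dklen=32, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode(), salt=salt, dklen=32, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)
```

With a real Argon2 binding the structure stays the same; only the cost knobs (memory, passes, lanes) change.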
I applaud the person who put this together, but my humble opinion is that there are a lot of assumptions here. There are many situations (a few I have encountered first hand) where you need to break the rules, and how you break the rules also has best practices that should be mentioned.
Some items that stood out for me after a skim:
- AES-CBC is not unusable, from what I understand with random IV and a good HMAC it is usable?
- "Really, anything RSA", there are protocols that encrypt the key with RSA for key transport and use AES or some other block cipher for the data. As far as I am aware this is safe given the key is derived properly and is not reused. If you have to encrypt using their public key and you are restricted from using EC anything your choices are few.
- Custom transport protocols are sometimes needed. TLS is general purpose, which means there are corner cases it can't support once in a while, and you can't always use Noise. What are best practices to implement key exchange, manage keys and authentication+integrity?
My feedback is that often people are lazy enough to use a pre-made library, but when they can't, telling them there is no alternative does not help. But otoh, I get that perhaps going into such detail would make the content of the gist too long.
The main issue with AES-CBC+HMAC (besides speed) is that unless you're using a library that offers a complete primitive that combines these algorithms [1], you will have to combine the encryption and MAC yourself. In this case, you will have to know that the only safe way to do this is Encrypt-then-MAC, and you'll also have to remember to do (and in some languages know how to implement!) a constant-time comparison for the HMAC value.
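To show the two details mentioned here (MAC over the ciphertext, constant-time comparison), a minimal sketch of the verification side with the standard library's `hmac`. The AES-CBC step itself is left as a hypothetical `decrypt` callable, since the standard library has no AES; the function names are made up for the example.

```python
import hashlib
import hmac
from typing import Callable

IV_BYTES = 16  # fixed IV length keeps iv || ciphertext unambiguous


def tag_ciphertext(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Encrypt-then-MAC: the tag covers the IV and ciphertext, not the plaintext.
    mac_key must be independent of the encryption key."""
    if len(iv) != IV_BYTES:
        raise ValueError("IV must be 16 bytes")
    return hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()


def verify_then_decrypt(
    mac_key: bytes,
    iv: bytes,
    ciphertext: bytes,
    tag: bytes,
    decrypt: Callable[[bytes, bytes], bytes],  # hypothetical AES-CBC decryptor
) -> bytes:
    expected = tag_ciphertext(mac_key, iv, ciphertext)
    # compare_digest runs in time independent of where the bytes differ.
    if not hmac.compare_digest(expected, tag):
        raise ValueError("authentication failed")
    # Only touch the ciphertext (and its padding) after the tag checks out.
    return decrypt(iv, ciphertext)
```

Checking the tag before decrypting is what keeps padding-oracle style attacks on CBC out of reach.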
Since it's highly unlikely that an untrained developer will do all of these correctly, the authors are right, IMHO, to avoid mentioning AES-CBC+HMAC. Yes, it can be implemented safely. So can RSA for that matter. But is it likely that most developers (or let's be generous - even more than 10% of them) will implement this correctly? No.
When you need to roll your own crypto (custom transports, RSA combos for compatibility reasons) you'll just have to get an expert to do that. WireGuard implements its own transport and TarSnap implements its own crypto (on top of RSA, no less!) and you don't see the authors of these documents criticizing them. In fact, TarSnap is consistently recommended in this guide.
But the audience for these best practices is not Colin Percival[2]. It's the average developer who usually has an "Intro to Cryptography" background from their CS Major at the best of times.
> Since it's highly unlikely that an untrained developer will do all of these correctly, the authors are right
Oh come on! It's not rocket science, just encrypt and HMAC. My point was that mentioning that best-practice tip, for example, would help. People still use AES-ECB because they don't know that failing with AES-CBC is better.
> When you need to roll your own crypto (custom transports, RSA combos for compatibility reasons) you'll just have to get an expert to do that
If someone is in a position where they're choosing ciphers and they're not an expert, chances are they won't get funds to hire an expert and they can't tell their boss or customer they can't do it. Helping people not shoot their own foot off is all I am saying.
I agree with the audience part. The average dev also does not work in Silicon Valley or at a FAANG. They work in bigcorp, consulting, or government, where you do your best with unreasonable requirements and little cash to spare for third-party experts.
AES-CBC with a random IV and solid HMAC should be fine, but the point is that, for a non-cryptographer, putting together an authenticated encryption construction yourself has potential footguns. Using a pre-made authenticated encryption construction like AES-GCM or Salsa20-Poly1305 avoids this.
Any platform with hardware acceleration for AES should hopefully have a carryless multiplication instruction anyway, so GCM will be fast. And if not, Poly1305 is so fast that another HMAC construction will perform worse. So there’s really no reason not to just use one of these two in 99% of cases.
I agree that when you have a choice it should be done as you said. I am not even a developer and I've been in situations where AES-GCM or Poly1305 (or Ed25519 and other popular curves) were unavailable. Sometimes it is b.s. internal politics, other times it is third parties that can "only" use a certain set of tools or dependencies, and it is either come up with something custom or lose the deal/contract.
> Putting cryptographic primitives together is a lot like putting a jigsaw puzzle together, where all the pieces are cut exactly the same way, but there is only one correct solution. Thankfully, there are some projects out there that are working hard to make sure developers are getting it right.
I found this analogy muddled.
The first sentence sounds like a good thing. There's only one correct solution for a puzzle's pieces and that is obvious. We all see when the puzzle's been correctly finished because the pieces only fit one way, and a completed image is the result.
Sounds like a great situation.
Then the second sentence responds as if this is a problem that needs to be fixed.
For a less ambiguous analogy, how about a lego set, without instructions? You're supposed to build one particular model, but could end up building all kinds of things that may well use all the pieces, but don't end up with the correct model.
The cut of jigsaw puzzles determines how they can be put together, but what determines how they should be is the picture painted across them.
So an ideal jigsaw puzzle has no two pieces with the same cut, preventing you from putting any pieces together incorrectly. This is largely impossible. A jigsaw puzzle where every piece has the same cut, however? That’s simply torture.
Ok, thanks - so the "cut exactly the same way" is the key here. It's maybe not so obvious to non-puzzle aficionados? I thought it simply meant "all fit together", the "same way" referring to the overall cut.
Maybe clearer: "each puzzle piece the exact same shape"?
Though it still suffers from the issue, as with a puzzle there's always the congruity of the final image as a hard guide as to what's meant to be correct, regardless of piece shape. That's maybe what muddles the analogy most.
KMS makes sense for encryption of data in cloud, since the companies already have access to data. It’s usually an additional layer of access control, monitoring and compliance.
But it seems a bad practice to have an external company create and manage cryptographic keys, and/or manage encryption of on-premise or personal data (like encrypting your backups with an AWS KMS key, client side, and uploading to S3).
Apart from that, KMS is a solution for key management. The users can use keys with any algorithm.
Also, I am not sure if the AWS Encryption SDK is of particularly high quality. Does anyone know?
I also am confused somewhat here. With KMS, if you need to encrypt larger payloads, KMS itself is of no help except to generate a data key to use and you are left to either use AwsCrypto, or roll your own encryption using the data key which itself is encrypted by AWS KMS. If you happen to be using a language that does not have a port of the AwsCrypto library I am unclear if say AES CBC is okay or not.
If you are able to use AwsCrypto with KMS, I am assuming that is the recommended pathway as that is the default that AWS provides and I am hoping that AWS has thought it through enough to have a sensible default.
This gist is from 2018, last updated 2019. I wonder if there have been any recent relevant developments to warrant an update to these recommendations, as some comments from the comment section indicate.
What is the state of the art for doing encryption with ECC? The author just says "use NaCl" here but what should I do if I am not in a position to do that but can still use ECC?
My understanding of ECC is that it is not really suitable for encryption as-is, as RSA was, rather it is used for key agreement (somehow through a multi-step process that I do not understand). But it is unclear how much of this is just rumor and implementation limitations.
If you can't use NaCl directly you may still be able to use the underlying "25519" Edwards curve. The point is that it was designed in such a way to make implementation bugs ("bad" points, separate addition/doubling formulas, and other edge cases) either non-existent or at least easy to deal with.
In contrast, ECDSA seems like it was almost designed by the NSA to make it as easy as possible to accidentally introduce an exploitable implementation bug.
You are right that ECC is mainly a key agreement/transport and signing tool, not to be used directly for encryption except in very special cases (e.g. modified ElGamal for verifiable voting schemes).
> The author just says "use NaCl" here but what should I do if I am not in a position to do that but can still use ECC?
Not being in a position to use even a single-file C library like Monocypher (well, 2 compilation units if you want the optional parts), is… well, unusual.
> My understanding of ECC is that it is not really suitable for encryption as-is, as RSA was, rather it is used for key agreement (somehow through a multi-step process that I do not understand)
The steps are: once you’ve done key agreement, you have a shared key. You can then use authenticated encryption with that key. One caveat though is that key agreement often doesn’t give you an actual key, but a statistically biased shared secret. So the actual steps are:
1. Do key agreement. You now have a shared secret.
2. Hash your shared secret. You now have a key.
3. Encrypt your messages with your key. Use AEAD for this.
Caveat: I omitted a number of important details, most notably forward secrecy.
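Step 2 is the one people tend to skip, so here is a minimal sketch of it with the standard library. The raw shared secret is assumed to come from an X25519 call in whatever library you use (not shown), and mixing in both public keys follows the pattern the libsodium documentation suggests for raw scalar multiplication output. The function name and argument order are assumptions of this sketch; both sides must agree on the order.

```python
import hashlib


def derive_key(shared_secret: bytes, initiator_pk: bytes, responder_pk: bytes) -> bytes:
    """Hash the raw 32-byte shared secret, plus both 32-byte public keys,
    into a uniformly distributed 32-byte symmetric key."""
    for part in (shared_secret, initiator_pk, responder_pk):
        if len(part) != 32:
            raise ValueError("expected 32-byte inputs")
    h = hashlib.blake2b(digest_size=32)
    h.update(shared_secret)
    h.update(initiator_pk)
    h.update(responder_pk)
    return h.digest()
```

The resulting key then goes into the AEAD of step 3. As the caveat says, this still gives no forward secrecy.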
The author definitely should have clarified this. The standard is to use ECC for key exchange only. This can be done entirely offline - each party chooses a random secret scalar, and multiplies the base point of the curve by that scalar to produce a public point. You publish your public point in advance of communication. When you want to send a message, you multiply the other party’s public point by your secret scalar to obtain a shared key. Then, just use a well-studied symmetric AEAD construction to encrypt messages.
Of course, this doesn’t incorporate any forward secrecy, which is a key benefit of using something like TLS or Noise rather than rolling your own custom protocol.