# Cryptography: authenticated encryption

## CS 3710: Intro to Cybersecurity

===

## Authenticated encryption

---

## MACs and authenticated encryption
<!-- .slide: class="text-center" -->

Stream and block ciphers give us a way to guarantee the *confidentiality* of data, but how do we guarantee its integrity?

_**Integrity**_ means that an attacker should not be able to modify data without being detected.

---

## Why integrity matters: malleability

The ciphers we've discussed produce _**malleable**_ ciphertexts: an attacker can modify one of these ciphertexts and have it decrypt to a predictable output.

---

## Malleability

<figure>
<img src="../../img/encryption/imitation_game.webp">
<figcaption>

*Source: The Imitation Game*

</figcaption>
</figure>

---

## Malleability

Our PRG-based construction for symmetric encryption with a secret key $k$ is to run

$$ Encrypt(m, k) = G(k) \oplus m $$

where $G(\cdot)$ is a pseudo-random generator (PRG) and $\oplus$ denotes XOR.

What happens if an attacker knows $m$ and intercepts the ciphertext?

---

## Malleability

_**Answer:**_ they can compute $G(k) = c \oplus m$, where $c$ is the intercepted ciphertext!

Since we didn't do anything to *authenticate* the message, the attacker can now encrypt their own message (of up to the same length) and forward it on to the receiver.

---

## Malleability

<div class="r-stack">
<div class="fragment fade-out" data-fragment-index=0>
<figure>
<img src="../../img/encryption/malleability_1.png"style="max-height: 50vh">
<figcaption>
</figcaption>
</figure>
</div>
<div class="fragment fade-in-then-out" data-fragment-index=0>
<figure>
<img src="../../img/encryption/malleability_2.png"style="max-height: 50vh">
<figcaption>
</figcaption>
</figure>
</div>
<div class="fragment" data-fragment-index=1>
<figure>
<img src="../../img/encryption/malleability_3.png"style="max-height: 50vh">
<figcaption>
</figcaption>
</figure>
</div>
</div>

===

## Message authentication codes

---

## What is a MAC?

A _**message authentication code (MAC)**_ is an algorithm that takes a message and a secret key and generates a few bytes of data known as a *tag*.
This tag can be used to verify the _**authenticity**_ of a message.

<figure>
<img src="../../img/encryption/pulp_fiction_big_mac.webp"style="max-height: 50vh;">
<figcaption>
</figcaption>
</figure>

---

## Poly1305

_**Poly1305**_ is a MAC algorithm that takes a 32-byte key and generates a 16-byte tag.

It's called Poly1305 because it evaluates a polynomial in the ring $\mathbb{Z}/(2^{130} - 5)\mathbb{Z}$ (but you don't need to know that to use it in practice):

$$ p(r) = (c_1r^q + c_2r^{q-1} + \ldots + c_qr) \mod{2^{130} - 5} $$

---

## Hash-based MAC (HMAC)

<div class="fragment semi-fade-out" data-fragment-index=0>

_**Hash-based MAC (HMAC)**_ is a convenient method of turning a cryptographic hash function like SHA-256 into a MAC.

</div>
<div class="fragment" data-fragment-index=0>

You can think of HMAC as a _**keyed** hash function_; it's a hash function $h(x,k)$ that takes some data $x$ and a key $k$, and outputs a hash.

</div>

---

## Aside: SipHash

<div class="fragment semi-fade-out" data-fragment-index=0>

_**SipHash:**_ does not provide sufficient guarantees to be used as a cryptographic hash function, but is good enough to be used as a MAC for short messages.

</div>
<div class="fragment fade-in-then-semi-out" data-fragment-index=0>

- Unlike e.g. SHA, its output isn't supposed to be uniformly random; _but_

</div>
<div class="fragment fade-in">

- it's difficult for an attacker to compute $h(x, k)$ without knowing $k$.

</div>

<figure>
<img src="../../img/encryption/siphash.webp"style="max-height: 30vh;">
<figcaption>
</figcaption>
</figure>

---

## SipHash: hash flooding

<div class="fragment semi-fade-out" data-fragment-index=0>

In exchange for its weaker security guarantees, SipHash is extremely efficient. This makes it useful for preventing _**hash flooding attacks**_.

</div>

_**Example:**_ in Python, you can use a dictionary to perform key-value lookups.

```python
>>> headers = {}
>>> headers["Content-Type"] = "application/json"
>>> headers["Host"] = "www.example.org"
...
```

<div class="fragment" data-fragment-index=0>

A `dict` is just a hash table. What are some security properties that you'd want the hash function used by `dict` to have?

</div>

---

## SipHash: hash flooding

<div class="fragment semi-fade-out" data-fragment-index=0>

What if an adversary knows which hash function you use, and can get you to insert keys that all hash to the same value?

</div>

<div class="r-stack">
<div class="fragment fade-in-then-out" data-fragment-index=0>
<figure>
<img src="../../img/encryption/hash_flooding_1.webp"style="max-height: 40vh;">
<figcaption>

*Source: Jean-Philippe Aumasson*

</figcaption>
</figure>
</div>
<div class="fragment fade-in-then-out" data-fragment-index=1>
<figure>
<img src="../../img/encryption/hash_flooding_2.webp"style="max-height: 40vh;">
<figcaption>

*Source: Jean-Philippe Aumasson*

</figcaption>
</figure>
</div>
<div class="fragment fade-in-then-out" data-fragment-index=2>
<figure>
<img src="../../img/encryption/hash_flooding_3.webp"style="max-height: 40vh;">
<figcaption>

*Source: Jean-Philippe Aumasson*

</figcaption>
</figure>
</div>
<div class="fragment fade-in-then-out" data-fragment-index=3>
<figure>
<img src="../../img/encryption/hash_flooding_4.webp"style="max-height: 40vh;">
<figcaption>

*Source: Jean-Philippe Aumasson*

</figcaption>
</figure>
</div>
<div class="fragment fade-in" data-fragment-index=4>

An attacker can use this to DoS a server by forcing it to perform increasingly expensive key lookups. When every key lands in the same bucket, each lookup degrades from $O(1)$ to $O(n)$. As these lookups get more expensive, the server has less time to respond to legitimate requests.

</div>
</div>

notes:
JPA's slides on defenses against hash flooding attacks: https://www.aumasson.jp/siphash/siphashdos_appsec12_slides.pdf

---

## SipHash: hash flooding

_**Solution:**_ generate a random secret key, and then use SipHash!

An attacker who doesn't know the key can't compute the hash, so they can't predict which keys will collide.
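
---

## SipHash: hash flooding

A minimal sketch of why the secret key helps, under some loud assumptions: `weak_bucket` and `keyed_bucket` are hypothetical helpers, the table has only 8 buckets, and keyed BLAKE2b from `hashlib` stands in for SipHash (this sketch does not call SipHash itself).

```python
import hashlib
import secrets

NUM_BUCKETS = 8  # toy table size, chosen for illustration

def weak_bucket(key: bytes) -> int:
    # Unkeyed toy hash: sum of the bytes. Anyone can predict and collide it.
    return sum(key) % NUM_BUCKETS

def keyed_bucket(key: bytes, secret: bytes) -> int:
    # Keyed hash: BLAKE2b is an assumed stand-in for SipHash here.
    digest = hashlib.blake2b(key, key=secret, digest_size=8).digest()
    return int.from_bytes(digest, "little") % NUM_BUCKETS

# Attacker-chosen keys: anagrams all have the same byte sum.
evil_keys = [b"abcd", b"abdc", b"acbd", b"adcb", b"bacd", b"dcba"]
secret = secrets.token_bytes(16)  # generated once at startup, kept private

print({weak_bucket(k) for k in evil_keys})           # a single bucket
print([keyed_bucket(k, secret) for k in evil_keys])  # depends on the secret
```

With the unkeyed hash, the attacker's anagrams all land in one bucket. With the keyed hash, the bucket depends on `secret`, so the attacker cannot pick colliding keys ahead of time.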

---

## Applications of SipHash

<div class="r-stack">
<div class="fragment fade-out" data-fragment-index=0>
<figure>
<img src="../../img/encryption/pep_456.webp"style="max-height: 40vh;">
<figcaption>

Python uses SipHash to hash strings and bytes for its dictionaries.

</figcaption>
</figure>
</div>
<div class="fragment" data-fragment-index=0>
<figure>
<img src="../../img/encryption/siphash_linux.webp"style="max-height: 40vh;">
<figcaption>

The Linux kernel uses SipHash internally for its hash tables.

</figcaption>
</figure>
</div>
</div>

notes:
- PEP 456: https://peps.python.org/pep-0456/
- SipHash in the kernel docs: https://docs.kernel.org/next/security/siphash.html
- `siphash.h`: https://elixir.bootlin.com/linux/latest/source/include/linux/siphash.h

---

## Authenticated encryption

An _**authenticated encryption with associated data (AEAD)**_ algorithm is an encryption scheme that takes three inputs:

<div class="fragment fade-in-then-semi-out" data-fragment-index=0>

- a *plaintext*;

</div>
<div class="fragment fade-in-then-semi-out" data-fragment-index=1>

- a *secret key*; and

</div>
<div class="fragment fade-in-then-semi-out" data-fragment-index=2>

- an optional *header* (the associated data)

</div>
<div class="fragment" data-fragment-index=3>

and produces two outputs: a *ciphertext* (containing the encryption of the plaintext) and a *MAC*. Most practical AEAD schemes also take a *nonce* that must not repeat under the same key.

</div>

---

## Authenticated encryption

<div class="fragment semi-fade-out" data-fragment-index=0>

The decryption algorithm takes the *secret key*, *ciphertext*, *MAC*, and *header* and produces the plaintext. It also checks whether the header or ciphertext has been modified, and refuses to decrypt if so.

</div>
<div class="fragment" data-fragment-index=0>

The header (the "associated data") is not encrypted, but we still check its integrity. This is useful when we have message metadata that we want others to be able to see.

</div>

---

## ChaCha20-Poly1305

*ChaCha20-Poly1305* is an authenticated encryption scheme that uses the ChaCha20 stream cipher with the Poly1305 MAC.

Some notable uses:

<div class="r-stack" style="max-height: 30vh;">
<div class="fragment fade-out" data-fragment-index=0>
<figure>
<img src="../../img/encryption/age_encryption.png"style="max-height: 30vh;">
<figcaption>

*Source: age / Filippo Valsorda*

</figcaption>
</figure>
</div>
<div class="fragment fade-in-then-out" data-fragment-index=0>
<figure>
<img src="../../img/encryption/wireguard.svg">
<figcaption>

*Source: WireGuard / Jason Donenfeld*

</figcaption>
</figure>
</div>
</div>

---

## ChaCha20-Poly1305

<pre>
<code class="python" data-trim data-line-numbers="1-10|3-4|5-6|7|8-10">
>>> import os
>>> from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305
>>> data = b"a secret message"
>>> aad = b"authenticated but unencrypted data"
>>> key = ChaCha20Poly1305.generate_key()
>>> chacha = ChaCha20Poly1305(key)
>>> nonce = os.urandom(12)
>>> ct = chacha.encrypt(nonce, data, aad)
>>> chacha.decrypt(nonce, ct, aad)
b'a secret message'
</code>
</pre>

---

## AES-GCM

**_AES-GCM_** (*Galois/Counter Mode*) combines CTR-mode AES with the GHASH hash function to provide message authentication on top of confidentiality.

<figure>
<img src="../../img/encryption/NIST_approved.webp"style="max-height: 30vh;">
<figcaption>
</figcaption>
</figure>

---

## AES-GCM-SIV

<div class="container">
<div class="col">

There is also a variant of AES-GCM, _**AES-GCM-SIV**_ (*SIV* = "Synthetic Initialization Vector"), which additionally provides nonce misuse resistance.

(I.e., you can reuse a nonce without it blowing up in your face.)

</div>
<div class="col">
<figure>
<img src="../../img/encryption/aes_algos.webp">
<figcaption>
</figcaption>
</figure>
</div>
</div>

notes:
Nonce misuse resistance: nothing gets revealed if a nonce gets reused for two different messages. If a nonce is reused for the same message, you can find out that the same message was encrypted, but no other information is revealed.
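
---

## Malleability, revisited

To make the earlier $G(k) \oplus m$ attack concrete, here is a toy sketch. It assumes SHAKE-256 as a stand-in for the PRG $G$ and uses made-up messages; it is an illustration, not a real cipher.

```python
import hashlib
import secrets

def G(k: bytes, n: int) -> bytes:
    # Toy PRG (assumption): SHAKE-256 stretches the key into n keystream bytes.
    return hashlib.shake_256(k).digest(n)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(m: bytes, k: bytes) -> bytes:
    # Encrypt(m, k) = G(k) XOR m; decryption is the same operation.
    return xor(G(k, len(m)), m)

k = secrets.token_bytes(32)
m = b"PAY ALICE $100"
c = encrypt(m, k)            # what the attacker intercepts

# The attacker knows m and c, but not k.
keystream = xor(c, m)        # equals G(k) for these bytes
forged = xor(keystream, b"PAY MALLORY $9")

print(encrypt(forged, k))    # the receiver decrypts b'PAY MALLORY $9'
```

The receiver has no way to tell the forged ciphertext from a genuine one, which is exactly the gap a MAC closes.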

---

## Summary

<div class="fragment semi-fade-out" data-fragment-index=0>

*Never* use a cipher like AES or ChaCha20 by itself! Any messages you send are malleable and cannot be authenticated.

</div>
<div class="fragment fade-in-then-semi-out" data-fragment-index=0>

*Message authentication codes (MACs)* are algorithms that generate a tag that can be used to determine whether a message has been tampered with. This tag is usually computed on the ciphertext and appended.

</div>
<div class="fragment fade-in" data-fragment-index=1>

In practice, you should use an *authenticated encryption* / *AEAD* scheme like AES-GCM or ChaCha20-Poly1305, which provides both confidentiality *and* integrity.

</div>
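
---

## Example: tagging a ciphertext

A rough sketch of "compute the tag on the ciphertext and append it", assuming HMAC-SHA256 as the MAC. The helpers `seal` and `open_sealed` and the placeholder ciphertext bytes are made up for illustration; in real code, prefer an AEAD scheme over assembling this by hand.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # HMAC-SHA256 output size in bytes

def seal(ciphertext: bytes, mac_key: bytes) -> bytes:
    # Compute the tag over the ciphertext and append it.
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag

def open_sealed(blob: bytes, mac_key: bytes) -> bytes:
    # Recompute the tag and compare in constant time before trusting anything.
    ciphertext, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return ciphertext

mac_key = secrets.token_bytes(32)
blob = seal(b"\x8f\x02\x17\xaa", mac_key)   # placeholder ciphertext bytes

tampered = bytes([blob[0] ^ 1]) + blob[1:]  # flip one bit of the ciphertext
try:
    open_sealed(tampered, mac_key)
except ValueError as e:
    print(e)  # authentication failed
```

The receiver verifies the tag first and only then decrypts, so a modified ciphertext is rejected before it is ever used.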