The article was modified since its publication. Last update was 09/10/2017
See also Part II and Part III of this series
tl;dr in this post we are going to describe how to try predict the output of Micali-Schnorr Generator (MS-DRBG) knowing the factorization of the n value. If this sounds like, "why the hell should I care?", you might want to give a look at this great post from Matthew Green about the backdoor in Dual_EC_DRBG. But In a nutshell, quoting Matthew Green :
See also Part II and Part III of this series
tl;dr in this post we are going to describe how to try predict the output of Micali-Schnorr Generator (MS-DRBG) knowing the factorization of the n value. If this sounds like, "why the hell should I care?", you might want to give a look at this great post from Matthew Green about the backdoor in Dual_EC_DRBG. But In a nutshell, quoting Matthew Green :
- Dual_EC_DRBG is not the only asymmetric random number generator in the ANSI and ISO standards (see at the bottom).
- it’s not obvious from the public literature how one would attack the generator even if one knew the factorization of the n values above.
Introduction
The first time I heard about this problem is about couple of weeks ago via this Matthew's tweet:
From there I could "track" the challenge that he launched to enthusiastic mathematician.As a curiosity, the NSA didn’t just standardize Dual EC, they also standardized a second sketchy public key DRBG based on Micali-Schnorr.— Matthew Green (@matthew_d_green) September 22, 2017
Mind that I am yes enthusiastic but not a mathematician, but I still decided to accept the challenge :p
Micali-Schnorr Generator (MS-DRBG)
So now I "accepted" this challenge, but Hey I do not even know that this Micali-Schnorr Generator (MS-DRBG) existed, so time to study. Googling a bit I found out that other people tried this previously but did not succeed ( I guess), not a good sign, but let's do not this discourage us. The best place where I found the explanation for the Micali-Schnorr Generator was this short pdf (while other books and online resource described it as well).
Trying to keep it simple the MS-DRBG given a number n and an RSA exponent e will output a sequence of bits. The slides from NIST explains this in a really simple way:
The logic behind this is extremely trivial, you start from a secret seed s (this is the internal state of the DRBG) and you will use the n and the e from the standard to do a classic RSA encryption aka (s)^e mod (n) then you will:
- output the less significant bit of (s)^e mod (n)
- and you keep the most significant bit as the new state for the subsequent iteration
Well that is basically it. Now that we know better MS-DRBG I hope the question from Matthew results simpler, but basically he was asking: will the knowledge of the factorization of n give an advantage such being able to recover the internal state of the DRBG ? To be fair answering no to this question would be extremely counter intuitive. At the end of the day RSA is completely bound to the secretiveness of the factorization of n and this DRBG is totally RSA, this was what really puzzled me. It must be something but what?
One little thing to note is that it seems that the original definition from Micali and Schnorr and the NIST's standard differ from the number of bit they output.
The NIST standard is much more conservative and outputs about 1/3 of the RSA computation. While it seems that the original version outputs up to 3/4 of the RSA computation and keeps only 1/4 for the internal state (well this give still 128 bits to recover, still pretty many...).
Well well the second version seems easier, so let's start simple.right. But I am referring to the original example (only) for now pic.twitter.com/6QmoRlL8s1— Antonio Sanso (@asanso) September 27, 2017
Predict the output
Ok now I know the problem statement but I still need to solve it. Let's see a bit of literature. For a reason that hopefully I clarify in another post, thanks to Brian Smith, I came to know the Handbook of Elliptic and Hyperelliptic Curve Cryptography book. This book isn't only about elliptic curves (the reason why I bought this book BTW) but it also contains explanation of basic algorithms. My initial though was really simple. If I know the factorization of n say I know n = p * q. I can maybe somehow use the most lethal weapon any cryptographer can use: the Chinese Remainder Theorem (CRT). I successfully used myself a couple of times already [0] and [1]. So what I thought is that in any moment I will have the less significant bit of (s)^e mod (n) and the factorization of n = p* q. The idea is really simple: if I can somehow obtain (s)^e mod(p) and (s)^e mod (q) from this partial knowledge I can finally obtain the full (s)^e mod (n) using CRT!!!!!
So I opened the Handbook of Elliptic and Hyperelliptic Curve Cryptography book at the modular reduction part and I was "attracted" by the Special moduli section. I almost immediately discarded Mersenne primes
from Handbook of Elliptic and Hyperelliptic Curve Cryptography |
because the lack of the 2^128,2^520 interval (the one interesting for cryptography). But turning page I got an "illumination", it talked indeed about Crandall primes of the form p = b^k +c and described an algorithm for reduction for special form moduli
from Handbook of Elliptic and Hyperelliptic Curve Cryptography |
Well bingo, looks like is really what I need. Now assuming p and q are both Crandall primes and p = 2^ 256 + c1 amd q = 2^ 256 + c2.
Lets split a bit the algorithm and comment on it:
- q0 = is half unknown (128 bits) and half known (1/3 of the DRBG output, the most significant bits)
- r0 = is the 2/3 of the DRBG output, the less significant bits.
from Handbook of Elliptic and Hyperelliptic Curve Cryptography |
Predict the output (updated)
So as said above I spotted an issue on my program that invalidates so far my results. But hey not all of them :p (I hope).
Basically I still claim some less exiting result. Indeed using the technique based on Crandall's primes above I clam that I can predict ~ the 128 less significant bits of (s)^e mod(p) and (s)^e mod (q). If you still do not believe me here is the gist with a POC you can run*:
Now while this is not revolutionary this result is new to me. In order to have a rough idea of why this actually works (most of the time) let see how binary multiplication actually works. The article from Wikipedia contains a good example:
As you see the less significant bits of the result depends only on the less significant bits of the second term.
- qi+1 = c1 * q0 and taking the 2^256 most significant bits (this is usually a really small number if c1 is small).
- ri+1 = c1 * q0 and taking the 2^256 less significant bit. This can also be written as c1 * q0 (mod 2^256). Hold on but we know the 128 less significant bits of the second term, and as shown above this means we can also predict the 128 less significant bits of this result.
Now the question is: would knowing the ~ the 128 less significant bits of (s)^e mod(p) and (s)^e mod (q) help to recover the internal state of the DRBG?
Acknowledgement
I would like to thank Matthew Green to give me such a good puzzle :) and Brian Smith for letting me know about the Handbook of Elliptic and Hyperelliptic Curve Cryptography book.
That's all folks. For more crypto goodies, follow me on Twitter.
* I am aware the p and q are not suited for RSA but this is just an example.
Comments