Friday, 6 October 2017

How to try to predict the output of Micali-Schnorr Generator (MS-DRBG) knowing the factorization

The article was modified since its publication. Last update was 09/10/2017 

tl;dr in this post we are going to describe how to try predict the output of Micali-Schnorr Generator (MS-DRBG)  knowing the factorization of the n value. If this sounds like, "why the hell should I care?", you might want to give a look at this great post from Matthew Green about the backdoor in Dual_EC_DRBG. But In a nutshell, quoting Matthew Green :
What I am NOT claiming in this post though is that there is a backdoor in one of this standard.

Introduction


The first time I heard about this problem is about couple of weeks ago via this Matthew's tweet:
From there I could "track" the challenge that he launched to enthusiastic mathematician.



Mind that I am yes enthusiastic but not a mathematician, but I still decided to accept the challenge :p
And yeah it looks like it took be about two weeks to (partially) solve the riddle.

Micali-Schnorr Generator (MS-DRBG)


So now I "accepted" this challenge, but Hey I do not even know that this Micali-Schnorr Generator (MS-DRBG) existed, so time to study. Googling a bit I found out that other people tried this previously but did not succeed ( I guess), not a good sign, but let's do not this discourage us. The best place where I found the explanation for the Micali-Schnorr Generator was this short pdf (while other books and online resource described it as well).
Trying to keep it simple the MS-DRBG given a number n and an RSA exponent e will output a sequence of bits. The slides from NIST explains this in a really simple way:


The logic behind this is extremely trivial, you start from a secret seed s (this is the internal state of the DRBG) and you will use the n and the e from the standard to do a classic RSA encryption aka (s)^e mod (n) then you will:
  • output the less significant bit of (s)^e mod (n)
  • and you keep  the most significant bit as the new state for the subsequent iteration
Well that is basically it. Now that we know better MS-DRBG I hope the question from Matthew results simpler, but basically he was asking: will the knowledge of the factorization of n give an advantage such being able to recover the internal state of the DRBG ? To be fair answering no to this question  would be extremely counter intuitive. At the end of the day RSA is completely bound to the secretiveness of the factorization of n and this DRBG is totally RSA, this was what really puzzled me. It must be something but what?
One little thing to note is that it seems that the original definition from Micali and Schnorr and the NIST's  standard differ from the number of bit they output. 
The NIST standard is much more conservative and outputs about 1/3 of the RSA computation. While it seems that the original version outputs up to 3/4 of the RSA computation and keeps only 1/4 for the internal state (well this give still 128 bits to recover, still pretty many...).
Well well the second version seems easier, so let's start simple.


Predict the output 


Ok now I know the problem statement but I still need to solve it. Let's see a bit of literature. For a reason that hopefully I clarify in another post, thanks to Brian Smith,  I came to know the Handbook of Elliptic and Hyperelliptic Curve Cryptography book. This book isn't only about elliptic curves (the reason why I bought this book BTW) but it also contains explanation of basic algorithms. My initial though was really simple. If I know the factorization of n say I know n = p * qI can maybe somehow use the most lethal weapon any cryptographer can use: the Chinese Remainder Theorem (CRT). I successfully used myself a couple of times already [0] and [1]. So what I thought is that in any moment I will have the less significant bit of (s)^e mod (n) and the factorization of n = p* q. The idea is really simple: if I can somehow obtain (s)^e mod(p) and (s)^e mod (q) from this partial knowledge I can finally obtain the full  (s)^e mod (n) using CRT!!!!!
So I opened the Handbook of Elliptic and Hyperelliptic Curve Cryptography book at the modular reduction part and I was "attracted" by the Special moduli section. I almost immediately discarded Mersenne primes

from Handbook of Elliptic and Hyperelliptic Curve Cryptography
from Handbook of Elliptic and Hyperelliptic Curve Cryptography
because the lack of the 2^128,2^520 interval (the one interesting for cryptography). But turning page I got an "illumination", it talked indeed about Crandall primes of the form p = b^k +c and described an algorithm for reduction for special form moduli

from Handbook of Elliptic and Hyperelliptic Curve Cryptography
from Handbook of Elliptic and Hyperelliptic Curve Cryptography

Well bingo, looks like is really what I need. Now assuming p and q are both Crandall primes and p = 2^ 256 + c1 amd q = 2^ 256 + c2.

Lets split a bit the algorithm and comment on it:

  • q0 = is half unknown (128 bits) and half known (1/3 of the DRBG output, the most significant bits)
  • r0 = is the 2/3 of the DRBG output, the less significant bits.
Now is also important to know this:

from Handbook of Elliptic and Hyperelliptic Curve Cryptography
 This is actually again out case. So lets continue:

qi+1 amd ri+1 are going to be really easy to predict for us (giving the fact we have the DRBG 128 most significant bits of the output).
Lets see how. So we said we know the less significant bit of q0 that are equal to the 1/3 of the DRBG output, the most significant bits.
Now decoding the algorithm above notation:


  • qi+1 = c1 * q0 and taking the 2^256 most significant bits (this is usually a really small number for the instance of our problem namely some bits.
  • ri+1 = c1 * q0 and taking the 2^256 less significant bit. This can also be written as  c1 * q0 (mod 2^256). Hold on but we have most of this already (is enough to multiply the part of q0 that we know for c1) apart few bits (again the instance of our problem namely some bits).
This will allow us to recover (s)^e mod(p). If we repeat the same using c2 will allow us to recover (s)^e mod(q). Than CRT and ...

...and that is basically it. I hope you enjoyed as much as I enjoyed thinking about this riddle.
We will provide a much more deeper analysis of this with a some code showing this in the next post.

EDIT: I spotted an issue on my program that invalidates so far my results. I am going to continue working on it though and maybe solve the issue. Or I hope this post will inspire someone else to work on this really great problem.

Predict the output (updated)

So as said above I spotted an issue on my program that invalidates so far my results. But hey not all of them :p (I hope).
Basically I still claim some less exiting result. Indeed using the technique based on Crandall's primes above I clam that I can predict ~ the 128 less significant bits of (s)^e mod(p) and (s)^e mod (q). If you still do not believe me here is the gist with a POC you can run*:


Now while this is not revolutionary this result is new to me. In order to have a rough idea of why this actually works (most of the time) let see how binary multiplication actually works. The article from Wikipedia contains a good example:


As you see the less significant bits of the result depends only on the less significant bits of the second term. 

Now decoding the Crandall's algorithm notation:
  • qi+1 = c1 * q0 and taking the 2^256 most significant bits (this is usually a really small number if c1 is small).
  • ri+1 = c1 * q0 and taking the 2^256 less significant bit. This can also be written as  c1 * q0 (mod 2^256). Hold on but we know the 128 less significant bits of the second term, and as shown above this means we can also predict the 128 less significant bits of this result.

Now the question is: would knowing the ~ the 128 less significant bits of (s)^e mod(p) and (s)^e mod (q) help to recover the internal state of the DRBG?

Acknowledgement


I would like to thank Matthew Green to give me such a good puzzle :) and Brian Smith for letting me know about  the Handbook of Elliptic and Hyperelliptic Curve Cryptography book.

That's all folks. For more crypto goodies, follow me on Twitter.



* I am aware the p and q are not suited for RSA but this is just an example.