Cryptopals 23 - Clone an MT19937 RNG from its output

TL;DR

Challenge 23 in Cryptopals.

As anticipated, using a random sequence to encrypt something is a fantastic idea, using a pseudo-random one is generally a terrible one.

Unless, that is, our pseudo-random generator function has been designed to cope with the stringent needs of cryptography.

It turns out that the Mersenne Twister 19937 PRNG is not designed with that in mind. Which means that we can clone the exact state (and sequence) of a generator with a sample of 624 consecutive outputs from it.

The challenge itself gives us hints on how to do it.

The internal state of the generator is kept in an array with 624 slots. If we manage to re-create it as a whole, we can clone the generator.

The output from the generator, though, does not pick values directly from this state array, because it does a transformation:

      my $y = $state->[$index];
      $y = $y ^ (($y >> $u) & $d);
      $y = $y ^ (($y << $s) & $b);
      $y = $y ^ (($y << $t) & $c);
      $y = $y ^ ($y >> $l);
      ++$index;
      return $y & $wmask;

It turns out that all this shifting and XORing is not losing any information, i.e. each step is invertible (except, possibly, the final & that takes care to restrict the output to the required number of bits). As some operations shift to the right, and others shift to the left, we will call this inverse functions un_right and un_left respectively:

#!/usr/bin/env perl
use v5.24;
use warnings;
use experimental 'signatures';
no warnings 'experimental::signatures';

use CryptoPals qw< mersenne_twister_factory mt19937_factory >;

my $orig = mt19937_factory(time());
say 'collecting...';
my @collected = map { $orig->() } 1 .. 624;
say 'done';

say 'cloning...';
my $clone = reproduce_mt19937_factory(\@collected);
say 'done';

for (1 .. 1000) {
   my $o = $orig->();
   my $c = $clone->();
   my $d = $o - $c and die "$o ~ $c\n";
   say sprintf "%4d. %d - %d", $_, $o, $c;
}

sub reproduce_mt_factory (%args) {
   my ($w, $n, $m, $r, $a, $u, $d, $s, $b, $t, $c, $l, $f, $list) =
     @args{qw< w n m r a u d s b t c l f list >};

   my @MT;
   for ($list->@*) {
      my $y = $_;
      $y = un_right($y, $l);
      $y = un_left($y, $t, $c);
      $y = un_left($y, $s, $b);
      $y = un_right($y, $u, $d);
      push @MT, $y;
   }

   return mersenne_twister_factory(%args, state => \@MT);
}

sub reproduce_mt19937_factory ($list) {
   return reproduce_mt_factory(
      w    => 32,
      n    => 624,
      m    => 397,
      r    => 31,
      a    => 0x9908B0DF,
      u    => 11,
      d    => 0xFFFFFFFF,
      s    => 7,
      b    => 0x9D2C5680,
      t    => 15,
      c    => 0xEFC60000,
      l    => 18,
      f    => 1812433253,
      list => $list,
   );
}

sub un_right ($x, $l, $emask = undef, $size = 32) {
   my $mask = 0;
   $mask = ($mask << 1) | 0x01 for 1 .. $l;
   $mask <<= $size - $l;
   my $full_mask = 0;
   $full_mask = ($full_mask << 1) | 0x01 for 1 .. $size;
   $emask //= $full_mask;
   my $y = 0;
   my $previous = 0;
   while ($mask) {
      my $chunk = ($x & $mask) ^ ($previous & $emask);
      $y |= $chunk;
      $previous = $chunk >> $l;
      $mask >>= $l;
   }
   return $y;
}

sub un_left ($x, $l, $emask = undef, $size = 32) {
   my $mask = 0;
   $mask = ($mask << 1) | 0x01 for 1 .. $l;
   my $full_mask = 0;
   $full_mask = ($full_mask << 1) | 0x01 for 1 .. $size;
   $emask //= $full_mask;
   my $y = 0;
   my $previous = 0;
   while ($mask) {
      my $chunk = ($x & $mask) ^ ($previous & $emask);
      $y |= $chunk;
      $previous = ($chunk << $l) & $full_mask;
      $mask = ($mask << $l) & $full_mask;
   }
   return $y;
}

The key to both function is that in both shifts some bits are left untouched. As an example, in a shift-left inversion by two bits, the lower two bits are left untouched (shifing left means inserting 0 values at the bottom) and can be used to do the XOR again and again, piecewise. The same works for a shift-right inversion, only this time it will beht upper bits to be left unchanged.

As you might have gussed by now, it works! So, although getting 624 consecutive values might not be trivial, it seems definitely doable and this makes MT19937 a very bad candidate for usage in encryption.

Right? RIGHT?!?

Stay safe and secure!


Comments? Octodon, , GitHub, Reddit, or drop me a line!