Cryptopals 22 - Crack an MT19937 seed


Challenge 22 in Cryptopals.

This challenge underlines an aspect of pseudo-random number generators, i.e. that we can use them lightly only when implementing some game without too many pretenses, and that any usage for cryptographic reasons MUST take into account a lot more.

The idea behind using a pseudo-random generator for cryptographic reasons - in particular, for encrypting stuff - is simple: if our sequence is sufficiently “random”, we can XOR it with our plaintext and obtain a ciphertext that is virtually indistinguishable from randomness.

In this case, the shared secret between the two parties would be the seed to get the pseudo-random sequence started. Otherwise, the ciphertext would not be recoverable!

When implementing a simple game with an element of randomness, it’s very tempting to seed the generator with the current epoch. Assuming that a game will last more than one second, this provides a fair source of randomness for e.g. moving adversaries or rolling dice.

Doing the same for cryptographic reasons is, too, very tempting, but at the cost of disastrous results. Let’s make some calculations (without taking into account leap seconds):

  • there are 3600 second in one hour
  • there are 24 hours in a day, i.e. 86400 seconds
  • there are (about) 365 days in a year, i.e. about 31.5 million seconds.

If we think about it, it’s not that much to try all the different epochs in the last year to find out the one that might apply to a naive usage of pseudo-random sequences!

The specific challenge requires us to invest a lot less time, i.e. surely below one hour, by picking a random epoch in a random time range and then requiring us to guess the right seed by just taking the first value emitted by the pseudo-random sequence. As you might have guessed so far, it’s just a matter of trying out all of them.

#!/usr/bin/env perl
use v5.24;
use warnings;
use experimental 'signatures';
no warnings 'experimental::signatures';
use Time::HiRes ();

use CryptoPals 'mt19937_factory';

use constant WMIN => $ENV{WAIT_MIN} // 40;
use constant WMAX => $ENV{WAIT_MAX} // 1000;

say q{OK, let's begin...};

my ($t, $r) = random_int();

say "got $r, let's see...";
my $start = Time::HiRes::time();
my $guessed = find_seed($r);
say sprintf 'took %.2f s', Time::HiRes::time() - $start;

die "no guess worked!\n" unless defined $guessed;
die "sorry, wrong guess <$guessed> against <$t>!\n" unless $guessed == $t;
say "seed was: $guessed";

sub find_seed ($reference) {
   my $start = int(time());
   for (0 .. 100_000_000) {
      my $seed = $start - $_;
      my $first = mt19937_factory($seed)->();
      return $seed if $first eq $reference;

sub random_int {
   my $r = mt19937_factory(my $t = int(time()))->();
   return ($t, $r);

sub random_wait { sleep WMIN + rand(WMAX + 1 - WMIN) }

The random_wait function does the sleeping as requested; it’s possible to override the WMIN and WMAX constant values by setting the respective environment variables WAIT_MIN and WAIT_MAX.

The random_int function takes one output from the MT19937 generator and gives it back, together with the seed. This is returned just to allow for comparison with the one we’re going to search, in a real scenario we would not get it!

After we got our random number, it’s time to call find_seed, which starts looking “in the past” to find the right seed. Which it invariably finds, of course.

For cryptographic reasons, it’s better to stick with what is considered up to date at the time of need (as we will see, it’s possible to “easily” reconstruct a clone of the generator by doing enough queries to the generator itself) and it’s also better to rely on a less predictable source of randomness, e.g. a hash calculated from a passphrase (although it’s probably better to gather quality random data from the computer, if possible).

Stay safe and secure!

Comments? Octodon, Twitter, GitHub, Reddit, or drop me a line!