Cryptopals 29 - Break a SHA-1 keyed MAC using length extension

TL;DR

Challenge 29 in Cryptopals.

So far we learned that we need to add authenticators to our messages, so that they cannot be easily manipulated by an attacker, e.g. to gain extended permissions. Such an authenticator also go by the name of MAC (Message Authentication Code).

One way to build a MAC-generating function might be using a hash function together with a secret key. The way we investigate in this challenge is the one from the previous challenge, i.e.:

MAC(message) = SHA1(key || message)

If the adversary does not know the key, they surely can’t generate a valid MAC, right? RIGHT?!?

Well, no so fast!

As anything in cryptography, the devil is in the details. Surely an attacker cannot generate a valid MAC for any random message, but it turns out that they can generate a valid MAC for a new message that is an extension of another message for which they have a valid MAC (e.g. because it was provided by… us).

Let’s assume that we’re a plain user of a website and we get this JSON back after logging in:

{
    "Data":
"comment1=cooking%20MCs;userdata=foo;comment2=%20like%20a%20pound%20of%20bacon",

    "Authenticator": "60924f646f78bca39e70309a9333cb1669a7919e"
}

This is the situation: we have a message (the Data) and we have a valid MAC for that message (the Authenticator).

Every following time we go to the server, we have to provide the Data and the Authenticator: it will validate them (thanks to the knowledge of the secret key) and grant us permissions based on what’s in the Data.

One characteristic of SHA-1 is that it can be easily used incrementally. In other terms, it’s possible to feed data by chunks, which is very convenient because it allows calculating the digest of very big files without the need to keep them entirely in memory at the same time.

After we have fed all the data, to get the hash value back the algorithm “closes” the stream by appending some data (a padding) and doing the final calculations upon these added data.

In other terms, we have this:

SHA1(message) = SHA1_OPERATIONS(message || padding)

The padding is no secret, because it’s determined by the length of the message (in bits) alone. This equivalence can be abused to calculate another valid MAC starting from the one we have:

SHA1(message || padding || sneaked-data) =
    SHA1_OPERATIONS(message || padding || sneaked-data || new-padding)

As the SHA1_OPERATIONS proceed incrementally, we can use the available valid MAC to pre-warm our SHA1 calculator by using the MAC directly:

SHA1(message || padding || sneaked-data) =
    SHA1_STARTING_FROM(valid-prefix-MAC, sneaked-data)

This new digest, then, is for the forged message message || padding || sneaked-data, which we need to provide to the server together with our newly calculated SHA1 digest. This will require some trial and error, because we don’t know beforehand the length of the secret key, so we cannot be sure about what to put in the padding part.

Well, this is what iteration is for.

Let’s move on to the code. First, we have extended the My::SHA1 class from the previous challenge to include:

a way to easily “warm up” the object, starting from a starter value that is just a previously calculated digest:

sub new ($package, %args) {
   my $self = bless {
      h0 => 0x67452301,
      h1 => 0xEFCDAB89,
      h2 => 0x98BADCFE,
      h3 => 0x10325476,
      h4 => 0xC3D2E1F0,
      ml => 0,           # message length, in bits
      left => '',        # leftover not reaching 512 bytes
      %args, # this can override anything
   }, ref($package) || $package;
   if (defined(my $starter = delete $self->{starter})) {
      $self->@{qw< h0 h1 h2 h3 h4 >} = unpack 'N5', pack 'H*', $starter;
   }
   return $self;
}

a way to easily calculate the padding externally, possibly by just providing a length as input (this allows the method to be called as a class method):

sub padding ($self, $length = undef) {
   $length //= $self->{ml};
   my $l512 = (1 + $length) % 512;
   my $n_zeros = 448 - $l512 + ($l512 <= 448 ? 0 : 512);
   return join '', "\x80", "\x00" x ($n_zeros / 8),
      pack 'N2', $length >> 32, $length & 0xFFFFFFFF;
}

This will come handy in our attack crafting.

The server side is simulated like this:

# This is what we have access to, granted by the "server" and provided
# back to us
my $original_permissions = 'comment1=cooking%20MCs;userdata=foo;' .
   'comment2=%20like%20a%20pound%20of%20bacon';
my $original_mac = SHA1_MAC_ps_generate($original_permissions);

sub SHA1_MAC_prefix_secret ($key, $message) {
   return My::SHA1->new->add($key, $message)->hex_digest;
}

sub SHA1_MAC_ps_generate ($message) {
   return SHA1_MAC_prefix_secret(the_key(), $message);
}

sub SHA1_MAC_ps_check ($message, $authenticator) {
   return SHA1_MAC_ps_generate($message) eq $authenticator;
}

sub the_key() { state $key = random_text_word() }

The $original_permissions and $original_mac are the Data and the Authenticator, respectively. The other function that we will be allowed to use as attackers is SHA1_MAC_ps_check, which simulates our attempt to provide a pair of forged $message and $authenticator to sneak in additional permissions.

Now, the attack itself. As we said, we don’t know beforehand the length of the secret, so we will have to try until we succeed, i.e. until SHA1_MAC_ps_check gives us the green light:

# This is what we want to append
my $sneaked_permission = ';admin=true';

# Now we "just" have to try out different secret key lengths
my $original_length = length $original_permissions;
my $key_length = 0;

while ('necessary') {
   my $length_so_far = $key_length + $original_length;
   my $glue_padding = My::SHA1->padding($length_so_far * 8);
   $length_so_far += length $glue_padding;

   # Let's "extend" the MAC we got
   my $forger =
      My::SHA1->new(starter => $original_mac, ml => $length_so_far * 8);
   my $forged_mac = $forger->add($sneaked_permission)->hex_digest;

   # This is the corresponding full permissions we're forging
   my $forged_permissions =
      $original_permissions . $glue_padding . $sneaked_permission;

   last if SHA1_MAC_ps_check($forged_permissions, $forged_mac);
   ++$key_length;
}

say "We're in! Secret key length: $key_length "
   . "(pssst! key was '@{[ the_key() ]}', but we didn't need it!)";

Depending on our guess about the secret key length, we have a different assumption about what was the digested $length_so_far. For the forged MAC, we have to consider the padding part of the digested data too, which is why we also add the length of the $glue_padding.

As anticipated, we’re using our My::SHA1 class to pre-warm the digest calculation with the previous, valid MAC, as well as our guess on how much data was digested so far (the initializaiton for the ml parameters). This gives us $forged_mac by just adding our $sneaked_permissions to the data (calling method hex_digest will add the new padding too).

The last thing we have to calculate is the forged Data, which is easy because we have our guess on the $glue_padding for any specific $key_length we are trying out. This gives us $forged_permissions.

With our candidate $forged_permissions and $forged_mac we try to get in. If our guess of the $key_length is right… we’re in!

Stay safe and secure!

ETOOBUSY 🚀 minimal blogging for the impatient