TL;DR

Let’s continue our journey through graph algorithms and see a compact implementation of the Dijkstra’s algorithm!

You can take a look to the [Wikipedia][] page on Dijkstra’s algorithm to know more. Here we take a look at an implementation in Perl that is very compact… and part of the CodinGame library cglib. Let’s just say that here we implement the single source variant, i.e. fixes one node in the graph as the source and computes the best paths towards each other reachable node.

Remember: cglib is optimized for code compactness, not much for readability 😇

The implementation is in Dijkstra.pm, and leverages the same graph representation that we already discussed in the previous post Generic Graph Representation: nodes are considered (mostly) opaque scalars and the relationships between nodes are encapsulated in a function.

Using the results

The implementation pre-computes everything and keeps the results in an object that is the main entry point, so let’s see that first:

package Dijkstra; # repetita juvant... especially with cut-and-paste
use strict;

sub path_to {
   my ($self, $v) = @_;
   my $vid = $self->{id}->($v);
   my $thr = $self->{t}{$vid} || return;    # connected?

   my @retval;
   while ($v) {
      unshift @retval, $v;
      ($v, $vid) = @{$thr}{qw< p pid >};
      $thr = $self->{t}{$vid};
   }

   return wantarray ? @retval : \@retval;
} ## end sub path_to

sub distance_to { return ($_[0]{t}{$_[0]{id}->($_[1])} || {})->{d} }

The interface provides two functions:

path_to: returns the sequence of nodes to go from the source to a specific node (provided as input);
distance_to: returns the minimum distance between the source and the specific node (provided as input).

There is no new function because… it’s not necessary.

Main function

Here’s the implementation of the single source variant of the algorithm:

sub dijkstra {
   my %args = (@_ && ref($_[0])) ? %{$_[0]} : @_;
   my @reqs = qw< start distance successors >;
   exists($args{$_}) || die "missing parameter '$_'" for @reqs;
   my ($start, $dist, $succs) = @args{@reqs};
   my $id_of = $args{identifier} || sub { return "$_[0]" };
   my %is_goal = map { $id_of->($_) => 1 } @{$args{goals} || []};
   my $on_goal = scalar(keys %is_goal) ? $args{on_goal} || sub {
       delete $is_goal{$_[0]};
       return scalar keys %is_goal;
   } : undef;
  
   my $id      = $id_of->($start);
   my $queue   = PriorityQueue->new(
       before => sub { $_[0]{d} < $_[1]{d} },
       id_of  => sub { return $_[0]{id} },
       items => [{v => $start, id => $id, d => 0}],
   );
  
   my %thread_to = ($id => {d => 0, p => undef, pid => $id});
   while (!$queue->is_empty) {
       my ($ug, $uid, $ud) = @{$queue->dequeue}{qw< v id d >};
       last if $on_goal && $is_goal{$uid} && (!$on_goal->($uid));
       for my $vg ($succs->($ug)) {
           my ($vid, $alt) = ($id_of->($vg), $ud + $dist->($ug, $vg));
           $queue->contains_id($vid)
           ? ($alt >= ($thread_to{$vid}{d} //= $alt + 1))
           : exists($thread_to{$vid})
           and next;
           $queue->enqueue({v => $vg, id => $vid, d => $alt});
           $thread_to{$vid} = {d => $alt, p => $ug, pid => $uid};
       } ## end for my $vg ($succs->($ug...))
   } ## end while (!$queue->is_empty)
  
   return bless {t => \%thread_to, id => $id_of, s => $start}, 'Dijkstra';
} ## end sub dijkstra

As anticipated, a Dijkstra object is returned: here we see that new is necessary because we use bless directly (line 35).

Input Parameters (lines 2 .. 11)

The function requires three parameters:

start: the source node
distance: a function that takes two adjacent nodes and returns the distance between them (or, if you want, the weight of the arc between them);
successors: a function that takes a node and returns a list of nodes that are adjacent to it (see Generic Graph Representation).

A parameter id allows specifying an identifier function, which defaults to the stringification of the node. If provided, it should give a unique identifier for the node. Heh.

This implementation also allows to avoid computing the whole source to other nodes paths, and only concentrate on a list of goals (parameter goal). You can also pass a function on_goal that will be called each time a goal is reached (it is passed to the function). This function is also supposed to return a false value when the last needed goal is reached (allows stopping the algorithm early).

Preparation (lines 13 .. 20)

Variable $id (line 13) is a convenience to make code more compact.

Dijkstra’s algorithm leverages a best-first approach to visiting the graph: this is how it ensures that the first time a node is visited, it is also through an optimal path.

As often with best-first algorithm, it’s necessary to efficiently assess which node is… best, which is why this implementation leverages a priority queue, implemented in PriorityQueue.pm (lines 14 .. 18).

Last, hash %thread_to keeps track of the best path from the source node to any other specific node. Its keys are identifiers, and its values are anonymous hashes with the following keys:

d: the (best/minimum) distance to the source node;
p: when it makes sense, the parent node of a node, i.e. the best hop from a specific node towards the source;
pid: the parent’s identifier (it’s set to the source node id for the source, just to silence path_to).

Algorithm execution (lines 21 .. 33)

Now that all preparations are done, it’s time to run the algorithm!

Like many best-first algorithm (or in general graph-visiting algorithms), the priority queue is explored until it’s empty (line 21).

Items coming out of the queue are anonymous hashes with three keys:

v: the node (opaque object in the graph)
id: identifier for the node
d: distance to the source node.

The current best node is called u, so theese values are saved into u-prefixed variables (line 22).

Line 23 checks if we have some goals set and in case does early interruption of the loop.

Lines 24 .. 32 iterate over all neighbors of u, which we will call v. We calculate its identifier $vid and the possible distance arriving from u as $alt - this is not the real distance, because that will only be available when the node is extracted from the priority queue. The node v, anyway, might already be in the queue, so we test for it (line 26) and act accordingly:

if it’s present in the queue, we have to check if $alt is better or worse than what’s already there, and skip otherwise (line 27 with line 29)
otherwise, we have to skip considering it if we already visited it previously (line 28 with line 29).

If we gest past line 29, then $alt is indeed the best that we can do to go from the source to v, so we enqueue it (line 30). This also works if the node is already in the queue, because it will be updated. Line 31 just records the path, taking note of the best parent.

Summing up

Dijkstra’s algorithm is amazing, I hope you enjoy this compact implementation!

ETOOBUSY 🚀 minimal blogging for the impatient

Dijkstra's Algorithm