ETOOBUSY 🚀 minimal blogging for the impatient
Dijkstra's Algorithm
TL;DR
Let’s continue our journey through graph algorithms and see a compact implementation of the Dijkstra’s algorithm!
You can take a look to the [Wikipedia][] page on Dijkstra’s algorithm to know more. Here we take a look at an implementation in Perl that is very compact… and part of the CodinGame library cglib. Let’s just say that here we implement the single source variant, i.e. fixes one node in the graph as the source and computes the best paths towards each other reachable node.
Remember: cglib is optimized for code compactness, not much for readability 😇
The implementation is in Dijkstra.pm, and leverages the same graph representation that we already discussed in the previous post Generic Graph Representation: nodes are considered (mostly) opaque scalars and the relationships between nodes are encapsulated in a function.
Using the results
The implementation pre-computes everything and keeps the results in an object that is the main entry point, so let’s see that first:
package Dijkstra; # repetita juvant... especially with cut-and-paste
use strict;
sub path_to {
my ($self, $v) = @_;
my $vid = $self->{id}->($v);
my $thr = $self->{t}{$vid} || return; # connected?
my @retval;
while ($v) {
unshift @retval, $v;
($v, $vid) = @{$thr}{qw< p pid >};
$thr = $self->{t}{$vid};
}
return wantarray ? @retval : \@retval;
} ## end sub path_to
sub distance_to { return ($_[0]{t}{$_[0]{id}->($_[1])} || {})->{d} }
The interface provides two functions:
path_to
: returns the sequence of nodes to go from the source to a specific node (provided as input);distance_to
: returns the minimum distance between the source and the specific node (provided as input).
There is no new
function because… it’s not necessary.
Main function
Here’s the implementation of the single source variant of the algorithm:
1 sub dijkstra {
2 my %args = (@_ && ref($_[0])) ? %{$_[0]} : @_;
3 my @reqs = qw< start distance successors >;
4 exists($args{$_}) || die "missing parameter '$_'" for @reqs;
5 my ($start, $dist, $succs) = @args{@reqs};
6 my $id_of = $args{identifier} || sub { return "$_[0]" };
7 my %is_goal = map { $id_of->($_) => 1 } @{$args{goals} || []};
8 my $on_goal = scalar(keys %is_goal) ? $args{on_goal} || sub {
9 delete $is_goal{$_[0]};
10 return scalar keys %is_goal;
11 } : undef;
12
13 my $id = $id_of->($start);
14 my $queue = PriorityQueue->new(
15 before => sub { $_[0]{d} < $_[1]{d} },
16 id_of => sub { return $_[0]{id} },
17 items => [{v => $start, id => $id, d => 0}],
18 );
19
20 my %thread_to = ($id => {d => 0, p => undef, pid => $id});
21 while (!$queue->is_empty) {
22 my ($ug, $uid, $ud) = @{$queue->dequeue}{qw< v id d >};
23 last if $on_goal && $is_goal{$uid} && (!$on_goal->($uid));
24 for my $vg ($succs->($ug)) {
25 my ($vid, $alt) = ($id_of->($vg), $ud + $dist->($ug, $vg));
26 $queue->contains_id($vid)
27 ? ($alt >= ($thread_to{$vid}{d} //= $alt + 1))
28 : exists($thread_to{$vid})
29 and next;
30 $queue->enqueue({v => $vg, id => $vid, d => $alt});
31 $thread_to{$vid} = {d => $alt, p => $ug, pid => $uid};
32 } ## end for my $vg ($succs->($ug...))
33 } ## end while (!$queue->is_empty)
34
35 return bless {t => \%thread_to, id => $id_of, s => $start}, 'Dijkstra';
36 } ## end sub dijkstra
As anticipated, a Dijkstra
object is returned: here we see that new
is
necessary because we use bless
directly (line 35).
Input Parameters (lines 2 .. 11)
The function requires three parameters:
start
: the source nodedistance
: a function that takes two adjacent nodes and returns the distance between them (or, if you want, the weight of the arc between them);successors
: a function that takes a node and returns a list of nodes that are adjacent to it (see Generic Graph Representation).
A parameter id
allows specifying an identifier function, which defaults
to the stringification of the node. If provided, it should give a unique
identifier for the node. Heh.
This implementation also allows to avoid computing the whole source to
other nodes paths, and only concentrate on a list of goals (parameter
goal
). You can also pass a function on_goal
that will be called each
time a goal is reached (it is passed to the function). This function is
also supposed to return a false value when the last needed goal is reached
(allows stopping the algorithm early).
Preparation (lines 13 .. 20)
Variable $id
(line 13) is a convenience to make code more compact.
Dijkstra’s algorithm leverages a best-first approach to visiting the graph: this is how it ensures that the first time a node is visited, it is also through an optimal path.
As often with best-first algorithm, it’s necessary to efficiently assess which node is… best, which is why this implementation leverages a priority queue, implemented in PriorityQueue.pm (lines 14 .. 18).
Last, hash %thread_to
keeps track of the best path from the source node
to any other specific node. Its keys are identifiers, and its values are
anonymous hashes with the following keys:
d
: the (best/minimum) distance to the source node;p
: when it makes sense, the parent node of a node, i.e. the best hop from a specific node towards the source;pid
: the parent’s identifier (it’s set to the source node id for the source, just to silencepath_to
).
Algorithm execution (lines 21 .. 33)
Now that all preparations are done, it’s time to run the algorithm!
Like many best-first algorithm (or in general graph-visiting algorithms), the priority queue is explored until it’s empty (line 21).
Items coming out of the queue are anonymous hashes with three keys:
v
: the node (opaque object in the graph)id
: identifier for the noded
: distance to the source node.
The current best node is called u, so theese values are saved into u-prefixed variables (line 22).
Line 23 checks if we have some goals set and in case does early interruption of the loop.
Lines 24 .. 32 iterate over all neighbors of u, which we will call v. We
calculate its identifier $vid
and the possible distance arriving from u
as $alt
- this is not the real distance, because that will only be
available when the node is extracted from the priority queue. The node
v, anyway, might already be in the queue, so we test for it (line 26) and
act accordingly:
- if it’s present in the queue, we have to check if
$alt
is better or worse than what’s already there, and skip otherwise (line 27 with line 29) - otherwise, we have to skip considering it if we already visited it previously (line 28 with line 29).
If we gest past line 29, then $alt
is indeed the best that we can do to go
from the source to v, so we enqueue
it (line 30). This also works if
the node is already in the queue, because it will be updated. Line 31 just
records the path, taking note of the best parent.
Summing up
Dijkstra’s algorithm is amazing, I hope you enjoy this compact implementation!