ETOOBUSY 🚀 minimal blogging for the impatient
Depth First Visit of a Graph
TL;DR
One fundamental #algorithm for visiting graphs.
One #algorithm implemented in cglib-perl is the depth-first visit of a graph. The implementation leverages the graph representation explained in the previous post Generic Graph Representation, i.e. nodes are considered (mostly) opaque scalars and the relationships between nodes are encapsulated in a function.
Implementation
This implementation is… dense (see DepthFirstVisit.pm):
1 sub depth_first_visit {
2 my %args = (@_ && ref($_[0])) ? %{$_[0]} : @_;
3 my @reqs = qw< start successors >;
4 exists($args{$_}) || die "missing parameter '$_'" for @reqs;
5 my ($start, $succs) = @args{@reqs};
6 my $id_of = $args{identifier} || sub { return "$_[0]" };
7 my $pre_action = $args{pre_action} || undef;
8 my $post_action = $args{post_action} || undef;
9 my $skip_action = $args{skip_action} || undef;
10 my %adjacents = ($id_of->($start) => [$succs->($start)]);
11 my @stack = ([$start, undef]);
12 $pre_action->($start, undef) if $pre_action;
13 while (@stack) {
14 my ($v, $pred) = @{$stack[-1]}; # "peek"
15 my $vid = $id_of->($v);
16 if (@{$adjacents{$vid}}) {
17 my $w = shift @{$adjacents{$vid}};
18 my $wid = $id_of->($w);
19 if (exists $adjacents{$wid}) { # already visited
20 $skip_action->($w, $v) if $skip_action;
21 }
22 else { # new node to be visited
23 $adjacents{$wid} = [$succs->($w)];
24 push @stack, [$w, $v];
25 $pre_action->($w, $v) if $pre_action;
26 }
27 }
28 else {
29 $post_action->($v, $pred) if $post_action;
30 pop @stack;
31 } # finished with this frame
32 }
33 return unless defined wantarray; # don't bother with void context
34 return keys %adjacents if wantarray;
35 return [keys %adjacents] if defined wantarray;
36 }
The function assumes that it will receive key-value pairs, either in a list
or in a hash reference. Line 2 takes care to normalize the inputs into
%args
.
Lines 3 and 4 validate the input parameters: start
and successors
are
mandatory and the function will complain if they are not present in %args
.
Lines 5 to 9 get the inputs or set defaults for the relevant actors in the algorithm. During the visit, three actions can be performed:
pre_action
happens on a node as soon as it is discovered/visited (the two concepts overlap because this function implements a depth-first visit of the graph);skip_action
happens when an already visited node is discovered again, should you need to track this;post_action
happens when the algorithm is about to leave a node.
All nodes are tracked by their identifier. By default, the stringification of the node is the identifier (line 6), but of course you can provide your own action (e.g. to extract a field in a hash).
Hash %adjacents
is used to track the adjacencies of the visited nodes and
doubles down to make sure that we don’t visit the same node multiple times.
It is initialized with the starting node (line 10), which is also put in the
@stack
that is used for the visit (line 11) and of course passed to
pre_action
if defined.
Variable @stack
helps track the visit to the graph without resorting to
recursion. As long as there are items, the top one will be taken and worked
on. Items inside are anonymous arrays that contain two items: a node and,
when possible, the node from where it was discovered. For this reason, the
very first pair has undef
in this second position (line 11).
During the iteration, the last (i.e. top) element is considered, while
still keeping it in the stack (line 14). In a depth-first visit, we only
get rid of an item when we are finished with all its adjacencies, i.e. it
has to be kept in the stack for all that time. This gives us $v
(the
node we are visiting) and $pred
(the node from which we discovered $v
in the first place, if defined).
Tracking of adjacencies is performed through %adjacents
, indexed through
node identifiers as computed by $id_of
(line 10 and line 15) and
containing a list of adjacents for each node, that is progressively consumed
until it’s empty. When this happens, it means that all adjacents for a
node have been visited, and we can get rid of the node itself.
For this reason, the test in line 16 checks for the number of adjacents
present for identifier $vid
: if there are still nodes in the array, then
the first adjacent is considered (line 17) and acted upon: its identifier is
computed (line 18), then it is checked for already having been considered as
an adjacent from another node (line 19) and, in case, the skip_action
is
triggered. If not already discovered, then it is added to %adjacents
with
a list of successors, and a new element is pushed on the stack for the next
iteration. At this time, the pre_action
is fired, because the node has
just been discovered.
When the list of adjacents for a node recorded in %adjacents
is completely
emptied, we enter the else
branch in line 28. Here, it’s time to get rid
of the node $v
: the post_action
is executed (if any), and the item is
definitely removed from @stack
. This node $v
will never enter the
@queue
again, anyway, because it is still associated with an (empty)
anonymous array in %adjacents
.
When the loop is over, pre_action
, skip_action
, and post_action
have
been called in due time. The function will anyway return a list of visited
nodes, either as a plain list (line 34) or as an anonymous array (line 35).
Documentation
You don’t have to re-read through the explanation of how it works to be
able and use depth_first_search
. There’s some documentation in
DepthFirstVisit.pod, with an hopefully clear SYNOPSIS section that
should get you started in no time.
Conclusion
I don’t know how many times I’ve used DepthFirstVisit.pm, copying and pasting in CodinGame… but I know it’s been more than once. If you’re looking for something similar… be my guest!