TL;DR

On with TASK #2 from the Perl Weekly Challenge #093. Enjoy!

The challenge

You are given binary tree containing numbers 0-9 only. Write a script to sum all possible paths from root to leaf.

The challenge provides a couple of examples to make it clearer what it means that we are given a binary tree. These are two examples:

     1
    /
   2
  / \
 3   4
     1
    / \
   2   3
  /   / \
 4   5   6

The questions

One thing that pops up is that any tree with a depth of three or more cannot be complete if we have to stick with the example pictures we have. E.g. the second example clearly shows that 5 can only be connected to either 2 or 3, so the other one cannot have two children. C’est la vie, I guess, or a more complicated example would be needed.

So… we will move on with a few assumptions:

  • starting from the top, lines alternate between values and connectors
  • only diagonal movements are allowed, to nearby cells

The solution

I initially thought to transform the string into a tree, then run a visit over the tree and calculate the sum.

Then I thought… why bother? Let’s calculate the result on the way!

As it often happens with tree visits, a recursive function can save a lot of time and mental energy. So we first introduce a little wrapper to help us with parameter unpacking and setup:

sub sum_path ($input) {
   my @rows = map { [ split m{}mxs ] } split m{\n}mxs, $input;
   my $root = 0;
   $root++ while $rows[0][$root] eq ' ';
   return _sum_path_r(\@rows, 0, $root, 0);
}

The input is turned into a sequence of rows, alternating values and connectors. Each row is further split into single characters.

Then, we look for the position of the root node, using $root as an index through the first (top) row.

At this point, we’re ready to call the recursive function, with the following parameters:

  • a reference to the whole field;
  • the index of the current row (we start at 0, of course);
  • the index of the current column (we start where we found $root);
  • the sum so far from the parent, which starts at 0 because there are no parents at this stage.

The recursive function is the following:

sub _sum_path_r($rows, $rid, $cid, $parent) {
   my $so_far = $parent + $rows->[$rid][$cid];
   my $sub_sum = 0;
   if ($rid < $#$rows) { # there can be something more
      $rid++;
      $sub_sum += _sum_path_r($rows, $rid + 1, $cid - 2, $so_far)
         if $cid > 0 && $rows->[$rid][$cid - 1] ne ' ';
      $sub_sum += _sum_path_r($rows, $rid + 1, $cid + 2, $so_far)
         if $cid < $#{$rows->[$rid]} && $rows->[$rid][$cid + 1] ne ' ';
   }
   return $sub_sum || $so_far;
}

Variable $so_far keeps track of the sum… so far, including the node we are on. For this reason, it sums whatever comes from the parent and the value we are currently at.

Then we have to figure out whether we have children nodes or not. If we do have children, we will amass the sums coming from them into $sub_sum; otherwise, we will just use $so_far. This explains the otherwise weird return statement at the end.

If there are additional rows, we just check whether the current node has children and, where it has any, we recurse, moving ahead in the rows and selecting the right column.

I guess it’s all for today… good bye and stay safe!