ETOOBUSY š minimal blogging for the impatient
PWC087 - Largest Rectangle
TL;DR
On with TASK #2 from the Perl Weekly Challenge #087. Enjoy!
The challenge
You are given matrix
m x n
with0
and1
. Write a script to find the largest rectangle containing only1
. Print0
if none found.
The questions
One question for me would be: what about 1 x 1
rectangle? Why are they
forbidden? This bugs me a bit to be honest.
Another observation is that largest is not defined exactly here. Is it by area? What about two rectangles with the same area?
The solution
There must be a smart way to solve this challenge.
But itās late and I need to sleepā¦ so Iāll revert to solution 0, which is to at least get the job done in lower programmer time, at the expense of scalability. Iāll settle with something that is most probably $O(n^4)$, more or less.
Whatās the idea
The idea is: starting from the biggest size of all, look for a matching rectangle. If foundā¦ print it. Otherwise, go on.
Now, thereās a couple of issues here:
- as said in the questions, we hereby declare that largest means largest area;
- we want to examine in decreasing area size in order to get all possible candidates, so we need to keep some kind of ordering of the candidates;
- we want to generate candidates as we exclude bigger ones, not pre-generate all of them beforehand.
One key insight here is that if, at a certain point, we examine a candidate area size deriving from $m \cdot n$ rows times columns, then the two possible further candidates that can spawn from it are $(m - 1) \cdot n$ and $m \cdot (n - 1)$. This does not mean that either one will be the next in line, just that they are the ābest possibleā further candidates that we can make out of it.
This takes care of the last bullet, i.e. generating candidates as we need more (if we find a match, we will not need further candidates). To check the candidates in decreasing order, we will need to keep them somehow sorted. This is a perfect job for a Priority Queue, in particular a Max Priority Queue.
I knew that the time spent on the Algorithms course was not wasted, after all š
Solution sketch
So letās go to the code!
1 sub largest_rectangle ($M) {
2 my $rows = $M->@* or return;
3 my $cols = $M->[0]->@* or return;
4 my $mpq = PriorityQueue->new(before => sub {$_[0]{size} > $_[1]{size}});
5 $mpq->enqueue({rows => $rows, cols => $cols, size => $rows * $cols});
6 my %have_done; # this avoids double searching the same rows x columns
7 while (! $mpq->is_empty) {
8 my ($rl, $cl) = $mpq->dequeue->@{qw< rows cols >};
9 for my $rs (0 .. $rows - $rl) {
10 for my $cs (0 .. $cols - $cl) {
11 my @candidate = ($rs, $rl, $cs, $cl);
12 return \@candidate if is_full_rectangle($M, @candidate);
13 }
14 }
15
16 # insert further candidates, if possible
17 for my $delta ([-1, 0], [0, -1]) {
18 my ($rn, $cn) = ($rl + $delta->[0], $cl + $delta->[1]);
19 next unless $rn * $cn > 1; # no 1x1 apparently!
20 if ($have_done{"$rn-$cn"}) {
21 delete $have_done{"$rn-$cn"}; # spare some memory...
22 }
23 else {
24 $have_done{"$rn-$cn"} = 1;
25 $mpq->enqueue({rows => $rn, cols => $cn, size => $rn * $cn});
26 }
27 }
28 }
29 return; # found nothing, apparently!
30 }
Lines 2 and 3 take care of some corner cases where the Matrix does not exist. Nothing too fancy here.
Line 4 introduces our hero: the Max Priority Queue. The implementation is
taken verbatim from PriorityQueue.pm in my CodinGame library,
and we set the key before
to a function that allows the code to figure out
what should comeā¦ before. In the case of a Max Priority Queue, higher
values of size
have to come before lower ones.
The first candidate in the queue is an attempt to check whether the whole
input matrix complies with the requirement (line 5). This is, of course, the
maximum we can test out of $M
.
The loop in lines 7 through 28 checks out all those sizes, in descending
order (thanks to the priority queue). Line 8 takes the next candidate in
line, and all its possible overlappings onto $M
are tried out (lines 9
through 14). If any of these placements is a matchā¦ we return it straight
away (line 12), otherwise we move on.
Hence, line 17 is only reached if nothing has been found so far; itās time to add more candidates. As discussed, we check one less row and same columns, and one less column and same rows (line 17).
Candidates whose size is too small are discarded (line 19). It beats me why a $1 \cdot 1$ rectangle cannot be considered, but apparently itās not allowed by the examples.
Note one thing: if we start from a $3 \cdot 3$ matrix, there are two ways to
land on $2 \cdot 2$ candidate size: arriving from $3 \cdot 2$ and arriving
from $2 \cdot 3$. To avoid repeating tests twice, then, we keep a guard in
hash %have_done
and skip inserting the candidate if it is already present
(line 20).
If we end up testing everythingā¦ and not finding a solution, we eventually reach line 29 and returnā¦ nothing.
The whole code
As usual, if you want to take a look at all of itā¦ enjoy!
#!/usr/bin/env perl
use 5.024;
use warnings;
use experimental qw< postderef signatures >;
no warnings qw< experimental::postderef experimental::signatures >;
use autodie;
main(shift);
sub main ($filename = undef) {
my $fh =
!defined $filename ? \*DATA
: $filename eq '-' ? \*STDIN
: do { open my $fh, '<', $filename; $fh };
my @matrix;
while (<$fh>) {
my @row = split m{\s+};
push @matrix, \@row;
shift @row; # "["
pop @row; # "]"
} ## end while (<$fh>)
if (my $lr = largest_rectangle(\@matrix)) {
my ($rs, $rl, $cs, $cl) = $lr->@*;
local $, = ' ';
say {*STDOUT} '[', $matrix[$_]->@[$cs .. $cs + $cl - 1], ']'
for $rs .. $rs + $rl - 1;
} ## end if (my $lr = largest_rectangle...)
else {
say {*STDOUT} 0;
}
} ## end sub main ($filename = undef)
sub largest_rectangle ($M) {
my $rows = $M->@* or return;
my $cols = $M->[0]->@* or return;
my $mpq = PriorityQueue->new(before => sub {$_[0]{size} > $_[1]{size}});
$mpq->enqueue({rows => $rows, cols => $cols, size => $rows * $cols});
my %have_done; # this avoids double searching the same rows x columns
while (! $mpq->is_empty) {
my ($rl, $cl) = $mpq->dequeue->@{qw< rows cols >};
for my $rs (0 .. $rows - $rl) {
for my $cs (0 .. $cols - $cl) {
my @candidate = ($rs, $rl, $cs, $cl);
return \@candidate if is_full_rectangle($M, @candidate);
}
}
# insert further candidates, if possible
for my $delta ([-1, 0], [0, -1]) {
my ($rn, $cn) = ($rl + $delta->[0], $cl + $delta->[1]);
next unless $rn * $cn > 1; # no 1x1 apparently!
if ($have_done{"$rn-$cn"}) {
delete $have_done{"$rn-$cn"}; # spare some memory...
}
else {
$have_done{"$rn-$cn"} = 1;
$mpq->enqueue({rows => $rn, cols => $cn, size => $rn * $cn});
}
}
}
return; # found nothing, apparently!
}
sub is_full_rectangle ($M, $rs, $rl, $cs, $cl) {
for my $r ($rs .. $rs + $rl - 1) {
my $Mr = $M->[$r];
for my $c ($cs .. $cs + $cl - 1) {
return unless $Mr->[$c];
}
}
return 1;
}
package PriorityQueue; # Adapted from https://algs4.cs.princeton.edu/24pq/
use strict;
sub contains { return $_[0]->contains_id($_[0]{id_of}->($_[1])) }
sub contains_id { return exists $_[0]{item_of}{$_[1]} }
sub is_empty { return !$#{$_[0]{items}} }
sub item_of { exists($_[0]{item_of}{$_[1]}) ? $_[0]{item_of}{$_[1]} : () }
sub new; # see below
sub dequeue { return $_[0]->_remove_kth(1) }
sub enqueue; # see below
sub remove { return $_[0]->remove_id($_[0]{id_of}->($_[1])) }
sub remove_id { return $_[0]->_remove_kth($_[0]{pos_of}{$_[1]}) }
sub size { return $#{$_[0]{items}} }
sub top { return $_[0]->size ? $_[0]{items}[1] : () }
sub top_id { return $_[0]->size ? $_[0]{id_of}->($_[0]{items}[1]) : () }
sub new {
my $package = shift;
my $self = bless {((@_ && ref($_[0])) ? %{$_[0]} : @_)}, $package;
$self->{before} ||= sub { return $_[0] < $_[1] };
$self->{id_of} ||= sub { return ref($_[0]) ? "$_[0]" : $_[0] };
my $items = $self->{items} || [];
@{$self}{qw< items pos_of item_of >} = (['-'], {}, {});
$self->enqueue($_) for @$items;
return $self;
} ## end sub new
sub enqueue { # insert + update in one... DWIM
my ($is, $id) = ($_[0]{items}, $_[0]{id_of}->($_[1]));
$_[0]{item_of}{$id} = $_[1]; # keep track of this item
my $k = $_[0]{pos_of}{$id} ||= do { push @$is, $_[1]; $#$is };
$_[0]->_adjust($k);
return $id;
} ## end sub enqueue
sub _adjust { # assumption: $k <= $#$is
my ($is, $before, $self, $k) = (@{$_[0]}{qw< items before >}, @_);
$k = $self->_swap(int($k / 2), $k)
while ($k > 1) && $before->($is->[$k], $is->[$k / 2]);
while ((my $j = $k * 2) <= $#$is) {
++$j if ($j < $#$is) && $before->($is->[$j + 1], $is->[$j]);
last if $before->($is->[$k], $is->[$j]); # parent is OK
$k = $self->_swap($j, $k);
}
return $self;
} ## end sub _adjust
sub _remove_kth {
my ($is, $self, $k) = ($_[0]{items}, @_);
die 'no such item' if (!defined $k) || ($k <= 0) || ($k > $#$is);
$self->_swap($k, $#$is);
my $r = CORE::pop @$is;
$self->_adjust($k) if $k <= $#$is; # no adjust for last element
my $id = $self->{id_of}->($r);
delete $self->{$_}{$id} for qw< item_of pos_of >;
return $r;
} ## end sub _remove_kth
sub _swap {
my ($self, $i, $j) = @_;
my ($items, $pos_of, $id_of) = @{$self}{qw< items pos_of id_of >};
my ($I, $J) = @{$items}[$i, $j] = @{$items}[$j, $i];
@{$pos_of}{($id_of->($I), $id_of->($J))} = ($i, $j);
return $i;
} ## end sub _swap
1;
package main;
__DATA__
[ 0 0 0 1 0 0 ]
[ 1 1 1 0 0 0 ]
[ 0 0 1 0 0 1 ]
[ 1 1 1 1 1 0 ]
[ 1 1 1 1 1 0 ]