## Sunday, April 05, 2015

### Saving Vertical Space

I was reviewing some code I had written for a simple RPG dice algorithm (although there's already a good module for this, Game::Dice) and I realized again that I have a prefererence for functions that can fit on one screen. One strategy is breaking up the code into smaller routines but I sometimes like to compact it vertically as much as possible first.

This function `roll`, given a string of "dice language," should return the results of such a dice roll. An example of this would be "3d10+1" to roll three 10-sided dice and then add 1, or "4d6b3" which says to roll four 6-sided dice and take the best three.

Here's the function before the refactor:

``````sub roll {
my \$input = shift;
die unless \$input =~ /d/;
if ( \$input =~ /(\d*)d(\d+)\s*(\D?)\s*(\d*)/ ) {
my \$num   = \$1 || 1;
my \$die   = \$2;
my \$plus  = \$3;
my \$end   = \$4;
my \$total = 0;
my @dice;
for my \$count ( 1 .. \$num ) {
my \$single = int( rand(\$die) ) + 1;
push @dice, \$single;
print "\$single\n";
}
if ( \$plus eq 'b' ) {
if ( \$end > \$num ) {
\$end = \$num;
}
@dice = sort { \$b <=> \$a } @dice;
\$#dice = \$end - 1;
}
for my \$die (@dice) {
\$total += \$die;
}
if ( \$plus eq '+' ) {
\$total += \$end;
}
elsif ( \$plus eq '-' ) {
\$total -= \$end;
}
elsif ( \$plus eq '*' ) {
\$total *= \$end;
}
elsif ( \$plus eq '/' ) {
\$total /= \$end;
}
return \$total;
}
return;
}
``````

The first thing I did is to delete the first of this pair of lines, which was redundant, because the line that follows also checks the format of the input:

``````die unless \$input =~ /d/;
if ( \$input =~ /(\d*)d(\d+)\s*(\D?)\s*(\d*)/ ) {
``````

But instead of having that big `if` block, I changed it to this:

```return unless \$input =~ /(\d*)d(\d+)\s*(\D?)\s*(\d*)/;
```

Then I combined these:

``````my \$die   = \$2;
my \$plus  = \$3;
my \$end   = \$4;
``````

into this:

```my (\$die,\$plus,\$end) = (\$2,\$3,\$4);
```

Once I decided I didn't need to print each individual die as it was rolled, I could reduce this:

``````for my \$count ( 1 .. \$num ) {
my \$single = int( rand(\$die) ) + 1;
push @dice, \$single;
print "\$single\n";
}
``````

to this:

```push @dice, int(rand(\$die))+1 for ( 1..\$num );
```

Then, I changed this:

``````if ( \$end > \$num ) {
\$end = \$num;
}
``````

To use the postfix `if`:

```\$end =  \$num if \$end > \$num;
```

and this:

``````for my \$die (@dice) {
\$total += \$die;
}
``````

to use postfix `for`:

```\$total += \$_ for @dice;
```

One thing I like to do with an `if/else` chain like this:

``````if ( \$plus eq '+' ) {
\$total += \$end;
}
elsif ( \$plus eq '-' ) {
\$total -= \$end;
}
elsif ( \$plus eq '*' ) {
\$total *= \$end;
}
elsif ( \$plus eq '/' ) {
\$total /= \$end;
}
``````

is to compress it like this:

```if    ( \$plus eq '+' ) { \$total += \$end }
elsif ( \$plus eq '-' ) { \$total -= \$end }
elsif ( \$plus eq '*' ) { \$total *= \$end }
elsif ( \$plus eq '/' ) { \$total /= \$end }
```

Since it's still short in width and the syntax can lined up to be quite readable.

So the final version of the refactored function is:

```sub roll {
my \$input = shift;
return unless \$input =~ /(\d*)d(\d+)\s*(\D?)\s*(\d*)/;
my \$num = \$1 || 1;
my (\$die,\$plus,\$end) = (\$2,\$3,\$4);
my \$total = 0;
my @dice;
push @dice, int(rand(\$die))+1 for ( 1..\$num );
if ( \$plus eq 'b' ) {
\$end =  \$num if \$end > \$num;
@dice = sort { \$b <=> \$a } @dice;
\$#dice = \$end-1;
}
\$total += \$_ for @dice;
if    ( \$plus eq '+' ) { \$total += \$end }
elsif ( \$plus eq '-' ) { \$total -= \$end }
elsif ( \$plus eq '*' ) { \$total *= \$end }
elsif ( \$plus eq '/' ) { \$total /= \$end }
return \$total;
}
```

Now you can make things a lot smaller (see Perl Golf examples) but readability is important to me, and I think this is arguably as readable as the original. I was actually a little surprised that perltidy barely touched the `if/elsif` structure, just screwing up the alignment a little on the first line:

``````if ( \$plus eq '+' ) { \$total += \$end }
elsif ( \$plus eq '-' ) { \$total -= \$end }
elsif ( \$plus eq '*' ) { \$total *= \$end }
elsif ( \$plus eq '/' ) { \$total /= \$end }
``````

The code doesn't strictly adhere to Perl Best Practices, which is something I like to use as a guide for the most part, but perlcritic (which is based on Perl Best Practices) doesn't start to complain until the `cruel` setting, then bringing up things like postfix `if`, postfix `for`, and `unless.`

How would you make it smaller while still maintaining readability?

Faelin said...

Well, one could easily imagine a simple switch statement (or the experimental "given" method):

use switch;
switch (\$plus) {
case "+" { ... }
case "- " { ... }
case "*" { ... }
case "/" { ... }
}

Similarly, one could chain ternary operators:

\$plus eq "+" ? \$total += \$end :
\$plus eq "-" ? \$total -= \$end :
\$plus eq "*" ? \$total *= \$end :
\$plus eq "/" ? \$total /= \$end :
\$total = \$total;

or just use an even shorter eval statement:

eval "\\$total \$plus= \\$end");
//which evals as, for example, "\$total += \$end"

As an aside, it's worth noting that the variable in question should really be written as "op" or "operator", rather than "plus", to avoid confusion.

Anonymous said...

You could always combine the assignment to \$num, \$die, \$plus and \$end with the regex check so unless my (\$num, \$die, \$plus, \$end) = \$input =~ etc, and then add \$num ||= 1 in the block

Unknown said...

You can save another line by combining the regex success test and capture assignment into one statement.

return unless my (\$num,\$die,\$plus,\$end) = \$input =~ /(\d*)d(\d+)\s*(\D?)\s*(\d*)/;
\$num ||= 1;

Unknown said...

Assign to your variables in the unless:
return
unless my (\$num, \$die, \$plus, \$end) = \$input =~ /(\d*)d(\d+)\s*(\D?)\s*(\d*)/;
\$num ||= 1;

change the my @dice; push @dice, lines into a single line with a map:
my @dice = map { (int rand \$die)+1 } (1..\$num);

Anonymous said...

This is a good example of code that is rather hard to improve. It cannot be easily broken down into further subs, yet you know it is too long and could be improved. Of course, there is trickery that will allow you to shave off another line here or there, but I, as you said, Perl Golf is not the point.

Maybe sometimes we just have to live with subs that seem to be too long? The worst part about the original, IMHO, was the waste of horizontal space with some 90% of that one sub being indented twice.

James Smith (Sanger Institute Core Software Services Team) said...

You can reduce it further by:
* combining the assignments
* rewriting the first if block in one line
(note you don't need to sort if num < end anyway)
* you don't need the nasty if else if you make each line a return if!

OK - so the first and second lines are quite nasty and it may not be quite so readable...

sub roll {
return unless \$_[0] =~ /(\d*)d(\d+)\s*(\D?)\s*(\d*)/;
my (\$total,\$num,\$plus,\$end,@dice) = (0,\$1||1,\$3,\$4,map { 1 + int rand \$2 } 1..(\$1||1) );
@dice = splice @{[sort { \$b <=> \$a } @dice]},0,\$end if \$plus eq 'b' and \$end < \$num;
\$total += \$_ for @dice;
return \$total + \$end if \$plus eq '+';
return \$total - \$end if \$plus eq '-';
return \$total * \$end if \$plus eq '*';
return \$total / \$end if \$plus eq '/';
return \$total;
}

James Smith (Sanger Institute Core Software Services Team) said...

I suppose the perl golf soln is along the lines of... it does do a bit more work - but!!

sub roll {
return 0 unless \$_[0] =~ /(\d*)d(\d+)\s*(\D?)\s*(\d*)/;
my (\$t,@d) = (0,sort{\$b<=>\$a}map{1+int rand \$2}1..(\$1||1));
splice @d,\$4||1 if \$3 eq 'b' and \$4||1 < \$1||1;
\$t += \$_ for @d;
return \$t + \$4 if \$3 eq '+';
return \$t - \$4 if \$3 eq '-';
return \$t * \$4 if \$3 eq '*';
return \$t / \$4 if \$3 eq '/';
return \$t;
}

E. Choroba said...

Another 2 lines to 1 line reduction:

@dice = (sort { \$b <=> \$a } @dice)[0 .. \$end - 1];

Anonymous said...

What's described here is not really what I'd consider refactoring, this is just a style cleanup. A refactoring would involve splitting the function up into smaller functions and probably adding tests before doing so to make sure the functionality hasn't changed. Which it has, it no longer die's for an invalid parameter. That could change the operation of the surrounding program.

Anonymous said...
This comment has been removed by a blog administrator.
Will said...

There's a limit to readability. Many of the suggestions in the comments would make the code too unreadable IMHO