13 December 2022

Some numbers are special and others are frequent and even

 Special numbers

We are are given a positive integer, $n > 0 and asked to write a script to print the count of all special integers between 1 and $n. An integer is special when all of its digits are unique.

Let's start by thinking of a clever way to quickly identify special numbers in a given range. If you've come up with that - well done! After trying a number of leads I came to the conclusion that none was much better - and most more complicated - than simply testing all the numbers in the range.

How do you test a number for being special? This was my initial thought:

$is_special = 1;
for $digit (0..9) {
    if ($j =~ m|$digit.*$digit|) {
        $is_special = 0;
        last;
    }
}

The regular expression in there says if there is a digit, followed by anything or nothing, followed by the same digit again, then $j is not special.  So we can wrap that in a loop of $j going from 1 to $n and count the specials.  

And that does work. But if $n is large, it takes a while. On my (quite slow) computer, if $n is 1 million, it takes 52 seconds, and for 5 million, 184 seconds.

So I applied a couple of techniques to speed it up.

The first is to note that as we go from 1 to $n we will hit some numbers with trailing zeroes, such as 10, 200, 55000 and so on.   Let's split these numbers into the bit before the zeroes (call it A), and the zeroes themselves (B).

So here is my observation: if A isn't special then we can quickly jump to the number that starts with A + 1 and follows that with B.  So for example when checking 55000, we can skip 55001 to 55999 (because they all start with that unspecial 55) and continue at 56000.  So my loop now looks like this:

$j = 0;
while (1) {
    $j ++;
    last if $j > $test;
    if ($j =~ m|(.+?)(0+)$|) {
        $
j = ($1 + 1) * 10 ** length($2) - 1 if unspecial($1);
    }
    $count ++ unless unspecial($j);
}

So for my 55000 example, the first if splits that into $1 = 55 and $2 = 000, and the next lines jumps $j to 56 * 10^3 - ie 56000 (actually to 55999, because the loop increments $j at the top.) You'll note too that I have taken the test for specialness out to a subroutine, but it's the same method as I showed as my first code example above.

Doing that roughly halves the time: for 1 million it now takes 24 seconds. But that's still quite a long time to wait.  So for my next trick I unfolded the sub unspecial:

sub unspecial {
    # Returns 0 if special, 1 if not

    return 1 if $_[0] =~ m|1.*1|;
    return 1 if $_[0] =~ m|2.*2|;
    return 1 if $_[0] =~ m|3.*3|;
    return 1 if $_[0] =~ m|4.*4|;    
    return 1 if $_[0] =~ m|5.*5|;
    return 1 if $_[0] =~ m|6.*6|;
    return 1 if $_[0] =~ m|7.*7|;
    return 1 if $_[0] =~ m|8.*8|;
    return 1 if $_[0] =~ m|9.*9|;
    return 1 if $_[0] =~ m|0.*0|;

    return 0;
}

Now you might think 'why do that?'   Well, my first answer is that for $n = 5 million, this now takes just 7 seconds as opposed to 184 seconds for the vanilla version, so clearly it's worth it. 

But why that's so much more efficient lies in the guts of Perl, about which I know very little. I am assuming that there is a lot of overhead in working with a regular expression like m|$digit.*$digit| because the regular expression has to be reconsidered 5 million times at run time, whereas in the unfolded version it can be done just once at compile time.

So all in all, a very interesting task.

Frequent and even

I wish my local bus service was more frequent and even, but that's not what we are asked.

Instead the task reads: You are given a list of numbers, @list. Write a script to find most frequent even numbers in the list. If you get more than one number with the same frequency return the smallest. For all other cases, return -1.

So let's pass down the list, ignoring odd numbers and incrementing $freq[$j] when we see an even $j. As we do that, we keep a $max_freq to keep track of the maximum frequency seen, and $max_freq_no being the first number that occurs with that frequency.

Is that all? Well, no, because we can't be sure that we've found the smallest number with that frequency - for example 4, 4, 3, 2, 2 will result in 4 being given as the (wrong) answer. The solution is to sort the list first.

Easy!




No comments:

Post a Comment