19 December 2022

Pattern 132 and sequential runs

Pattern 132

We are given a list of integers @list and asked to find a triad i, j, k such that $i < $j < $k, and $list[$i] < $list[$k] < $list[$j]. If none exists we are to say so, and if several exist we are to show the first one we find.

The obvious solution is to iterate over i, j and k in three nested loops, and that will find the answer. For the given examples, it will do so very quickly.

However, I created a 'hard' list comprising 1 .. 10000, 9999. The (only) 132 triad in this list is 1, 10000, 9999, and to do it the obvious way as described above takes 10s of seconds. 

Is there a better way? Given that this has been set as a challenge, the answer must be yes!

First let's note that $list[$j] has to be larger than either $list[$i] or $list[$k]. 

So if we loop $j from 1 to $last - 1, we are looking for a $list[$i] which is less than $list[$j] and occurs where $i < $j. If no such element exists we can move on to the next $j without worrying about $k.

If we do find a possible $list[$i] we then need to see if there is a $list[$k] which is also less than $list[$j] but where $k >$j. If we find one, then we have the solution.

If we still haven't found a solution, then none exists.

For my hard list, this ran in under 10 seconds.

Consecutive array slices

We are given a sorted unique integer array, @array and are asked to find all the slices of this array which comprise consecutive integers, and output the first and last element in each such slice.

Let's iterate through @array and have two persistent variables: $start, the value at the start of  a consecutive subsequence and a state boolean variable $in_slice which indicates whether we are in a consecutive subsequence or not - initialised to false.

So if $j is the index:

  • if not $in_slice, see if $array[$j + 1 == $array[$j] + 1, and if so set $start = $j and set in_slice to true
  • else if $in_slice, see if $array[$j] + 1 == $array[$j] + 1, and if not, record the end of a slice and set in_slice to false

This works fine, except that at the very last array element, $array[$j + 1] won't exist. Obviously we could check for this, but I avoided the issue by adding an element at the end of array equal to the supplied last one + 2, and iterating $j only up to the second last element.

This is a bit messy and I'm looking forward to someone having a more elegant solution.



13 December 2022

Some numbers are special and others are frequent and even

 Special numbers

We are are given a positive integer, $n > 0 and asked to write a script to print the count of all special integers between 1 and $n. An integer is special when all of its digits are unique.

Let's start by thinking of a clever way to quickly identify special numbers in a given range. If you've come up with that - well done! After trying a number of leads I came to the conclusion that none was much better - and most more complicated - than simply testing all the numbers in the range.

How do you test a number for being special? This was my initial thought:

$is_special = 1;
for $digit (0..9) {
    if ($j =~ m|$digit.*$digit|) {
        $is_special = 0;
        last;
    }
}

The regular expression in there says if there is a digit, followed by anything or nothing, followed by the same digit again, then $j is not special.  So we can wrap that in a loop of $j going from 1 to $n and count the specials.  

And that does work. But if $n is large, it takes a while. On my (quite slow) computer, if $n is 1 million, it takes 52 seconds, and for 5 million, 184 seconds.

So I applied a couple of techniques to speed it up.

The first is to note that as we go from 1 to $n we will hit some numbers with trailing zeroes, such as 10, 200, 55000 and so on.   Let's split these numbers into the bit before the zeroes (call it A), and the zeroes themselves (B).

So here is my observation: if A isn't special then we can quickly jump to the number that starts with A + 1 and follows that with B.  So for example when checking 55000, we can skip 55001 to 55999 (because they all start with that unspecial 55) and continue at 56000.  So my loop now looks like this:

$j = 0;
while (1) {
    $j ++;
    last if $j > $test;
    if ($j =~ m|(.+?)(0+)$|) {
        $
j = ($1 + 1) * 10 ** length($2) - 1 if unspecial($1);
    }
    $count ++ unless unspecial($j);
}

So for my 55000 example, the first if splits that into $1 = 55 and $2 = 000, and the next lines jumps $j to 56 * 10^3 - ie 56000 (actually to 55999, because the loop increments $j at the top.) You'll note too that I have taken the test for specialness out to a subroutine, but it's the same method as I showed as my first code example above.

Doing that roughly halves the time: for 1 million it now takes 24 seconds. But that's still quite a long time to wait.  So for my next trick I unfolded the sub unspecial:

sub unspecial {
    # Returns 0 if special, 1 if not

    return 1 if $_[0] =~ m|1.*1|;
    return 1 if $_[0] =~ m|2.*2|;
    return 1 if $_[0] =~ m|3.*3|;
    return 1 if $_[0] =~ m|4.*4|;    
    return 1 if $_[0] =~ m|5.*5|;
    return 1 if $_[0] =~ m|6.*6|;
    return 1 if $_[0] =~ m|7.*7|;
    return 1 if $_[0] =~ m|8.*8|;
    return 1 if $_[0] =~ m|9.*9|;
    return 1 if $_[0] =~ m|0.*0|;

    return 0;
}

Now you might think 'why do that?'   Well, my first answer is that for $n = 5 million, this now takes just 7 seconds as opposed to 184 seconds for the vanilla version, so clearly it's worth it. 

But why that's so much more efficient lies in the guts of Perl, about which I know very little. I am assuming that there is a lot of overhead in working with a regular expression like m|$digit.*$digit| because the regular expression has to be reconsidered 5 million times at run time, whereas in the unfolded version it can be done just once at compile time.

So all in all, a very interesting task.

Frequent and even

I wish my local bus service was more frequent and even, but that's not what we are asked.

Instead the task reads: You are given a list of numbers, @list. Write a script to find most frequent even numbers in the list. If you get more than one number with the same frequency return the smallest. For all other cases, return -1.

So let's pass down the list, ignoring odd numbers and incrementing $freq[$j] when we see an even $j. As we do that, we keep a $max_freq to keep track of the maximum frequency seen, and $max_freq_no being the first number that occurs with that frequency.

Is that all? Well, no, because we can't be sure that we've found the smallest number with that frequency - for example 4, 4, 3, 2, 2 will result in 4 being given as the (wrong) answer. The solution is to sort the list first.

Easy!




07 December 2022

Completing the time and levelling the letters

Completing the time

We are given a time, such as 12:34, with one of the digits replaced by '?'. We are asked what digit when substituted for the '?' results in the maximum valid time.

There are six cases to consider:

  1. The first digit is ? and the second is [0123] - the answer is 2
  2. The first digit is ? and the second is [456789] - the answer is 1
  3. The second digit is ? and the first is 1 - the answer is 9
  4. The second digit is ? and the first is 2 - the answer is 3
  5. The third digit is ? - the answer is 5
  6. The fourth digit is ? - the answer is 9

This is a rather messy set of conditions. The easiest - and perhaps clearest - way of performing the task is just a set of if/elsif clauses using the above logic, and that's what I submitted. I used split to put the characters into in array to make my conditions easy to read - for example $chars[0] eq '?' - but it could equally be done using regular expressions - for example $string =~ m|^...\?| is true if the 4th character, ie third digit, is '?'.

Levelling the letters

We are given a string of characters (lower case letters in the examples) and asked whether the deletion of a single character would mean that the frequencies of occurrence of all the other characters are the same.

The condition will be met if all but one of the characters appear exactly n times, and the other one appears n + 1 times. Removing a single instance of the latter means that all characters appear n times.

The logic I used has three steps:

  1. Create $freq{'x'} as the frequency of 'x' occurring in the string
  2. Find the maximum frequency $max_freq, and the letter $max_char that occurs that frequently
  3. Check if each of the letters is either $max_char or has a frequency of $max_freq - 1.
  4. If they do, then the string meets the requirement; if they don't then it doesn't.

In step 2, there may be several characters having the maximum frequency in which case the string could immediately be disqualified, but there is little to be gained by that as such a string will quickly be ruled out in step 3.

There are however cases where the above will not work.  Consider a string like abbbccc. If we remove a then the frequency of the remaining characters will be the same. None of the examples given illustrates this, but neither does the wording of the task rule it out. Moreover, there are strings like abcde, where the removal of any of the characters will leave the string matching the criterion.

If the cases in the previous paragraph are to be included, step 3 above needs to check for those, and in my submission I have done that. Specifically, if one character appears only once and all the others appear the same number of times (eg abbbccc) , then deletion of the once-appearing character satisfies the criterion, or if all the characters appear only once (eg abcde) then deletion of any one of them meets the criterion.