Peter's thoughts: November 2022

28 November 2022

All the binaries and find the odd man out

All the binaries

We are asked to find all the binary numbers 'of length $n'. The examples show that 'length $n' means 'containing n digits' and that the desired solution includes those with leading zeroes.

These numbers are simply all the integers from 0 to 2^$n - 1, and we can simply loop over that range, outputting the numbers in binary format using sprintf's binary option. It looks a little odd:

sprintf("%0${n}b", $_)

but if, say $n == 4, that amounts to

sprintf("%04b", $_)

which means 4 binary digits with leading zeroes.

Find the odd man out

We are given a list of strings, such as:

"adc", "wzy", "abc"

From each of these strings we are to form a 'difference array', which contains the alphabetical distance of each letter in the string from the preceding one. So for "adc" the difference array is (3, -1) because 'd' is 3 letters after 'a' in the alphabet, and c is 1 letter before 'd'.

Mohammad suggests that we do this by assigning 0 to 'a', 2 to 'b' and so on, but as we are only concerned by differences we can simply use the ASCII values of the characters, which Perl provides in the function ord().

We are then to told that all of the strings bar one have the same difference array, and asked to find the one that does not. Curiously, perhaps, that is mildly challenging, but here is the solution I adopted.

First, let's represent the difference array not as an array but as a string. So for "adc" I am forming the difference array as the string '3, -1'. Let's call that the difference string.

So now I'm going through all the strings in the list, and for each one ($s) I create a difference string ($diff), and then with each difference string I do this:

$seen{$diff} .= qq[$s=];

The '=' is simply a separator, because I am concatenating all the '$s=' that have the same $diff into a single member of %seen. It will end up looking like this:

$seen{'3, -1'} -> 'adc=wzy='
$seen{'1, 1'} -> 'abc='

I then loop over the elements of %seen and look for the value that matches:

$seen{$s} =~ m|^([^=]*)=$|

or in words, a value that starts with a run of characters not containing '=' followed by '=' followed by nothing else. If you look at the values of %seen shown above, you'll see that the only one that matches that pattern (ie having a single '=') is the odd one out.

This only works if Mohammad's assertions are correct, that is, there is only one odd man, all the strings are the same length and only contain lower case a-z letters. I could (and would in real life) have tested that those assumptions were correct.

21 November 2022

Flipping easy and distributing fairly

Flipping binary numbers

Our first task this week is to take a positive integer, render it in binary, swap the zeroes for ones and vice versa, and output the result in decimal.

It's tempting to use the Perl bitwise not operator (~) to flip the bits, but inspection of the examples shows that's not quite what's wanted. The reason is that while we normally express 4, for example, as 100 in binary, the computer consider it to be 0000000000000100. Applying the not operator to that gives us 1111111111111011, which isn't the desired answer.

Now there are various ways around that, but note that the supplied number will always start with a 1 in its simple binary representation, so its flipped value will start with 0. And in its 16 (or whatever) bit representation, that will be preceded by a lot of 1s. So as it's just a string, we can safely just strip off all the leading 1s using s|^1+||, and the algorithm boils down to just two lines:

$flip = sprintf('%b', ~ $test); # render 'not' the test in binary
$flip =~ s|^1+||; # strip off any leading 1s

Making an even distribution

For this task we are given an ordered list of integers and asked to make 'moves' until all the members of the list are equal. A move comprises subtracting 1 from any cell in the list and adding it to an immediately neighbouring cell.

We are asked to return the number of moves requires, or -1 if the task cannot be achieved.

The first thing to note is that if there are c cells and the sum of their contents is s, then an answer can only be possible if s is an integer multiple of c. If it is, the cells will all end up containing s/c. So we can immediately return -1 if s is not a multiple of c.

It is clear then that it is always possible to even out the distribution using 'moves' as described above.

The algorithm I have used goes as follows.

First check that the input meets the s/c criterion, and if not, return -1.

Start at cell 0 and work along to the second-last cell (because if we get all those right then the last cell must be right too).

We know what the target value in each cell is - it's s/c. So if the cell we are looking at:

contains more than the target, we successively reduce it by 1 and increment the following cell by 1 until the cell we are looking at contains the target value
contains less than the target, we successively reduce it by 1 and increment the following cell by 1 until the cell we are looking at contains the target value
contains exactly the target, we do nothing

We count the moves as we go, and the statement of the task only requires us to output that number. So that's it.

However, there is slight wrinkle. Although that algorithm gives numbers of moves which agree with the examples, it is the case that some of the moves may temporarily leave a cell containing a negative number. For example, for Mohammad's first example it gives this:

Input: (1, 0, 5)
Output: 4

Move #1: 2, -1, 5
Move #2: 2, 0, 4
Move #3: 2, 1, 3
Move #4: 2, 2, 2

The rules don't say that this is not allowed, so I am boldly claiming that it is valid.

I also believe, though won't show the proof here, that:

this algorithm always comes up with the minimum number of moves,
the moves can always be rearranged such that no cell goes negative, and
there are multiple alternative sets of a greater number of moves to get to the desired equal distribution.

14 November 2022

The twice largest and number of cuties

The twice largest

We are given a list of integers and asked to write a script to find out whether the largest item in the list is at least twice as large as each of the other items.

The simplest way to do this is to reverse-sort the list, and then the condition is simply whether the first item in the sorted list is no less than double the second.

Given a modern computer this will complete in milliseconds in even quite a long list. But when I started programming over 50 years ago sorting was something to be avoided as it took so long, especially if the list of items to be sorted was more than would fit in memory. Back then, 128 kB was a large computer, so sorting things like a payroll took hours, with numerous stops to change tapes (about 40cm in diameter) on tape drives which had to be cleaned every hour with cotton buds and isopropanol.

So a better algorithm - for 1970 - would be to make a single pass through the list, checking each value to see if is the largest seen so far, and also if it is the second largest, like this:

if ($this > $largest) {
$second = $largest;
$largest = $this;
}

And again, the test after completing the pass if whether the largest is greater than or equal to the second largest.

I've coded both of these methods in my submission.

How many cute lists?

A cute list of order n comprises the first n positive integers arranged in a list such that each number is either a factor or a multiple of its position in the list. We are asked to compute how many cute lists exist for 1 <= $n <= 14.

The first solution that comes to mind is to generate all the permutations of 1 .. $n, and check them all for cuteness. It tried that, but once I got to order 12 it was taking longer to compute than it took for me to eat my lunch, so I decided to try a different tack.

I wrote a recursive subroutine - I called it find_cute - that starts with an incomplete cute list and adds one more element. Suppose we're adding the 6th element in an order 12 list. There are a number of options for this element: it could be 1, 2 or 3 because those are factors of 6, and it could be 6 or 12 because these are multiples of 6 - but it can't be any of these if they are already in use for the 5 preceding elements.

If we do find a valid element - or several - we add it to our list and recursively call find_cute - maybe several times - to get the next element. If find_cute successfully gets all 12 elements filled then we have a solution.

If I've done this correctly there are 24679 possible cute lists of order 15.

07 November 2022

Capital test and ambiguous encoding

Proper capitalisation

Task 1 this week gives us a string comprising capital and lower case letters and asks us to determine whether it follows one of these patterns of capitalisation: Challenge, challenge, CHALLENGE.

Generally I steer clear of hard-to-follow one-liners, but this one is fairly simple:

$test =~ m/^[A-Z]?([a-z]*|[A-Z]*)$/ ? 1 : 0;

In words, the string follows the rule if:

it starts with zero or one capital letter, and
it continues with either zero or more capital letters, or zero or more lower case letters.

So that wasn't too hard, was it?

A strange encoding

Task 2 gives us a string of digits and asks us to decode it left to right, assuming 1 means A, 2 means B down to 26 means Z. Of course this is ambiguous, because 11 could mean AA or it could be the 11th letter, which is L. So we are to find all the possible decodings and present them in alphabetical order.

If we start at the beginning of the string, the first digit must be 1-9, and that could correspond to A-I. But of course there is an alternative decoding if the first two digits are in the range 10 to 26, that could mean J to Z.

So as we proceed along the string, at each step there are potentially two paths for us to explore, and for a long string that means a large, and initially unknown, answers.

The easy way to handle problem like this is recursion. Let's have a subroutine 'analyse' that takes two arguments: $so_far and $test. $so_far contains the answer we have built up already, and $test is the rest of the input string that we haven't got to yet.

Our initial call is therefore analyse('', $test) - we haven't found anything so far and we have the whole of $test to examine.

Subroutine analyse does one of three things:

it takes the first digit off $test, converts it to a character,

appends it to $so_far
if there are still more characters in $test then it recursively calls analyse(the new $so_far, the new $test)
or if not, it's found an answer (the new $so_far) which it stores in a global hash, and returns

it then takes the first two digits of (the input value of) $test and

if they are in the range 10 to 26, it appends the corresponding character to (the input value of) $so_far, and
if there are still more characters in $test then it calls analyse(the new $so_far, the new $test),
or if $test is exhausted, it's found an answer (the new $so_far) which it stores in a global hash, and returns

That will find all the possible decodings of the input string. They are all stored as keys in the hash %answers, and all that remains is to sort the hash by key and output the answers.