03 October 2022

Manipulating characters

Reformatting MAC addresses

We are asked to take a MAC address stated as 3 dot-separated groups of 4 characters, and to reformat it as 6 colon-separated groups of 2 characters.  For example:

1234.5678.9abc -> 12:34:56:78:9a:bc

I'm not a great believer in one-liners, as they are often hard to understand, but in this case I feel that

say $test =~ m|^(..)(..)\.(..)(..)\.(..)(..)$| ? 
    qq[\nInput:  $test\nOutput: $1:$2:$3:$4:$5:$6] : 'Invalid format';

does the trick, including formatting the output as Mohammad specifies.

It could equally be done using substr(), which we're told is faster, but it's hard to think of a real-life use case where the difference would matter.
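
For comparison, here is a minimal sketch of how the substr() approach might look. The helper name reformat_mac and the validity check (14 characters, with dots at offsets 4 and 9) are my own assumptions for illustration, not part of the task statement.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: reformat 'hhhh.hhhh.hhhh' as 'hh:hh:hh:hh:hh:hh'
# using substr() instead of regex captures.
sub reformat_mac {
    my ($mac) = @_;

    # Assumed validity check: 14 characters with dots at offsets 4 and 9.
    return 'Invalid format'
        unless length($mac) == 14
            && substr($mac, 4, 1) eq '.'
            && substr($mac, 9, 1) eq '.';

    # The six 2-character pairs start at these offsets in the input.
    return join ':', map { substr($mac, $_, 2) } 0, 2, 5, 7, 10, 12;
}

say reformat_mac('1234.5678.9abc');    # 12:34:56:78:9a:bc
```

Like the one-liner, this does not check that the characters are hexadecimal digits; it only checks the shape of the string.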

Masking characters

Task 2 is stated as 'You are given a list of codes in any random format. Write a script to mask the first four characters (a-z,0-9) and keep the rest as it is.'  

Examination of the examples suggests that we are given an arbitrary string of characters, and are required to change the first 4 instances of [a-z0-9] to 'x'.

Again, a solution using a regular expression can do the work in a single line:

s|^(.*?)[a-z0-9](.*?)[a-z0-9](.*?)[a-z0-9](.*?)[a-z0-9]|$1x$2x$3x$4x|

That's a little hard to read, but essentially it breaks the string into the four characters of interest and the (possibly empty) substrings in between them, and then joins the latter up with 'x's.
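
To see the substitution above at work, it can be wrapped in a small function and run over a few strings. The function name mask_first_four and the sample inputs are chosen here purely for illustration.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical wrapper around the single substitution shown above.
sub mask_first_four {
    my ($code) = @_;

    # $1..$4 hold the (possibly empty) text before each of the first four
    # [a-z0-9] characters; each of those characters is replaced by 'x'.
    $code =~ s|^(.*?)[a-z0-9](.*?)[a-z0-9](.*?)[a-z0-9](.*?)[a-z0-9]|$1x$2x$3x$4x|;
    return $code;
}

say mask_first_four($_) for qw(ab-cde-123 123.abc.420 3abc-0010.xy);
# xx-xxe-123
# xxx.xbc.420
# xxxx-0010.xy
```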

For this to work I am taking the hard line that any valid input to the task has at least 4 characters matching [a-z0-9]. If that assertion fails, then I think the single regular expression has to be replaced with a loop over 4 attempts to replace one character with 'x'.
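
If inputs with fewer than 4 matching characters did have to be handled, one possible sketch is below. Note that simply repeating s/[a-z0-9]/x/ four times would not work, because the 'x' just written matches [a-z0-9] and would be replaced again. This version instead makes a single pass with a counter, which is a variation on the loop idea rather than a literal loop; the name mask_up_to_four is hypothetical.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: mask at most the first four [a-z0-9] characters,
# leaving shorter inputs partially masked instead of failing to match.
sub mask_up_to_four {
    my ($code) = @_;
    my $seen = 0;

    # Visit every [a-z0-9] character once; replace the first four with 'x'
    # and put every later one back unchanged.
    $code =~ s/([a-z0-9])/ $seen++ < 4 ? 'x' : $1 /ge;
    return $code;
}

say mask_up_to_four('ab-cde-123');    # xx-xxe-123
say mask_up_to_four('a-b');           # x-x
```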

But what if one of the first four [a-z0-9] characters in the string is already 'x'? Are we to ignore it or process it:

axbcdef -> xxxxxef or axbcdef -> xxxxdef ?

The first of these fails the task criteria because it masks the fifth [a-z0-9] character, in violation of 'mask the first four and keep the rest'. The second fails because the second character in the string is 'x' both before and after, and so has not really been masked, violating 'mask the first four'.
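
The two readings can be reproduced with one small sketch that differs only in the character class passed in. The helper mask_four and its second parameter are assumptions made for illustration; neither reading is endorsed by the task statement.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: replace the first four characters matching $class
# with 'x' and leave everything else alone.
sub mask_four {
    my ($code, $class) = @_;
    my $seen = 0;
    $code =~ s/($class)/ $seen++ < 4 ? 'x' : $1 /ge;
    return $code;
}

# Reading 1: an existing 'x' is ignored, so four other characters are masked.
say mask_four('axbcdef', qr/[a-wyz0-9]/);    # xxxxxef

# Reading 2: an existing 'x' counts as one of the four.
say mask_four('axbcdef', qr/[a-z0-9]/);      # xxxxdef
```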


