Context
PlaidCTF was pretty neat and a bit wacky, as always. I managed to tackle a few challenges, one of them being this slice of insanity right here.
Disclaimer: I am not an APL programmer, though I really enjoy the language from a conceptual point of view (terseness and expressivity through notation); and I have been called a lunatic on many occasions.
In this post, I will give you a brief overview of APL basics, since we will be using the language itself to solve the challenge. You may skip over it and go straight for the solution description if you’re not interested (though you would be missing out on some crazy cool language features that APL provides).
At its core, APL is a functional programming language, dealing with functions and operators (higher-order functions which take other functions as arguments). One of the learning difficulties most often cited comes not from the high-level abstractions (I’m sure most of you are comfortable with Python’s own map
, lambda
, filter
and so on), but the notation used. APL has its own set of 50 (close to 80 in modern implementations) Unicode characters, each denoting a function or operator, making the code very dense (and the language very adequate for tasks such as code golfing) and somewhat challenging to read for newcomers.
A quick introduction to APL
If you’re up for some brain twisters, accompanied by the soothing voice of John Scholes, then I highly recommend these three videos. If you need a more gentle introduction, then keep reading.
Functions
A function in APL is denoted by a pair of curly braces ({}
) containing an expression. Functions come in two flavors, monadic (or unary, single argument) and dyadic (binary, two arguments).
A monadic function has a single right-hand side argument, denoted by the formal parameter ⍵
. A dyadic function also has a left-hand side argument denoted by ⍺
.
Arrays
APL works beautifully with arrays of any dimension. Most functions will work over arrays without any extra work from the programmer.
You may have noticed that we used the same rho symbol to accomplish two related, but different things. Most symbols in APL denote both a monadic and a dyadic function. Depending on the presence of a left argument, functions will behave differently, which adds more difficulty in understanding APL code.
Operators
These higher-order functions take other functions as arguments and applies them over data in various ways. This concept shouldn’t be new to you, since Python’s map
does precisely that.
For instance, this is how we square all the numbers of a list in Python:
…and APL:
APL has many more operators and functions, but in the name of brevity I won’t cover them all. I’ll introduce some new ones as needed while describing the challenge solution.
The challenge
The code which we had to reverse was this:
Looks pretty daunting at first, but if we don’t stray from basic principles, we can keep our sanity in check.
The ⎕IO←0⋄
seems superfluous. The diamond symbol is a statement separator; on its own, the statement reads a number from the standard input. We can safely move over this part.
From afar, the code can be simplified as follows: 'some result string' {some dyadic function} 'user input'
. That’s all there is to it. My approach is to construct the inverse of this function such that, when applied on the result string, yields the original input. With that in mind, let’s break the code down.
String equality
How do we check to see if two strings are identical in APL? Let’s construct such a function step by step:
It’s starting to look a bit like the ‘success’/’fail’ function in our challenge, but it’s not quite there. The challenge maker chose a slightly more convoluted way of comparing strings. Instead of performing a logical AND over the resulting array, it sums the array and compares the sum with the length of the string, as in the following example:
With the string comparison out of the way, we can further simplify our challenge by removing the ‘success’/’fail’ function and the equality test. Since the encoded flag was used only for comparison, we can store it away and convert the challenge into a monadic function.
Conversions
We’re down to the following code:
Again, staying with basic principles, analyzing APL functions can be done iteratively starting from the right. Our input string is denoted by the formal parameter ⍵
. UCS
converts the string into an array of corresponding decimal ASCII values. ⌽
reverses the the array.
The next function in line is very interesting. Let’s reconstruct it:
What this code essentially does is f(x) = x ^ (x >> 1)
; this XOR shift has some interesting properties, one being that successive elements differ in exactly one bit; we’ll see another interesting property when we get to (finally) solving the challenge.
I’ll skip the middle part in order to focus on another conversion function, namely {+/⍵/⌽2*⍳⍴⍵}
. I’ll leave it as an exercise to reconstruct it as before. This function can be read as: “The sum over omega over the reversed array of the powers of two”, which is just a fancy way of saying binary-to-decimal. However, the ability to decode a binary number is implemented as primitive in APL:
Reshaping transposed rotations
The middle part which we haven’t covered is this: 33 8⍴(8×⍴⍵)⍴7⌽⍉(-⌊(⍴'f0xtr0t')÷2)⌽⍉11 24⍴∊
Again, in right-to-left order, the unary ∊
flattens our array of arrays of 8 bits into a single array of bits. Then we reshape these bits into an 11-by-24 matrix. This matrix then gets transposed, then rotated with a left argument given by (-⌊(⍴'f0xtr0t')÷2)
. We can simply evaluate this part and see that we get the value -3
(being the length of the string “f0xtr0t”, divided by two, rounded down and sign-inverted).
This in turn goes through another transposition and another rotation, this time of 7
. This result first gets reshaped into a flat array of 264 bits (the result of (8×⍴⍵)⍴
), which again gets reshaped into a 33 by 8 matrix. Lastly, each 8bit row of this matrix gets decoded back into decimal, giving us our resulting encoded flag.
Reversal
To sum up what the encoding function did so far:
- Convert the string from array of characters to array of corresponding ASCII decimals.
- Reverse the array.
- Convert each value to an array of 8 bits and apply
x^(x>>1)
over each. - Reshape into an 11-by-24 matrix.
- Transpose and rotate by -3.
- Transpose and rotate by 7.
- Reshape into a 33-by-8 matrix.
- Convert back to decimal. Add 13.
- Convert back to “ASCII” (actually, APL uses Unicode by default).
All we need to do is go through the steps in reverse, making sure to change the order in which transpositions and rotations are performed.
We’re almost there, but we’ve hit a small bump in the road. How do we reverse the xorshift operation? Your first instinct could be to construct a lookup table for all 8 bit values (and that’s how I did it during the CTF), but there’s a more elegant solution.
“Reversing” the xorshift
I don’t know if there is a simple form for the inverse xorshift^(-1)
, but I can experiment a bit with the function as is:
As we can see, if we apply the function 8 times in a row, we get right back where we started. This means that if we apply it 7 times, it would be as if we applied the inverse of the function. We can also observe that for an N bit number, xsh^(N-1) = xsh^(-1)
.
End of the APLine
With this final piece, we have everything we need to construct our inverse challenge function:
So there you have it, a solution in APL which can fit in one tweet.
I hope you had fun reading (and perhaps even trying the code at TryAPL). Protip: the backtick (`)
can be used to enter APL’s special characters (i.e. (`w)
is ⍵
)
Big thanks to PPP for the CTF!