Posted on 17th June 2015
Round 2 came and went (and Round 3 and the "distributed" round). As an effort to practice with C#, I solved these problems in C# (definitely slower than using Python, and actually, the end running time is not that much quicker, for these sort of algorithmic / numerical problems).
There's a definite step-up in difficulty here! Maybe I kid myself I could have done problem A and some "small" problems quickly... that people did them all in 150 minutes is quite scary. The Official Contest Analysis is rather brief (from G+, it seems the organisers were busy with the distributed round; or maybe, if we've come this far, shorter answers are all we need ;-> ) I think my background in pure Maths shows here: I found problem D quite easy, as I found analysing the special cases off-line (pen and paper) to be not so hard. Problems B and C (in "large" form) were hard, I thought. (I missed the trick with problem B; and using an off-the-shelf Graph package is obviously the fast way to do Problem C.)
See the code on GitHub.
Problem A, Pegman: We have a grid and each cell can contain one of four directions (UP, DOWN, LEFT, RIGHT) or nothing. Pegman starts in a cell: if contains no direction he stays there. Otherwise he moves in the given direction, continuing in that direction until he exits the grid, or hits another direction symbol, in which case he changes direction. Can you alter (but not add or remove) some of the direction signs so Pegman never exits the grid, regardless of starting point?
This is quite easy, if you see the correct way to attack it. Pegman will exit the grid if there is a direction sign pointing directly out of the grid (e.g. a LEFT sign where all the cells to the left are empty: ".<" or "<" or "...<" etc.)) We must change all of these directions and we have to change them so that they don't point out of the grid, i.e. they do point towards another direction sign. If we can do this, then there are no cells pointing out of the grid, and so Pegman can't exit.
So check each direction: - Does it point out of the grid? If so, count it, and... - Check it it can be made to point at another direction. If not, report "IMPOSSIBLE", other continue.
Problem B, Kiddie Pool: Given input pipes of flow rate \(R_i\) and temperature \(C_i\). You want to run these for an amount of time (different maybe for each \(i\)) to make volume \(V\) at temperature \(X\). If you can do this, what's the least amount of time you need (the pipes run in parallel). So we want to find the smallest \(T>0\) so we can find \(0 \leq t_i \leq T\) with \( \sum_i t_i R_i = V\) and (a little maths shows) \( \sum_i t_i R_i C_i = VX. \)
fun: 1.6614342185826514
message: 'Optimization terminated successfully.'
nit: 5
success: True
slack: array([ 0. , 0. , 1.66143422, 1.66143422])
x: array([ nan, nan, nan])
status: 0
Maybe that's unfair. Anyway, I couldn't get it to work. - The simplex method is all about convex geometry, and we can solve this problem using this viewpoint. - The state space of the problem is a high-dimensional cube \({ (t_i) \subseteq \mathbb R^N : 0 \leq t_i \leq T }\). We then project this cube down onto the plane by mapping to \( \sum_i t_iR_i, \sum_i t_iR_iC_i). \) As this is a linear map, we end up with the convex set (in the upper right half-plane). - We then want to know if the point \((V,VX)\) is in the set. Or, rather, we project the unit cube (i.e. with \(T=1\) above) and then want the minimal scaling we need so that \((V,VX)\) is in our convex set. Equivalently, intersect the line from the original through \((V,VX)\) with the convex set and find if it only intersects at the origin (problem is impossible) or where the line first exits the set. - To specify a convex set, we can specify the extreme points. But our cube has \(2^N\) extreme points! However, we'll make use of the fact that once we've projected onto the plane, very few of these points will remain extreme points. - The extreme points of the unit cube are where the coordinates are all 0 or 1. We can iteratively construct these by taking the set where only the first \(k\) coordinates are non-zero, and then adding new extreme points by adding on the vector \((0,\cdots,0,1,0,\cdots,0)\) where 1 is in the \((k+1)th\) position. - Again, by linearity, we can do this in the plane, as a 2D problem. So we have our extreme points, we add in this new direction each time, take the convex hull of the resulting set of points, and then run an algorithm to find the extreme points of this new convex set. What makes this work is that each step doesn't introduce too many new extreme points. - I implemented the Graham Scan algorithm. See the IPython Notebook (well, now Jupyter notebook, I guess.) - Still some numerical instability issues, which I solved by converting everything to integers; this then overflows 64-bit integers, so use the BigInteger class in C#. - In the end, it works!
The official solution is much nicer! Let's see how this works, in our notation. Choose \( t_i = T R_i' / R_i \) where \( 0 \leq R_i' \leq R_i\) the "notional flow". Then we need \[ \sum_i t_i R_i = V \Leftrightarrow T \sum_i R_i' = V \]\[ \sum_i t_i R_i C_i = VX \Leftrightarrow T \sum_i R_i' C_i = VX = TX\sum_i R_i' \Leftrightarrow \sum_i R_i'C_i = X\sum_i R_i'. \]So if we can maximise \(\sum_i R_i'\) subject to \(\sum_i R_i'(C_i-X) = 0\) then the time taken is \( T = V / \sum_i R_i'\).
The trick now is how to do this maximisation.
It's also easy to attack this directly. Start with \(R_i'=R_i\) for all \(i\) and then consider trying to make the constraint \(\sum_i R_i'(C_i-X)=0\) hold.
You can use Lagrangian methods to prove that this is optimal. Here Wikipedia fails me, but there is a concise reference here: Richard Weber's lecture notes, PDF. To put in standard form, lets \[ \text{Minimise } -\sum_i x_i \quad\text{subject to}\quad \sum_i x_iy_i=0, \quad 0 \leq x_i \leq x_i^{\max}. \]By replacing \(y=(y_i)\) with \(-y\), and reordering, we may suppose that \(\sum_i x_i^{\max}y_i \geq 0\) and \( y_1 \geq y_2 \geq \cdots \geq y_n. \)
I claim that the optimal is given by our algorithm: start with \(x_i=x_i^{\max}\), then decrease \(x_1\) to make the constraint hold, or to 0. Then use \(x_2\), and so on. This gives a solution of the form \( x^{(0)} = (0,\cdots,0,x_{i_0},x^{max}_{i_0+1},\cdots)\) with \(i_0\) minimal.
To prove this, consider the Lagrangian, \[ L(x,\lambda) = -\sum_ix_i - \lambda \sum_i x_iy_i = \sum_i x_i(-1 - \lambda y_i). \]Choose \( \lambda_0 \) with \( -1-\lambda_0y_{i_0} = 0 \) that is \( \lambda_0 = -1/y_{i_0} \). Then \( L(x,\lambda_0) = \sum_i x_i(y_i/y_{i_0} - 1))\) obtains its minimum for \(x=(x_i)\) given by \[ x_i = \begin{cases} 0 &: y_i/y_{i_0} - 1 > 0 \Leftrightarrow y_i > y_{i_0} \\ x_i^{\max} &: y_i/y_{i_0} - 1 <0 \Leftrightarrow y_i < y_{i_0} \\ \text{arbitrary} &: y_i = y_{i_0}. \end{cases}\] Notice that \(x_0\) satisfies these conditions, and so \(L(x_0,\lambda_0) \leq L(x,\lambda_0) \) for all allowed \(x\) and hence by the Lagrangian Sufficiency Theorem, \(x_0\) is an optimal solution.
Problem C, Bilingual: Given N lines of text each containing "words". We're told that the first line is in "English" and the second is in "French". The remainder are either English or French. Our task is to decide how to assign the remaining lines such that we minimise the number of words which occur in both languages. (Scare quotes as the "words" of course don't have to be real English or French words).
Turn this into a (bipartite) graph: on the left are vertices for each word, on the right are vertices for each line, and we have an edge from a word to a line if that word occurs in that line. We want to "colour" the lines English or French, with the constraint that we already know the first two lines are E and F, respectively. We want to maximise the number of words which are linked to only one of E or F.
My implementation in C# is still embarrassingly slow...
Problem D, Drum Decorator: We have a drum (so a grid of R
rows and C
columns with the far left column considered next to the far right column). You want to fill the boxes with numbers 1, 2, 3, 4, ... such that each box with K
in is next to exactly K
boxes of the same type. Here "next to" means directly up, down, left, right.
Consider a connected region of boxes with 3 in. Consider the top most row of this region, so we have
xxx 3
Where "x" means the top of the drum, or other non-3 numbers. Then the "3" we have is surrounded by exactly 3 other boxes, so these must all contain 3 as well. Inductively moving left and right, we find that the entire row must consist of 3, and the row below must also consist of 3s
xxxxxx
333333
333333
But now the row below cannot contain any 3s. Conclude: 3s can only occur in bands of height 2, and above and below must be other numbers. - Now consider 2. Such a box must have exactly 2 neighbours containing 2 as well, and this means we end up with a "path" of 2s. For example, we can have
xxxxxxx
2222222
xxxxxxx
Again, "x" means the top/bottom of the drum, or non-2 boxes. - A 1 must neighbour exactly one other 1. Suppose in a row we have "11", so to the left and right must be 2s, i.e. we get "2112" - Above could be the top, or 3s. As the 2s must have 2 neighbours, we must have
xxxxxxxx Then below the 1s => xxxxxxxx
221122 221122
2 2 2222
Then the bottom 2s are happy, so we can fill in with 1s
xxxxxxxx
?221122?
?122221?
Note that we can just repeat this forever. Furthermore, this is forced on us, as you can verify we're forced into expanding the pattern thus:
xxxxxxxx xxxxxxxxxx xxxxxxxxxxxx
22211222 ==> 2222112222 ==> 122221122221 And so on
11222211 2112222112 221122221122
Conclude: If the number of columns is a multiple of 6, we can fill 2 rows with this pattern; above and below must be 3s. - Below could be the bottom, or 3s. This is just the flipped version of the above, and notice that these give the same pattern, up to rotation. - Now consider the case of 1 stacked vertically above another 1, again in the top row (with 3s, or the top of the drum, above). Filling in 2s around we start with:
212
212
x
The x is either a 2 or a 3 (and if the latter, we have a row of 3s of course). Then from the top row we must have further 2s:
22122
212
If the x is a 3 then we must also have
22122
22122
and then we find the pattern:
221221221...
221221221...
which can be repeated forever, so long as the number of rows is a multiple of 3.
- If not then the x
is a 2, and at least one box to the left or right must also be a 2, say to the left, so we get
22122
1212
122
We now see a possible pattern:
22122212..
12121212..
12221222..
Conclude: If the number of columns is a multiple of 4, then we can get a section of height 3 rows. - If not, then we must have
22122 222122 1222122 21222122
1212 ==> 21212 ==> 121212 ==> 2121212
1221 21221 21221 221221
212 212 1212
Which (eventually!) is a contradiction as the bottom left 2 cannot have 2 neighbours which are also 2. - We have now looked at all possibilities for what the "top" row can be if it consists of 1s and 2s: - A row of 2s - Height of 2 rows, a repeating pattern of width 3 (call this Type 1) - Height of 2 rows, a repeating pattern of width 6 (call this Type 2) - Height of 3 rows, a repeating pattern of width 4 (call this Type 3)
a
blocks of type 1 we have 3**(a-1)
choices for rotation (the first one can be fixed to be not rotated, and then we have a free choice for the others). Similarly for types 2 and 3. Finally we have to decide on the number of ways to rotate all the type 1s against all the type 2s etc.