03 Oct 2016
Given a population and sampling at random (“with replacement”), what’s the expected number of samples I need to see 50% (or any fixed proportion) of the population?

I deliberately ask for “expected” because calculating expectations is often easier than getting a handle on the whole probability distribution. A trick is to exploit linearity: express the random variable of interest as a sum of random variables you can calculate the expectation of.

Sampling at random from \(P\), suppose we have seen exactly \(k\) members of \(P\). As each sample is independent, letting \(T_k\) denote the number of samples required to see a new member of \(P\), we see that
\[ \mathbb P(T_k = j) = \left(\frac{k}{\vert P\vert}\right)^{j-1} \frac{\vert P\vert-k}{\vert P\vert} \]
That is, \(T_k\) has a geometric distribution, and so \(\mathbb E(T_k) = \frac{\vert P\vert}{\vert P\vert - k}\). By convention, \(T_0 = 1\).

Then, if I want to see exactly \(n\) members, the time needed is \(S_n = T_0 + T_1 + \cdots + T_{n-1}\), and so \(\mathbb E(S_n) = \sum_{k=0}^{n-1} \frac{\vert P\vert}{\vert P\vert - k}\). We can estimate the expectation by an integral,
\[ \mathbb E(S_n) \approx \vert P\vert \int_{\vert P\vert-n}^{\vert P\vert+1} \frac 1 x \ dx
= \vert P\vert \log \Big( \frac{\vert P\vert+1}{\vert P\vert-n} \Big) \]

Hence, in answer to our original question, taking \(n = \vert P\vert/2\), we need on average about \(\vert P\vert \log 2 \approx 0.69\,\vert P\vert\) samples to see half of \(P\).

29 Sep 2016
This post, very tangentially, relates to a quiz we set job candidates. If you are applying for a job at my current company, and somehow work out I work there, and find this, then you probably half deserve a job anyway.

Suppose you have a population \( P \) and some test as to whether a member of the population is good or bad. We want to find a “random good member”. There are two methods that come to mind:

- Random sampling: pick a member at random, test it, if it passes return it, otherwise try again.
- Randomly order \(P\) and work through the whole set, returning the first good member.

The first method has the virtue of being simple. The second method uses a lot of memory if \(P\) is large. But on closer thought, what if the proportion of “good” members is rather small? The second method is guaranteed to find a good member in at most \(\vert P\vert\) steps. How slow can the first method be?

Let \(B\subseteq P\) be the set of bad members. The first method fails to find a good member in \(n\) terms with probability \[ \left( \frac{\vert B\vert }{\vert P\vert} \right)^n \]
(The chance of repeatedly choosing a bad member).

- So, if half your members are bad, then you need just 7 goes to be 99% sure of finding a good member.
- If only 1% of your population is good then you need 459 trials to be 99% sure of finding a good member.

For the 2nd method, by “random ordering” I mean picking a member of the symmetric group on \(\vert P\vert\) elements uniformly at random and applying it to the set. We can do this in time proportional to \(\vert P\vert\). The algorithm is simple: choose an unpicked member of \(P\) at random and take it as the next member of the random ordering, and then repeat. How long does it take to find a good member? The chance of choosing only bad members for the first \(n\) goes is
\[ \frac{\vert B \vert (\vert B\vert -1) (\vert B\vert-2) \cdots (\vert B\vert - n+1)}
{\vert P \vert (\vert P\vert -1) (\vert P\vert-2) \cdots (\vert P\vert - n+1)} \]

- So this will always be at least as quick as method one.
- But as \(\vert P\vert\) becomes large (with the proportion of bad members fixed), the limit is the same.

I’m not sure I’ve come to any conclusion. Method 1 is simple and fast if the good population is not too small. Method 2 needs some storage space, but is more predictable if \(P\) is not too large and the proportion of good members is very small. If \(P\) is very large and the proportion of good members very small, you probably need a better idea than simple sampling.

A more mathematical question presents itself. Suppose we do away with the good and bad members and simply ask:

Given a population and sampling at random (“with replacement”), what’s the expected number of samples I need to see 50% (or any fixed proportion) of the population?

07 Sep 2015
While spending a long weekend in Windsor, we found the following:

The instructions read, in part:

The puzzle challenge is to solve this unique one-way maze by travelling from the entrance Pawn to the central Castle – always going forwards.

It’s not quite clear to me what this means, but there is a good quote from here:

The maze should be played as if you are a runaway train - always moving smoothly through the points forwards, and never able to reverse. This maze-movement idea was inspired by the nearby Windsor Railway Station, which brought the Royal Family to Windsor from Victorian times onwards. The maze paths sometimes run together in pairs, like a pair of railway tracks; sometimes they even run through “railway cuttings”, where the grass surface rises and falls around them.

I wonder if there is an elegant way to convert this to a directed graph problem? I couldn’t see how to do this cleanly, as in the problem you must pass through a “node”, not reverse direction.

Anyway, here is a solution. From the Pawn you initially get yourself into an outer “loop”, from which only the King is accessible. As we can only visit the King once, we don’t want to get back onto the outer loop in the same direction.

- So, we take the loop which the King is on, and reverse direction. Two choices here.
- We can then either go to the Bishop:
  - If we pass out by the “Raised Banks” then we’re back on the outer loop, which is no good.
  - So we must reverse direction again, and head in the direction of the Knight.
  - We can skip past the Knight back onto the outer loop, but that’s wrong.
  - So visit the Knight.
  - Then take the branch down to the Queen, and then to the Castle. Task complete.
- Or we loop back down towards the Pawn, and either:
  - Head towards the Knight, but we can’t visit the Knight, so we end up back up near the Bishop, and we’ve just taken a loop.
  - Take the earlier branch, and head up to the Castle, but following the track we’re forced to visit the Queen, and then head to the Castle. But we’ve missed pieces.

So, the only solution is King, Bishop, Knight, Queen, Castle. And once we’re heading to the Knight, we only have a choice of which direction to pass through the Knight “station”; everything else is determined. Before that we can loop about a bit if we wish.

30 Jun 2015
More a note to a future self than anything else… In C++ we have the notion of Resource Acquisition Is Initialization, which I think I have internalised as the following:

Any “resource” (memory, file, handle etc.) should be “owned” by an object. An object’s constructor should allocate the resource, and the destructor should free the resource (with some appropriate choices then made for what copy/move constructors/assignments should do). By use of C++11 `unique_ptr` and `shared_ptr`, this model can be extended to pointers.

Notice I didn’t mention “exception”, though one major additional use of RAII is exception safety: if an exception is thrown, then all objects currently in scope will have their destructors run, and so any resources will be safely released.

In C#/Java we have garbage collection. This takes care of memory, but not any other “resource”. In C# there is the `IDisposable` interface, and I was for a while confused by what this did. I am not alone, judging from Stack Overflow.

For me, the following is a good way to think:

- Imagine going back to programming in C. You allocate memory with `malloc` and then free it later with `free`. Similarly, if you open a file, you need to later close it.
- A managed language (let’s say C#) removes the need to `free` anything. But it does *nothing* for the file open/close issue.
- So we must manually `close()` a file, say.
- How then to handle exceptions? Use the `try ... catch ... finally` idiom!
- Then the `IDisposable` interface in C# should be thought of as enabling the syntactic sugar of the `using` statement. Similarly the `AutoCloseable` interface in Java 7 enables try-with-resources.

What of destructors? Don’t use them: they are only of use if you need to free “unmanaged resources”, and if you need to do that, you’ll know it.

So why the “IDisposable pattern”? This seems really to exist merely to support finalizers (aka destructors). However, if your class is not sealed, then maybe a derived class will need a finalizer. The pattern then exists to ensure unmanaged resources are always freed, but that from the finalizer no managed resource is freed (as it might have already been freed by the garbage collector).

All in all, that seems about as much work as writing a destructor in C++!

23 Jun 2015
BF is an amusingly named, impractical, Turing complete programming language. It is very like an actual Turing machine: you have an infinite (originally one-sided, 30,000 cells long) array of (8-bit) values, with commands to increment/decrement (`+`/`-`) the current cell, or move left/right (`<`/`>`). Input (`,`) and output (`.`) are handled. Loops are given by `[]` and implement a “while cell is non-zero” loop. So a trivial mapping to C is possible (see the Wikipedia article).

I thought it would be fun to give an object oriented implementation, for which I used C++ for a change. My implementation decouples the parsing of the programme from running it. The parser class reads the input in a single pass, using a stack for dealing with nested loops. It translates the input into a list of commands, represented by an abstract base class with an `execute` method, overloaded for the various different commands. This is the “command” pattern (closely related to the “strategy” pattern).

I then have a separate class to represent the `machine` which can run the commands. This is again an abstract base class, which leaves input and output to be implemented in a derived class. The commands act on the machine (I think you could consider this an example of the “dependency inversion principle”: both the commands and the machine act on abstractions, so there is loose coupling between them).

Currently I just have one concrete machine, which uses `vector`s to store input and output. As an example of the loose coupling, there are two parsers and associated command classes, which can work with the same machine. The first parser does no translation (and, internally, I use `unique_ptr`s a lot to avoid manual memory management). The second parser coalesces multiple commands (so, for example, “++++” becomes “+ times 4” internally) and uses RAII to handle memory management.

The parser and the machine need to communicate the list of instructions. This is mediated by an abstract base class which functions like a specialised `vector`. The concrete implementation actually uses a `vector` of (smart) pointers, but one could also implement this as a raw interpreter with no parse stage, I guess. To avoid copying the list, I used `shared_ptr`, aka reference counting. A subtle point here is that in the constructor we shouldn’t have a naked pointer, because we may throw an exception if the input is not well-formed. The `unique_ptr` `release()` method comes in handy here to pass off ownership to a `shared_ptr`.

On GitHub

**Update:** To be correct, in the 2nd case, we need to `delete` the copy constructor and assignment operators (as the object “owns” the allocated memory, a copy should not be allowed, unless we make a deep copy, which we have no need for). I actually rather like the philosophy of the “Rule of Zero”: use automatic containers and `unique_ptr` or `shared_ptr` to manage *all* members. These then automatically provide copy/move operators, or disallow them (in the case of `unique_ptr`). (As ever, try this with g++, and you get an *incomprehensible* error message though!)

The article goes on to talk about how we can in fact *avoid* the need for virtual destructors if we consistently use `shared_ptr`, because a side-effect of `shared_ptr` storing its deleter is that `shared_ptr<base> p = make_shared<derived>()` means that when the reference count of `p` becomes 0, the destructor of `derived` is (correctly) called even if the destructor is not virtual. A bit of work with Google reveals the following standards suggestion: n3974.pdf, which will allow something similar with `make_unique`.