Blog

Sampling from a Stream

Selecting a random element from an array of length n is easy: simply generate a random integer i, with 0 <= i < n, and use the array element at that index position. But what if the length of the array is not known beforehand, or is, in fact, infinite (i.e. a stream)? And what if we don’t just want a single element, but a set of m samples, without replacement?

Random Shuffles

Shuffling a collection of items is a surprisingly frequent task in programming: essentially, it comes up whenever a known input must be processed in random order. What is more, there is a delightful, three-line algorithm to accomplish this task correctly, in-place, and in optimal time. Unfortunately, this simple three-line solution seems to be insufficiently known, leading to various, less-than-optimal ad-hoc alternatives being used in practice — but that is entirely unnecessary!

Nelder-Mead Simplex Optimization

The Nelder-Mead-Algorithm (also known as the “Simplex Algorithm” or even as the “Amoeba Algorithm”) is an algorithm for the minimization of non-linear functions in several variables. In contrast to other non-linear minimization methods, it does not require gradient information. This makes it less efficient, but also less prone to divergence problems. In contrast to other methods, it is not necessary for the minimum to be bracketed by the initial guess: the algorithm performs a limited “global” search. (It may still converge to a local, rather than the global extremum, of course.) Finally, the algorithm is fairly simple to implement as a stand-alone routine, which makes it a natural choice for multi-dimensional minimization if function evaluations are not prohibitively expensive.

How Best to Capture Output from Scientific Calculations?

When writing programs that do computations, my overwhelming preference is to simply write results to standard output, and to use shell redirection to capture the output in a file. In this way, I am leveraging the shell’s full functionality, in particular filename completion, in the most convenient way possible. For the file format itself, I prefer simple, column-oriented, delimiter-separated flat files. They are completely portable, and can be read and understood by most tools. (They also play well with the usual Unix toolset.)

But this simple approach breaks down, once a program has to write more than one output stream: for example in the case of a simulation run, I may want to capture periodic snapshots of the simulation itself, but also track various calculated metrics as well. These two streams will not fit comfortable into a single flat file. One option is to use a structured file format, the other option is to write to multiple files simultaneously.

HTTP Client and Server in Go

Among my favorite features of the Go language is the fact that Go has concurrency and network programming “natively” built in. It reflects Go as being designed for the 21st century, where the network is as much taken for granted as the filesystem was towards the end of the 20th.

As cut-n-paste examples, here are skeleton implementation of both an HTTP server and an HTTP client. These implementations are intentionally bare-bones, so as not to obscure the most relevant points. It should be easy enough to extend them as desired.

QR Codes

QR Codes

QR codes are a two-dimensional equivalent of barcodes: a graphical encoding of information, which in practice means a string of about 4000 alpha-numeric characters (upper-case only) or a little less than 3000 arbitrary bytes.

So, how then are QR codes able to perform magic, such as automatically opening web pages, or sending text messages, or even dealing bitcoin? The answer is: they can’t.

Let’s try to understand what’s going on.