tl;dr: I contributed to a crowdfunded project to make a gadget for
capturing *digital* images using an old-fashioned 35 mm film
camera. The thing finally arrived and I tried it. I’m disappointed.

The long version is, well, long, so cut.

tl;dr: I contributed to a crowdfunded project to make a gadget for
capturing *digital* images using an old-fashioned 35 mm film
camera. The thing finally arrived and I tried it. I’m disappointed.

The long version is, well, long, so cut.

Continue reading A review of I'm Back 35.

I have an extensive collection of DVDs. They take up three bookcases, which is a bit annoying, because I have books which I’d like to put on them instead. So I’ve been copying the DVDs onto computer storage. This has taught me surprising things.

Continue reading On copying DVDs.

I tend to follow a standard idiom when writing C functions
which allocate resources. At the top, I define a variable
for each resource, and initialize it to a sentinel value
meaning that it’s not yet active, and another for the
function’s return value. At the bottom, I define a label
`end`

, where I release all of the resources which still seem
live, and return the right result. If something goes wrong
partway through, I set the return value to indicate the
problem, and `goto end`

. Easy.

But it means that the code to release a thing isn’t near where it’s defined or allocated.

I had a brainwave the other day, and wrote a macro.

```
#define FINALLY(body) \
__extension__ __inline__ TMP(finally_fn) \
(const int __attribute__((unused)) *hunoz) { body } \
int __attribute__((unused, cleanup(TMP(finally_fn)))) \
TMP(finally_var)
```

The `TMP`

macro decorates an identifier so that it’s (more or
less) private to the macro invocation. Now I can say
something like

```
void *p = 0; FINALLY({ free(p); });
```

and I can use `return`

as usual for an early exit. Yay.
(Apparently I’m not the only person who’s thought of this.)

This macro has two problems.

Firstly, it uses nested functions, so it only works with GCC. Clang doesn’t support these, presumably because there’s no good way to implement a pointer to a nested function.

Secondly, the syntax is ugly, involving parentheses and a final semicolon.

I can’t figure out a way to solve *both* problems at the same
time.

Continue reading Finally!.

I’ve just read yet another article showing how to calculate elements of the Fibonacci sequence in some functional language. I’ll give examples in Scala, though I’m not doing anything very clever linguistically, so they shouldn’t be hard for anyone. Of course, it showed the hopelessly inefficient dual recursion

```
def terrible_fib(n: BigInt): BigInt =
if (n < 0) sys.error("not defined for -ve n")
else if (n == 0) 0
else if (n == 1) 1
else terrible_fib(n - 1) + terrible_fib(n - 2)
```

and a tail-recursive version

```
def tail_fib0(n: BigInt, a: BigInt, b: BigInt): BigInt =
if (n == 0) a
else tail_fib0(n - 1, a + b, a)
def tail_fib(n: BigInt): BigInt =
if (n < 0) sys.error("not defined for -ve n")
else tail_fib0(n, 0, 1)
```

There was also an explicitly iterative version, which isn’t
interestingly different from the tail-recursive version, so
I shan’t bother showing it. Unfortunately, the tail-recursive
version is still rather slow. On my (very old) laptop, it takes
nearly a minute to calculate `tail_fib(1000000)`

. We can do
*much* better.

Continue reading How to really calculate Fibonacci numbers.

Oh. Oh my.

I have a small music collection: about 8000 files, and growing steadily. I want to keep a copy of my collection on my Android phone. (I’ve been keeping a full copy of my collection on some portable device for fifteen years now, and I don’t see why I should stop now.)

My collection is too large to fit into my phone’s `internal storage’, which appears to be whatever change is left out of 32GB by the phone’s firmware. So I bought a 64GB µSD card to keep the music on.

I initially made the copy by physically plugging my µSD
card into my laptop’s MMC slot, mounting it, and running
**rsync**(1).
**rsync** is awesome.

Then I bought a new CD. (These are silvery discs which
store uncompressed digital audio.) As usual, I copied
the audio data off the CD, encoded it as FLAC, tagged it
using MusicBrainz
**Picard**,
entered into my master archive, and then invoked my audio
conversion
**gremlin**
to make Ogg Vorbis for satellite devices. Now all I need
to do is copy the new Ogg Vorbis files to the phone.

And that’s where the trouble started.

Continue reading Synchronizing music with an Android device.

Once upon a time, the cool kids used $w$-NAF for their elliptic curve scalar multiplication. Well, the cool kids who were implementing elliptic curve cryptography anyway.

The basic idea is simple enough. You want to multiply some given point $Q$ by a scalar $n$. So you pick a window width $w$, and split $n$ into a sum of signed $w$-bit pieces \[ n = \sum_i n_i 2^i \] with $-2^{w-1} \le n_i < 2^{w-1}$ for each $i$, and with at most one nonzero $n_i$ in any group of $w$. Then you walk through these, in descending order of $i$, multiplying an accumulator by appropriate small multiples of $Q$ plucked from a table and doubling.

Then we found out that actually this was a mostly terrible idea because it broadcasts $n$ through every side-channel available. The new cool idea is to convert $n$ to a signed-digit representation, so \[ n = \sum_i n_i 2^{wi} \] again, with $-2^{w-1} \le n_i < 2^{w-1}$. Now the digits are in predictable places, and all you need to avoid leakage is to do a constant-time table lookup.

This is annoying. And a waste of time and effort. There’s a unique
representation of $n$ subject to these constraints. It’s possible to do
the signed-digit recoding from right to left in constant time,
propagating loans (like borrows in subtraction, but in the other
direction) as you go. But you want to do the scalar multiplication from
left to right so that you can use your table of small multiples. And,
while you *can* do the signed-digit recoding left-to-right, you might
need to look arbitrarily far ahead before you can commit to a digit.
(If you see a long sequence of $2^{w-1} - 1$ digits, then you’ll have to
inspect the digit *after* that to tell you whether they all lend one
leftwards or not.) I suppose you *could* do this in constantly
*quadratic* time, but I don’t think anyone would be impressed by that.

But you can have your cake and eat it too, as long as you give up on the unique representation. Relax the condition on $n_i$ to permit $2^{w-1} \le n_i \le 2^{w-1}$ and all is marvellous.

You can calculate an acceptable recoding in one pass, left to right, with one digit lookahead, and therefore in constant time.

You can do this in the same pass as the main scalar multiplication, with no additional storage.

You can work with the same size table of small multiples of $Q$. You already had to cope with the possibility of $n_i = -2^{w-1}$, and dealing with $n_i = \pm 2^{w-1}$ is no harder.

So, suppose you have a current digit $t$, and a lookahead digit $u$.
(You start out with $t = 0$ and $u$ is the leftmost digit of $n$ in its
standard radix-$2^w$ representation.) If $u \ge 2^{w-1}$, then set
$t \gets t + 1$ and $u \gets u - 2^w$ (a loan leftwards); otherwise
$u < 2^{2-1}$ and you do nothing. You can now commit to $t$: add $t Q$
to your accumulator or whatever. And finally, move along by setting
$t \gets u$ and fetching the next input digit into $u$. At this point,
you have $-2^{w-1} \le t < 2^{w-1}$, but you *might* end up adding one
to it when you look at $u$; hence the bound above.

One of the neat things that makes Ed25519 such a fast signature scheme is a clever algorithm for point decompression.

You’re given the point’s $y$-coordinate, and have to calculate $x = \pm\sqrt{(y^2 - 1)/(d y^2 + 1)}$. The naïve approach is to observe that inversion takes a modexp, and square root takes a modexp, so we need two modexps. The Ed25519 authors noticed that you can do both jobs with one modexp. There are two problems which need cleverness: one is how to combine the division and square root calculations; and the other is how to actually calculate a square root in $\mathbb{F}_{2^{255}-19}$ in the first place.

The latter problem is that $p = 2^{255} - 19 \equiv 5 \pmod{8}$, so you can’t calculate square roots directly. Their clever solution, which I can’t improve on, is as follows. Suppose we’re given some $v$, which we know (or are willing to assume) is a square. Therefore, $v$ has order $(p - 1)/2$. The trick is to notice that, if we let $w = v^{(p+3)/8}$, then $w^4 = v^{(p+3)/2} = v^{(p-1)/2} v^2 = v^2$. Therefore $w^2 = \pm v$. Conveniently $-1$ is a square in this field, so we can use $w$ as our square root of $v$ if $w^2 = v$, or pick $i w$ if $w^2 = -v$ (where $i = \sqrt{-1}$).

The paper suggests that, given the problem of calculating a square root of $x/y$, to compute $w = x y^3 (x y^7)^{(p-5)/8}$. This I can make better.

Instead, I suggest calculating $v = x y$, and $w = x v^{3(p-5)/8+1}$. Then $w^4 = x^4 v^{3(p-5)/2+4} = x^4 v^{(3p-7)/2} = x^4 v^{3(p-1)/2-2} = x^4 v^{-2} (v^{(p-1)/2})^3 = x^4/v^2 = x^4/(x^2 y^2) = x^2/y^2$. Phew!

Of course, I didn’t figure it out in that direction. Indeed, I started
out with a different problem, in a different field. Specifically, I
wanted to decompress a point on the Ed448-Goldilocks curve, over the
field $\mathbb{F}_{\phi^2-\phi-1}$, where $\phi = 2^{224}$. Helpfully,
$q = \phi^2 - \phi - 1 \equiv 3 \pmod{4}$ because everything about this
prime is *awesome*, so you can compute square roots in $\mathbb{F}_q$ as
$\sqrt{v} = w = v^{(q+1)/4}$, since $w^2 = v^{(q+1)/2} = v^{(q-1)/2} v =
v$. Which leaves the problem of combining this with division.

I found myself reasoning by analogy with Montgomery’s inversion trick.
(If you want to compute both $x^{-1}$ and $y^{-1}$, then let $z = x y$,
and compute $x^{-1} = y z^{-1}$ and similarly for $y$.) If I can find a
square root $w’$ of $v = 1/x y$, then $w = x w’$ will be a square root
of $x/y$. This replaces the problem of *division* with the problem of
*inversion*. At this point, it’s sort of easy. I’ll go back to the
Ed25519 field, and leave the Goldilocks version as an exercise. (Or you
can cheat and peek at the Catacomb source code. RFC8032 is no help: it
gives an unnecessarily complicated version, similar to the Ed25519
formula above.)

I’ll assume $v$ has order $(p - 1)/2$, so to invert I need to raise to the power $(p - 3)/2$ to find $1/v$. Then I raise to the power $(p + 3)/8$ to find the maybe-square-root of $1/v$ which might be off by a factor of $i$. But if I do these one after the other, then I can just multiply them modulo $(p - 1)/2$. That’s going to be a nuisance because of the divisions. But wait! These are all whole numbers. How do I know this? Because $p \equiv 5 \pmod{8}$. So let’s actually write $p = 8 t + 5$. Then the group of quadratic residues has order $4 t + 2$. To invert, I raise to the power $4 t + 1$; and to find the maybe-square-root, I raise to the power $t + 1$. Altogether, then, I want $m = (4 t + 1) (t + 1) = 4 t^2 + 5 t + 1 = t (4 t + 2) + 3 t + 1 \equiv 3 t + 1 = 3 (p - 5)/8 + 1 \pmod{4 t + 2}$. And I’m done.

[**Edited 2017-09-20:** Fix a typo in the description of Montgomery’s
inversion trick.]

I think the breakage is more or less over, so it’s time to (a) enjoy the results, and (b) write up the war stories. I’m now running a mostly-IPv6 network.

Actually, the network was mostly-IPv6 to begin with. The main
missing piece was DNS: I wasn’t advertising AAAA records for my
hosts, even though they nearly all actually had IPv6 addresses.
Getting things to actually *use* IPv6, then, mostly involved hacking
on my
unncesssarily complicated zone file generator
(written in Common Lisp, natch).

At this point, everything went wrong. It turns out that if I breathe wrong, then Linux loses the routing table entries for its own local interface addresses. It turns out that I’d breathed wrong quite a lot before pulling the big DNS switch. I still don’t know what causes this; my current theory is that it’s BIRD, but I don’t have any convincing evidence (and I know that the obvious alternative — Quagga — is hopeless in my environment. I just have a cron job which checks that the interface routes haven’t vanished.

Then I found out that SSH connections from my laptop **crybaby** to a
colocated (virtual) server **stratocaster** were wedging in a not-quite
MTU-blackhole way when the server had a lot of stuff to say.
Specifically, the first few large segments would get through, and then
they’d dry up. (MTU problems on IPv6 are especially nasty, because IPv6
routers don’t fragment packets — only the sender is allowed to do
that.)

The MTU bottleneck is the VPN hop between the colocated server (the
endpoint is on **precision**) and the house (on **radius**). Capturing
packets showed that I was right: **stratocaster** was sending a TCP
segment that was too large, and being told this by **precision**. So
**stratocaster** sent the data again, in two pieces — and then sent the
next segment full size. After a while, **precision** gave up sending
packet-too-big ICMP messages (some rate-limiting thing, presumably), and
**stratocaster** continued throwing large TCP segments into the void,
having forgotten the path MTU, and the connection jammed up.

I’ve never trusted the standard wireless security stuff since the old
WEP disaster, so even when I’m at home, traffic between **crybaby** and
the home servers goes over the VPN. To make all of this work
efficiently, mobile devices like **crybaby** use a simple heuristic to
decide which static VPN endpoint is likely to be best to associate with
(**radius** in the house, otherwise **precision**), and that endpoint
advertises a host route for the mobile device through my dynamic routing
machinery.

Unfortunately, Linux has a bug, and won’t attach path MTU information to host routes! Simple solution: since IPv6 address space is cheap, allocate a little network to each mobile device. Rather than 2001:ba8:1d9:6000::{1,2,3}, we’ll use 2001:ba8:1d9:6000::{1,2,3}:0/112.

Yes, I realise that allocating …:0 to the mobile device was brave.
Too brave: it didn’t work. In particular, now connections wedged in the
other direction. I checked: this time nobody was actually sending
packet-too-big messages. Eventually, I noticed that **radius**’s kernel
was logging plaintive — if gnomic — messages of the form

```
2014-04-19T14:05:29+01:00 radius kernel: [363728.436583] icmpv6_send: acast source
```

Tracking down this message didn’t take too long. Apparently, it means
that the sender (to whom we’d be sending an ICMPv6 error) has been
identified as an anycast address — and for some reason we shouldn’t
actually send it at all. It’s true that …:0/112 is defined to be the
any-router anycast address, and **radius** could see that it was indeed
a /112 network, so I can’t argue with that; but I think refusing to send
ICMP back to an anycast address is rather poor.

Anyway, another renumbering ensued, and now the mobile devices (and my actual IPv6 anycast services) are on …:1/112. And IPv6 works.

SSL and TLS are broken in many exciting ways. I’m just going to focus on one of these: revocation.

Certificate authorities (‘CAs’) tend to issue certificates with fairly long validity periods — a year is common. What do they do if they find that a certificate is bad before the year is up?

Continue reading OCSP stapling, and other stupid ideas.

None of these are really very clever, but they’re slightly creative abuses of the standard tools, and maybe someone will find them useful.

**Browsing lots of small files.**Suppose you’re looking at something like a Qmail configuration directory or settings in**/proc**or**/sys**. where there are lots of one- or two-line files. You could try to get a handle on them by using**cat**(1) but I find that the right answer is**grep****^***files*…For trees of stuff, like in

**/proc/sys/net/*/conf**, say, GNU**grep**’s`-r`

flag is useful.**Clobbering files as root.**I try to run as a non-root user most of the time, and elevate to root as needed, using**sudo**(8). (The vulnerability history of**sudo**makes for exciting reading, but I like that it maintains a log. I’m much less impressed with the security theatre of typing passwords, and its tendency to send loud SECURITY mail is very annoying.)So, there’s a file you want to hack: it’s owned by root, and the easiest way to hack it is with a shell rune. But if you redirect something’s stdout to the file, that’ll be done in your shell, with your privileges, so it won’t work. I use

**tee**(1) here.…

**|****sudo****tee***output***>/dev/null**(This at least leaves a log that I hacked the file, and even if it doesn’t say exactly how, it’s no less informative than

**sudoedit**.)

I’m sure there were going to be more of these. Maybe some of them will come to me later.