Tuesday, August 29, 2017

Dependencies, why are they so hard to get right?

The life cycle of any programming project looks roughly like this.


We start with the plain source code. It gets processed by the build system to produce so-called "build tree artifacts". These are executables, libraries and the like, but they are slightly special. They are stored inside the build tree and usually can not be used or run directly. Every build system has its own special magic sprinkled in the outputs, and the file system layout can be anything. The build tree is each build system's internal implementation detail, which is usually not documented and definitely not stable. The only thing that can reliably operate on items in the build directory is the build system itself.

The final stage is the "staging directory", which usually is the system tree (for example /usr/lib on Unix machines) but can also be an app bundle dir on OSX or a standalone dir that is used to generate an MSI installer package on Windows. The important step here is installation. Conceptually it scrubs all traces of the build system's internal info and makes the outputs conform to the standards of the current operating system.

The different dependency types

Based on this there are three different ways to obtain dependencies.


The first and simplest one is to take the source code of your dependency, put it inside your own project and pretend it is a native part of your project. Examples of this include the SQLite amalgamation file and some header-only C++ libraries. This way of obtaining dependencies is not generally recommended or interesting so we'll ignore it for the remainder of this post.

Next we'll look into the final case. Dependencies that are installed on the system are relatively easy to use as they are guaranteed to exist before any compilation steps are undertaken and they don't change during build steps. The most important thing to note here is that these dependencies must provide their own usage information in a build system independent format that is preferably fully declarative. The most widely accepted solution here is pkg-config but there can be others, as long as they are fully build system independent.
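To make "fully declarative usage information" a bit more concrete, here is a minimal sketch in Python that reads a pkg-config style file. The file contents, the library name foo, the paths and the helper parse_pc are all made up for illustration, and the parser only handles a small subset of what the real pkg-config tool supports.

```python
import re

# Hypothetical contents of a foo.pc file; paths and version are invented.
EXAMPLE_PC = """\
prefix=/usr
libdir=${prefix}/lib
includedir=${prefix}/include

Name: foo
Description: An example library
Version: 1.2.3
Libs: -L${libdir} -lfoo
Cflags: -I${includedir}/foo
"""


def parse_pc(text):
    """Parse a small subset of the pkg-config file format.

    Returns a dict of keyword fields (Name, Libs, Cflags, ...) with
    ${variable} references expanded.
    """
    variables = {}
    fields = {}

    def expand(value):
        # Replace ${name} with the value of a previously defined variable.
        return re.sub(r"\$\{(\w+)\}", lambda m: variables[m.group(1)], value)

    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Variable definitions look like "name=value".
        match = re.match(r"^(\w+)=(.*)$", line)
        if match:
            variables[match.group(1)] = expand(match.group(2).strip())
            continue
        # Keyword fields look like "Key: value".
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = expand(value.strip())
    return fields


info = parse_pc(EXAMPLE_PC)
print(info["Cflags"])
print(info["Libs"])
```

The point is that nothing in the file says how foo was built. Any build system that can read this format can use the library.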

Which leaves us the middle case: build system internal dependencies. There are many implementations of this ranging from Meson subprojects to CMake internal projects and many new languages such as D and Rust which insist on compiling all dependencies by themselves all the time. This is where things get complicated.

Since the internal state of each build tree is different, it is easy to see that you can not mix two different build systems within one single build tree. Or, rather, you could but it would require one of them to be in charge and the other one to do all of the following:
  • conform to the file layout of the master project
  • conform to the file format internals of the master project (which, if you remember, are undocumented and unstable)
  • export full information about what it generates, where and how to the master project in a fully documented format
  • accept dependency information for any dependency built by the master project in a standardized format
And there's a bunch more. If you go to any build system developer and tell them to add these features to their system they will first laugh at you and then tell you that it will happen absolutely never.

This is totally understandable. Pairing together the output of two wildly different unstable interfaces in a reliable way is not fun or often even possible. But it gets worse.

Lucy in the Sky with Diamond Dependency Graphs

Suppose that your dependency graph looks like this.

The main program uses two libraries, libbaz and libbob. Each of them builds with a different build system, each of which has its own package manager functionality. They both depend on a common library, libfoo. As an example, libbob might be a language wrapper for libfoo whereas libbaz only uses it internally. It is crucially important that the combined project has one, and only one, copy of libfoo and that it is shared by both dependents. Duplicate dependencies lead, at best, to link-time errors and at worst to ten hour debugging sessions of madness in production.

The question then becomes: who should build libfoo? If it is provided as a system dependency this is not an issue but for build tree dependencies things break horribly. Each package manager will most likely insist on compiling all of its own dependencies (in its own special format) and plain refuse to work with anything else. What if we want the main program to build libfoo instead (as it is the one in charge)? This quagmire is the main reason why certain language advocates' view of "just call into our build tool [which does not support any way of injecting external dependency information] from your build tool and things will work" is ultimately unworkable.
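The libbaz, libbob and libfoo example can be written down as a small graph. The following Python sketch finds every library that more than one build system would want to build, assuming each build system insists on building the direct dependencies of its own projects. The build system names system_a, system_b and system_c and the helper conflicting_builders are placeholders, not real tools.

```python
# The dependency graph from the example: project -> direct dependencies.
DEPENDS_ON = {
    "main": ["libbaz", "libbob"],
    "libbaz": ["libfoo"],
    "libbob": ["libfoo"],
    "libfoo": [],
}

# Hypothetical build systems used by each project that has dependencies.
BUILD_SYSTEM = {
    "main": "system_a",
    "libbaz": "system_b",
    "libbob": "system_c",
}


def conflicting_builders(depends_on, build_system):
    """Return {library: sorted build systems} for libraries claimed twice.

    Assumes every build system wants to build all direct dependencies
    of the projects it is responsible for.
    """
    wanted_by = {}
    for project, deps in depends_on.items():
        for dep in deps:
            wanted_by.setdefault(dep, set()).add(build_system[project])
    return {
        lib: sorted(systems)
        for lib, systems in wanted_by.items()
        if len(systems) > 1
    }


print(conflicting_builders(DEPENDS_ON, BUILD_SYSTEM))
```

Here libfoo is the only library claimed by two different build systems, which is exactly the spot where duplicate copies appear.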

What have we learned?

  1. Everything is terrible and broken.
  2. Every project must provide a completely build system agnostic way of declaring how it is to be used when it is provided as a system dependency.
  3. Every build system must support reading said dependency information.
  4. Mixing multiple build systems in a single build directory is madness.

Saturday, August 19, 2017

Apple laptops have become garbage

When OSX launched it quite quickly attracted a lot of Linux users and developers. There were three main reasons for this:

  1. Everything worked out of the box
  2. The hardware was great, even sexy
  3. It was a full Unix laptop
It is interesting, then, that none of these things really hold true any more.

Everything works out of the box

I have an Android phone. One of the things one would like to do with it is to take pictures and then transfer them to a computer. On Linux and Windows this is straightforward: you plug in the USB cable, select "share pictures" on the phone and the operating system pops up a file dialog. Very simple.

In OSX this does not work. Because Android is a competitor to the iPhone (which makes Apple most of its money nowadays) it is in Apple's business interest to not work together with competing products. They have actively and purposefully chosen to make things worse for you, the paying customer, for their own gain. Google provides a file transfer helper application but since it is not hooked inside the OS its UX is not very good.

But let's say you personally don't care about that. Maybe you are a fully satisfied iPhone user. Very well, let's look at something completely different: external monitors. In the introductory presentation of this year's EuroPython conference the speaker took the time to explicitly say that if anyone presenting had a latest model Macbook Pro, it would not work with the venue's projectors. Things have really turned on their heads because up to a few years ago Macs were pretty much the only laptops that always worked.

This problem is not limited to projectors. At home I have an HP monitor that has been connected to many different video sources and it has worked flawlessly. The only exception is the new work laptop. Connecting it to this monitor makes the system go completely wonky. On every connection it does an impressive impersonation of the dance floor of a German gay bar with colors flickering, things switching on and off and changing size for about ten seconds or so. Then it works. Until the screen saver kicks in and the whole cycle repeats.

If this was not enough every now and then the terminal application crashes. It just goes completely blank and does not respond to anything. This is a fairly impressive feat for an application that reached feature stability in 1993 or thereabouts.

Great hardware

One of the things I do in my day job is mobile app development (specifically Android). This means connecting an external display, mouse and keyboard to the work laptop. Since Macs have only two USB ports they are already fully taken and there is nowhere to plug in the development phone. The choices here are to either unplug the mouse whenever you need to deploy or debug on the device or use a USB hub.

Using dongles for connectivity is annoying but at least with a hub one can get things working. Except no. I have a nice USB hub that I have used for many years on many devices that works like a charm. Except on this work computer. Connecting anything through that hub causes something to break so the keyboard stops working every two minutes. The only solution is to unplug the hub and then replug it again. Or, more specifically, not to use the hub but instead live without an external mouse. This is even more ridiculous when you consider that Apple was the main pioneer for driving USB adoption back in the day.

Newer laptop models are even worse. They have only USB-C connectors and each consecutive model seems to have fewer and fewer of them. Maybe their eventual goal is to have a laptop with no external connection slots, not even a battery charger port. The machine would ship from the factory pre-charged and once the juice runs out (with up to 10 hours of battery life™) you have to throw it away and buy a new one. It would make for good business.

After the introduction of the Retina display (which is awesome) the only notable hardware innovation has been the emojibar. It took the concept of function buttons and made it worse.

Full Unix support

When OSX launched it was a great Unix platform. It still is pretty much the same as it was then, but by modern standards it is ridiculously outdated. There is no Python 3 out of the box, and Python 2 is several versions behind the latest upstream release. Other tools are even worse. Perl is 5.18 from 2014 or so, Bash is 3.2 with the copyright year of 2007, Emacs is from 2014 and Vim from 2013. This is annoying even for people who don't use Macs, but just maintain software that supports OSX. Having to maintain compatibility with these sorts of stone age tools is not fun.

What is causing this dip in quality?

There are many things one could say about the current state of affairs. However there is already someone who has put it into words much more eloquently than any of us ever could. Take it away, Steve:

Post scriptum

Yes, this blog post was written on a Macbook, but it is one of the older models which were still good. I personally need to maintain a piece of software that has native support for OSX so I'm probably going to keep on using it for the foreseeable future. That being said, if someone starts selling a laptop with a RISC-V processor, a retina-level display and a matte screen, I'm probably going to be first in line to get one.

Monday, August 7, 2017

Reconstructing old game PC speaker music

Back when dinosaurs walked the earth regular PCs did not have sound cards by default. Instead they had a small piezoelectric speaker that could only produce simple beeps. The sound had a distinctive feel and was described with words such as "ear-piercing", "horrible" and "SHUT DOWN THAT INFERNAL RACKET THIS INSTANT OR SO HELP ME GOD".

The biggest limitation of the sound system was that it could only play one constant tone at a time. This is roughly equivalent to playing the piano with one finger and only pressing one key at a time. Which meant that the music in games of the era had to be simple. (Demoscene people could do crazy things with this hardware but it's not relevant for this post so we'll ignore it.)

An interesting challenge, then, is whether you could take a recording of game music of that era, automatically detect the notes that were played, reconstruct the melody and play it back with modern audio devices. It seems like a fairly simple problem and indeed there are ready made solutions for detecting the loudest note in a given block of audio data. This works fairly well but has one major problem. Music changes from one note to another seamlessly and if you just chop the audio into constant sized blocks, you get blocks with two different consecutive notes in them. This confuses pitch detectors. In order to split the sound into single note blocks you'd need to know the length of each note and you can't determine that unless you have detected the pitches.

This circular problem could probably be solved with some sort of an incremental refinement search or having a detector for blocks with note changes. We're not going to do that. Let's look at the actual waveform instead.
This shows that the original signal consists of square waves, which makes this specific pitch detector a lot simpler to write. All we need to do is to detect when the signal transitions between the "up" and "down" values. This is called a zero-crossing detector. When we add the duration of one "up" segment and the following "down" segment we have the duration of one full cycle, that is, the period. The frequency being played is the inverse of this value.
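A zero-crossing detector of this kind could be sketched in Python roughly as follows. This is not the code used for the post. It assumes the samples are centred around zero, and the function names, the 44100 Hz sample rate and the synthetic 441 Hz test tone are chosen only for illustration.

```python
def detect_cycles(samples, sample_rate):
    """Return a list of (start_time_seconds, frequency_hz), one per cycle.

    A cycle is measured from one rising zero crossing to the next, so
    it covers one "up" segment and the following "down" segment.
    """
    rising = []
    for i in range(1, len(samples)):
        # A rising crossing: previous sample below zero, this one not.
        if samples[i - 1] < 0 <= samples[i]:
            rising.append(i)

    cycles = []
    for start, end in zip(rising, rising[1:]):
        # The period is (end - start) / sample_rate seconds and the
        # frequency is its inverse.
        frequency = sample_rate / (end - start)
        cycles.append((start / sample_rate, frequency))
    return cycles


def square_wave(frequency, duration, sample_rate):
    """Generate a synthetic square wave with values +1.0 and -1.0."""
    count = int(duration * sample_rate)
    samples = []
    for i in range(count):
        phase = (i * frequency / sample_rate) % 1.0
        samples.append(1.0 if phase < 0.5 else -1.0)
    return samples


# Synthetic test signal standing in for a real recording.
wave = square_wave(441, 0.1, 44100)
cycles = detect_cycles(wave, 44100)
print(len(cycles), cycles[0])
```

On a real recording the signal would first need its DC offset removed, and noise near the transitions could produce spurious crossings, so some filtering or hysteresis would probably be needed.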

With this algorithm we can get an almost cycle-accurate reconstruction of the original sound data. The problem is that it takes a lot of space so we need to merge consecutive cycles if they are close enough to each other. This requires a bit of tolerance and guesswork since the original analog components were not of the highest quality so they have noticeable jitter in note lengths. With some polish and postprocessing you get an end result that goes something like this. Enjoy.
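The merging step might look something like the Python sketch below. It is one possible approach, not the postprocessing used for the post: consecutive cycles are grouped into one note as long as each new period stays within a relative tolerance of the note's running mean. The 3% tolerance and the function name merge_cycles are guesses made for illustration.

```python
def merge_cycles(periods, tolerance=0.03):
    """Merge consecutive cycles with similar periods into notes.

    periods: list of cycle durations in seconds, in playing order.
    Returns a list of (frequency_hz, duration_seconds), one per note.
    """
    notes = []
    count = 0    # number of cycles in the current note
    total = 0.0  # total duration of the current note in seconds
    for period in periods:
        if count > 0:
            mean = total / count
            if abs(period - mean) <= tolerance * mean:
                # Close enough: extend the current note.
                count += 1
                total += period
                continue
            # Too different: finish the current note.
            notes.append((count / total, total))
        count = 1
        total = period
    if count > 0:
        notes.append((count / total, total))
    return notes


# Thirty slightly jittery cycles near 440 Hz, then twenty at 880 Hz.
base = 1.0 / 440.0
jittery = [base * 1.01, base * 0.99, base] * 10 + [1.0 / 880.0] * 20
for frequency, duration in merge_cycles(jittery):
    print(round(frequency), round(duration, 4))
```

A tolerance that is too small splits one note into many, and one that is too large merges neighbouring notes, so the value would need tuning against the actual recording.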