Sam Boyer

Tag1 Quo: Finding a Place in the (Version) Universe, Part 2

When we left off last time, we’d assembled a definition of what versions are. Now, we’re going to dive into how we use them in Tag1 Quo: comparing them to one another!

The general goal is straightforward enough: we want to know if, say, 6.x-1.0 is less than 6.x-1.1. (Yup!) Or if 6.x-1.0-alpha1 is less than 6.x-1.0. (Also yup!) Let’s rewrite these two examples as tuple comparisons:

{6,1,0,4,0,0} < {6,1,1,4,0,0} = TRUE
{6,1,0,1,1,0} < {6,1,0,4,0,0} = TRUE

To determine if one tuple is less than the other, we proceed pairwise through the tuple’s values, comparing the integers at the same position from each, until we find different values. Whichever tuple’s value at that position is less is considered to be the lesser version. (Uniformity in this comparison operation is why the mapping for prerelease types assigns unstable to 0, rather than 4.)
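The pairwise walk described above can be sketched in a few lines of Python. This is only an illustration, not Quo's actual implementation: the function name `version_lt` is made up for this example, and the tuples are written as Python tuples rather than in the brace notation used here.

```python
def version_lt(a, b):
    """Return True if version tuple a sorts strictly before b."""
    if len(a) != len(b):
        raise ValueError("version tuples must have the same length")
    # Walk both tuples position by position.
    for x, y in zip(a, b):
        if x != y:
            # The first differing position decides the result.
            return x < y
    # No position differed: the versions are equal, so a is not less.
    return False


# 6.x-1.0 vs. 6.x-1.1: they first differ at the third position (0 < 1).
print(version_lt((6, 1, 0, 4, 0, 0), (6, 1, 1, 4, 0, 0)))  # True
# A prerelease type of 0 (unstable) sorts before 4 (stable).
print(version_lt((6, 1, 0, 0, 0, 0), (6, 1, 0, 4, 0, 0)))  # True
```

Python's built-in tuple comparison performs the same lexicographic walk, so for equal-length tuples a plain `a < b` gives the same answer.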

However, this simple comparison operation doesn’t actually meet Quo’s requirements. Remember, Quo’s crucial question is not whether there are any newer versions, but whether there are newer security releases that are likely to apply to the version we’re investigating.

So, say we’re looking at 6.x-1.1 for a given extension, and there exists a 7.x-2.2 that’s a security release. While the former is obviously less than the latter:

{6,1,1,4,0,0} < {7,2,2,4,0,0} = TRUE

We don’t care, because these releases are on totally different lines of development.

...right? I mean, it’s probably true that whatever security hole existed in 7.x-2.1 doesn’t exist in 6.x-1.1. Maybe? Sort of. Certainly, you can't upgrade to 7.x-2.1 directly from 6.x-1.1, as that's changing core versions. But Quo came to be as part of the D6LTS promise - that IF there are security holes in later versions, we'll backport the fixes to 6.x - so it's certainly possible that the problem might still exist. It all depends on what you take these version numbers to mean.

Yeah, we need to take a detour.

Versions are meaningless

As you become accustomed to a version numbering scheme - Drupal, semver, rpm, whatever - the meanings of the version components gradually work their way to the back of your mind. You don’t really “read” versions, so much as “scan and decode” them, according to these osmosed semantics. This peculiar infection of our subconscious makes it far too easy to forget a simple fact:

Version numbers have absolutely no intrinsic meaning. They have no necessary relationship to the code they describe.

Maybe this is obvious. Maybe it isn’t. If not, consider: what would prevent you from writing a module for Drupal 7 APIs, but then tagging and releasing it as 8.x-1.0? Or, for that matter, writing a module that has no functions, but prints “spork” on inclusion of its .module file? (Answer: nothing.) Also, Donald Knuth uses a π-based numbering system for TeX’s versions, adding one more digit with each successive release. The version looks like a number, but the only property that matters is its length. Versions are weird.

This nebulous relationship is both the blessing and curse of versions. The curse is obvious: we can’t actually know anything with certainty about code just by looking at, or comparing, version numbers. But the blessing is more subtle: a well-designed version numbering system provides a framework for consistently encoding all of our intended semantics, together. Both of those words have specific meaning here:

  • “Together,” as in, it combines all the different aspects of changes to code that are important for Quo’s purposes: independent lines of development, Drupal core version compatibility, D6LTS’ own patch addenda, etc.

  • “Consistent,” as in, a numerical coördinate system - rather than an ad-hoc collection of flags, strings, and numbers - is a formal mathematical system without weird, nasty combinations of states.

The blessing outweighs the curse because, even if versions may lie to us about what the code actually is, they provide a formal structure in which it’s easy to understand what it should be. And, in the wild west of organic open source software growth, knowing with certainty what things should be is a pretty good goal. It makes tasks concrete enough that you can actually build a business and product - like Tag1 Quo! Which takes us back to the main road after our detour - what’s the answer to this question?

{6,1,1,4,0,0} < {7,2,2,4,0,0}

The strictly mathematical answer is “yes.” But, for the question we’re actually interested in, we generally assume that security releases are only necessary when they’re on both the same core version and the same line of development (major version). So, we say “no” here. And we’d also say “no” if the core version were the same:

{6,1,1,4,0,0} < {6,2,2,4,0,0}

This one is a little iffier, though. While porting from one Drupal core version to the next almost always involves a significant rewrite, that’s not necessarily the case for major versions. The security release may actually apply. It’s the kind of thing we need to investigate when deciding whether or not to release our D6LTS patch versions.

Today, Quo assumes security releases on different lines of development aren’t applicable to one another, but what’s important is that we know that’s an assumption. By representing the versions in a fully abstracted coördinate system as we have, rather than (for example) formally encoding assumptions about “lines of development” into the data itself, we allow ourselves the flexibility of changing those assumptions if they turn out to be wrong. Since Quo’s entire business proposition turns on answering questions about versions correctly, it pays to build on a clear, flexible foundation.
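To make that separation concrete, here is a small Python sketch in which the “line of development” assumption lives in a swappable policy function rather than in the version data. Every name in it (`same_core_and_major`, `same_core`, `security_release_applies`) is hypothetical and chosen for illustration; it is not how Quo is actually built. It assumes only what the tuples in this post show: the core version sits in the first position and the major version in the second.

```python
from typing import Callable, Tuple

Version = Tuple[int, ...]
# A policy decides whether two versions share a line of development.
Policy = Callable[[Version, Version], bool]


def same_core_and_major(a: Version, b: Version) -> bool:
    """Today's assumption: same core version AND same major version."""
    return a[:2] == b[:2]


def same_core(a: Version, b: Version) -> bool:
    """A looser alternative: any major version on the same core."""
    return a[0] == b[0]


def security_release_applies(
    installed: Version,
    release: Version,
    policy: Policy = same_core_and_major,
) -> bool:
    """True if `release` is newer than `installed` and the policy
    considers the two to be on the same line of development."""
    # Plain tuple comparison gives the "is it newer?" answer;
    # the policy supplies the assumption about applicability.
    return policy(installed, release) and installed < release


installed = (6, 1, 1, 4, 0, 0)  # 6.x-1.1

# Different core version: newer, but not applicable.
print(security_release_applies(installed, (7, 2, 2, 4, 0, 0)))  # False
# Same core, different major: not applicable under today's assumption...
print(security_release_applies(installed, (6, 2, 2, 4, 0, 0)))  # False
# ...but applicable if the assumption is relaxed.
print(security_release_applies(installed, (6, 2, 2, 4, 0, 0), same_core))  # True
# Same core and same major: applicable.
print(security_release_applies(installed, (6, 1, 2, 4, 0, 0)))  # True
```

Because the tuples carry no notion of “line of development,” changing the assumption means passing a different policy, not rewriting stored versions.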

This post rounded out the broader theory and big-picture considerations for versions in Quo. In the next post, I’ll get more into the nitty gritty - how Quo implements these ideas in a way that is practical, fast, and scalable.