There are many different version control systems (VCSs) out there these days. Each has its own strengths and weaknesses, but they all seem to attack the problem from the wrong direction.
Today’s VCSs are no longer simple versioning systems. Every self-respecting system supports branches in one way or another, and they are all aimed at groups of developers. They all present themselves as some form of content tracker; the most notable is probably Git, which shouts its “the stupid content tracker” tagline at anyone who will listen (don’t get me wrong, I like Git).
But isn’t it ironic that none of these systems really tracks our real content? Let’s look at Git again. It tracks the content of files from the machine’s perspective, i.e. the actual bytes of the files. There’s nothing wrong with this, except it’s not the content I’m interested in. When I add a sentence to my text file in my text editor I’m NOT adding a bunch of bits to the file. I’m adding a sentence and that’s it. I shouldn’t care how my editor writes it, how the underlying OS stores it, or what kind of dark magic my VCS is going to use to merge it with a sentence some other guy added in his own branch. Bottom line: my content is the sentence I added, in the context in which it was written.
So any system which tries to interpret my sentence as a bunch of bits added to or removed from a file is simply wrong. And if it later tries to merge my work with someone else’s, a successful merge is simply good luck as far as I’m concerned. It’s as if I tried to merge two texts written in Japanese, which I can’t read, write, speak or understand. Honestly, when was the last time you tried to merge two texts you can’t read?
Now let’s take a look at what a smart system that tracks only what it understands would look like. Let’s say I create an image in Photoshop with a bunch of layers and commit it. Now my co-worker creates a branch with that image and both of us update it independently. I change the text in layer 3 while he adds a shadow to the same layer. Both of us save and commit, and now we want to merge. Since our system understands our content, the merge is simple: the new image will have my text in layer 3 plus his shadow. But how would you do it if you tracked only the raw bits of the file? If the image format used a very specific structure you might be able to get away with existing merge algorithms, but I wouldn’t count on it.
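To make the layer example a bit more concrete, here is a minimal sketch of a property-by-property three-way merge. The `Layer` struct and `merge_layer` function are made up for illustration; a real image format has far more properties, and this is only one way such a merge could work.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, heavily simplified model of a single image layer. */
typedef struct {
    char text[32];
    bool has_shadow;
} Layer;

/* Three-way merge of one layer, one property at a time.
   A property is taken from whichever side changed it relative to base.
   Returns false when both sides changed the text in different ways. */
bool merge_layer(const Layer *base, const Layer *mine,
                 const Layer *theirs, Layer *out)
{
    assert(base && mine && theirs && out);

    bool mine_text   = strcmp(base->text, mine->text) != 0;
    bool theirs_text = strcmp(base->text, theirs->text) != 0;
    if (mine_text && theirs_text && strcmp(mine->text, theirs->text) != 0)
        return false; /* a real conflict: a human has to decide */
    strcpy(out->text, mine_text ? mine->text : theirs->text);

    /* For a boolean, two sides that both changed it must agree. */
    bool mine_shadow = mine->has_shadow != base->has_shadow;
    out->has_shadow = mine_shadow ? mine->has_shadow : theirs->has_shadow;
    return true;
}
```

With a base layer of text "Hello" and no shadow, my version changing the text and his version turning the shadow on, `merge_layer` succeeds and produces a layer with my text and his shadow.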
Now let’s push the idea a bit further. Let’s say we work on a coding project and we have an IDE that’s aware of our language and supports refactoring. In my branch I rename function1 to functionFoo. My IDE automagically changes all calls to function1 into calls to functionFoo. Yay for refactoring! Meanwhile, my co-worker, in his branch, adds a bunch of calls to function1, unaware of my rename. When we merge our work, my VCS will see the changes this way: my branch - renamed function1 to functionFoo. His branch - added function2 and function3, both of which invoke function1 at some point. But as our system knows what it tracks, the merge is again very simple. First add his work to mine. Then apply the rename of function1 to functionFoo to his added work. The opposite order is just as valid - rename function1 to functionFoo in his branch and then take function2 and function3 and add them to my branch.
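Here is a rough sketch of that merge, assuming the VCS stores each function as a name plus the names it calls, and stores my change as a rename operation rather than as a text diff. The `Function` struct, `apply_rename` and `merge_branches` are hypothetical names invented for this example.

```c
#include <assert.h>
#include <string.h>

#define NAME_LEN 32
#define MAX_CALLS 4

/* Hypothetical model: a function is its name plus the names it calls. */
typedef struct {
    char name[NAME_LEN];
    char calls[MAX_CALLS][NAME_LEN];
    int n_calls;
} Function;

static void rename_if_match(char *name, const char *from, const char *to)
{
    if (strcmp(name, from) == 0) {
        assert(strlen(to) < NAME_LEN); /* guard the fixed-size buffer */
        strcpy(name, to);
    }
}

/* Replay a recorded rename over definitions and call sites alike. */
void apply_rename(Function *funcs, int n, const char *from, const char *to)
{
    for (int i = 0; i < n; i++) {
        rename_if_match(funcs[i].name, from, to);
        for (int j = 0; j < funcs[i].n_calls; j++)
            rename_if_match(funcs[i].calls[j], from, to);
    }
}

/* Merge: copy my functions, append the functions he added, then replay
   my rename over the combined set. Returns the number of merged functions. */
int merge_branches(const Function *mine, int n_mine,
                   const Function *his_added, int n_his,
                   const char *from, const char *to,
                   Function *out, int out_cap)
{
    assert(n_mine + n_his <= out_cap);
    memcpy(out, mine, (size_t)n_mine * sizeof *out);
    memcpy(out + n_mine, his_added, (size_t)n_his * sizeof *out);
    apply_rename(out, n_mine + n_his, from, to);
    return n_mine + n_his;
}
```

Because the rename is replayed after his functions are added, his new calls to function1 come out as calls to functionFoo without anyone touching them by hand.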
This can also be applied to more complex situations. Take the following C code for example:
void doSomething(int *i) {
    int a = 2;
    *i = a;
}
Again, my co-worker and I work on this same function. I change it to this:
void doSomething(int *i) {
    int b = 3;
    *i = b;
}
And he adds the following:
void doSomething(int *i) {
    int a = 2;
    *i = *i == 1 ? a : foo(a);
}
Luckily, we have my VCS to rescue us. What I did was to change the initial value of a and rename it to b, while he added a ternary operator. Again the merge is simple:
1. Change the initial value of a.
2. Add the ternary operator.
3. Rename a to b.
void doSomething(int *i) {
    int b = 3;
    *i = *i == 1 ? b : foo(b);
}
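The three steps can be sketched as recorded operations replayed on a structured model of the function body. Everything here is an assumption made for illustration: the `FuncModel` struct, the `Op` record, and the convention that `$` in the right-hand side stands for "the local variable", so that a rename never has to touch the expression text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical structured model of doSomething's body.
   '$' in rhs marks a reference to the local variable. */
typedef struct {
    char var_name[16];
    int  init_value;
    char rhs[64];
} FuncModel;

typedef enum { OP_SET_INIT, OP_SET_RHS, OP_RENAME } OpKind;

/* One recorded edit, as an editor's undo history might describe it. */
typedef struct {
    OpKind kind;
    int value;        /* used by OP_SET_INIT */
    const char *text; /* used by OP_SET_RHS and OP_RENAME */
} Op;

void apply_op(FuncModel *f, const Op *op)
{
    switch (op->kind) {
    case OP_SET_INIT:
        f->init_value = op->value;
        break;
    case OP_SET_RHS:
        assert(strlen(op->text) < sizeof f->rhs);
        strcpy(f->rhs, op->text);
        break;
    case OP_RENAME:
        assert(strlen(op->text) < sizeof f->var_name);
        strcpy(f->var_name, op->text);
        break;
    }
}

/* Render the model back to source text, expanding '$' to the variable name.
   The asserts guard against output that does not fit in the buffer. */
void render(const FuncModel *f, char *out, size_t size)
{
    int n = snprintf(out, size, "int %s = %d;\n*i = ",
                     f->var_name, f->init_value);
    assert(n >= 0 && (size_t)n < size);
    size_t len = (size_t)n;

    for (const char *p = f->rhs; *p != '\0'; p++) {
        if (*p == '$')
            n = snprintf(out + len, size - len, "%s", f->var_name);
        else
            n = snprintf(out + len, size - len, "%c", *p);
        assert(n >= 0 && (size_t)n < size - len);
        len += (size_t)n;
    }

    n = snprintf(out + len, size - len, ";\n");
    assert(n >= 0 && (size_t)n < size - len);
}
```

Starting from a model of the original body (variable a, initial value 2, right-hand side `$`), applying "set initial value to 3", "set right-hand side to `*i == 1 ? $ : foo($)`" and "rename to b" renders the merged body shown above. In this toy model the three operations touch different fields, so they can be replayed in any order.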
The key here is also what makes every existing VCS fundamentally broken: they try to work from the outside rather than from the inside. The merges above can only be done if the history of the changes is generated in real time, while we make them. There’s simply no way of figuring out what happened in these two examples unless we were watching the edits as they happened, or someone told us explicitly what happened.
Fortunately, we already have that mechanism in almost every self-respecting app. We just never really realized it. Remember that undo/redo menu?
So if my editors agree to cooperate with my VCS, the world will be a better place, at least for me.
This idea is anything but new. On OS X the system already provides undo support out of the box for Core Data based applications. So if we can just convince Core Data to export its undo history in a format my VCS can read (and hopefully modify), I believe about half of the existing OS X apps will support my VCS out of the box. I don’t use other platforms, but everyone out there does undo/redo somehow, so they’ll just have to output something my VCS can read if they want to work with me.
Now if only someone would step up and develop this…