Snippets update

August 9th, 2007
  • Updated DPRunLoopSource to v0.2 which adds thread-safe reference counting (Apple’s implementation isn’t thread-safe).
  • Added DPQueue - a thread-safe two-lock concurrent queue implementing the Michael & Scott algorithm.
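
For the curious, here is a rough sketch of the two-lock idea in plain C. This is not DPQueue's code; the names (queue_t, queue_enqueue, queue_dequeue) are made up for illustration, and I'm assuming pthread mutexes and C11 atomics are available. The trick is a dummy node at the head, so enqueuers only touch the tail lock and dequeuers only touch the head lock.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical names; a sketch of the two-lock queue, not DPQueue itself. */
typedef struct node {
    void *value;
    _Atomic(struct node *) next;  /* atomic: read and written under different locks */
} node_t;

typedef struct {
    node_t *head;                 /* always points at a dummy node */
    node_t *tail;
    pthread_mutex_t head_lock;    /* taken by dequeuers only */
    pthread_mutex_t tail_lock;    /* taken by enqueuers only */
} queue_t;

bool queue_init(queue_t *q) {
    node_t *dummy = malloc(sizeof *dummy);
    if (!dummy) return false;
    dummy->value = NULL;
    atomic_init(&dummy->next, NULL);
    q->head = q->tail = dummy;
    pthread_mutex_init(&q->head_lock, NULL);
    pthread_mutex_init(&q->tail_lock, NULL);
    return true;
}

bool queue_enqueue(queue_t *q, void *value) {
    node_t *n = malloc(sizeof *n);
    if (!n) return false;
    n->value = value;
    atomic_init(&n->next, NULL);
    pthread_mutex_lock(&q->tail_lock);
    /* Publish the node; release pairs with the acquire load in dequeue. */
    atomic_store_explicit(&q->tail->next, n, memory_order_release);
    q->tail = n;
    pthread_mutex_unlock(&q->tail_lock);
    return true;
}

/* Returns false if the queue is empty, otherwise stores the value in *out. */
bool queue_dequeue(queue_t *q, void **out) {
    pthread_mutex_lock(&q->head_lock);
    node_t *old_dummy = q->head;
    node_t *first = atomic_load_explicit(&old_dummy->next, memory_order_acquire);
    if (!first) {
        pthread_mutex_unlock(&q->head_lock);
        return false;
    }
    *out = first->value;
    q->head = first;              /* the first real node becomes the new dummy */
    pthread_mutex_unlock(&q->head_lock);
    free(old_dummy);
    return true;
}
```

Because the two locks are independent, one producer and one consumer never block each other; only producers contend with producers and consumers with consumers.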

Coroutines!

August 4th, 2007

After thinking about blocks for a long time, I came to realize they are simply a special case of coroutines. A quick search showed few implementations for C, so I decided to give it a shot (I only tried to get coroutines to work at this time, so no real blocks yet).

I picked the libCoroutine implementation since it’s what Io uses, and I’ve heard good things about these guys. So after a few hours of coding and about three days of debugging, here is what I came up with. First, the single-threaded version:

@interface DPCoroutineSingleThreadTests : DPTestCase {
        int32_t count;
}

@end

@implementation DPCoroutineSingleThreadTests

- (void)incByTwo {
        while (count < 10) {
                count += 2;
                DPCoroutineYield();
        }
}

- (void)decByOne {
        while (count < 10) {
                --count;
                DPCoroutineYield();
        }
}
- (void)finishSingleThreadTest {
        while (count < 10)
                DPCoroutineYield();
       
        [self setStatus:YES forPendingTest:@"testSingleThread"];
}

- (void)testSingleThread {
        count = 0;
        [self coroutine:MSG(finishSingleThreadTest)];
        [self coroutine:MSG(incByTwo)];
        [self coroutine:MSG(decByOne)];
}

@end
 

OK, let’s see what we have here. First, we define a test case with a simple counter. In our test’s beginning, we call [self coroutine:] three times, each with a different message. The -[NSObject coroutine:] method creates a coroutine with the given message and adds it to the current thread’s coroutine stack (it’s actually a queue - FIFO - but in my first attempt it was a stack).

Each thread can have as many coroutine stacks as needed, but you’ll probably never need more than one. Anyway, this stack is actually a runloop source which is automatically added to the main thread’s runloop for you (only to the default mode; for other threads you have to set this up yourself). Each time the stack gets fired by the runloop, it executes a coroutine. The coroutine then returns control to the stack when either DPCoroutineYield() is called or the method returns. If the method returned, the coroutine has ended and is removed from the stack. If the coroutine yielded, it’s added again to the bottom of the stack and will resume execution sometime in the future.

So in our example above, three coroutines are created. The first, finishSingleThreadTest, checks if the counter has reached 10. The second increases the counter by two, and the third decreases by one. Note that each yields after each step, so others get a chance to do their thing.
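
To make the scheduling concrete, here is a minimal sketch of the same round-robin idea in plain C. It is not DPCoroutine or libCoroutine: I'm assuming a platform that still ships <ucontext.h> (Linux with glibc, for example), there is no runloop involved, and every name in it (coro_spawn, coro_yield, coro_run_all, run_counter_demo) is hypothetical. The three coroutines mirror the test above.

```c
#include <stdlib.h>
#include <ucontext.h>

#define MAX_COROS  8
#define STACK_SIZE (64 * 1024)

typedef struct {
    ucontext_t ctx;
    char *stack;
    void (*fn)(void);
    int finished;
} coro_t;

static coro_t coros[MAX_COROS];
static int queue[MAX_COROS];          /* FIFO ring of coroutine indices */
static int q_head, q_len, n_coros;
static ucontext_t scheduler_ctx;
static coro_t *current;

static void trampoline(void) {
    current->fn();
    current->finished = 1;            /* returning follows uc_link to the scheduler */
}

int coro_spawn(void (*fn)(void)) {
    if (n_coros == MAX_COROS) return -1;
    coro_t *c = &coros[n_coros];
    if (!(c->stack = malloc(STACK_SIZE))) return -1;
    getcontext(&c->ctx);
    c->ctx.uc_stack.ss_sp = c->stack;
    c->ctx.uc_stack.ss_size = STACK_SIZE;
    c->ctx.uc_link = &scheduler_ctx;
    c->fn = fn;
    c->finished = 0;
    makecontext(&c->ctx, trampoline, 0);
    queue[(q_head + q_len) % MAX_COROS] = n_coros++;
    q_len++;
    return 0;
}

void coro_yield(void) {
    swapcontext(&current->ctx, &scheduler_ctx);
}

void coro_run_all(void) {
    while (q_len > 0) {
        int idx = queue[q_head];
        q_head = (q_head + 1) % MAX_COROS;
        q_len--;
        current = &coros[idx];
        swapcontext(&scheduler_ctx, &current->ctx);
        if (current->finished) {
            free(current->stack);     /* ended: drop it from the queue */
        } else {
            queue[(q_head + q_len) % MAX_COROS] = idx;  /* yielded: back of the line */
            q_len++;
        }
    }
    n_coros = 0;                      /* all slots are free again */
    q_head = 0;
}

static int count, finished_ok;

static void inc_by_two(void) { while (count < 10) { count += 2; coro_yield(); } }
static void dec_by_one(void) { while (count < 10) { --count; coro_yield(); } }
static void finish(void)     { while (count < 10) coro_yield(); finished_ok = 1; }

/* Returns the final counter, or -1 if the finish coroutine never completed. */
int run_counter_demo(void) {
    count = 0;
    finished_ok = 0;
    coro_spawn(finish);
    coro_spawn(inc_by_two);
    coro_spawn(dec_by_one);
    coro_run_all();
    return finished_ok ? count : -1;
}
```

Each pass over the queue nets +1 (plus two, minus one), so run_counter_demo() walks the counter up until the incrementing coroutine pushes it to 10 and all three loops exit.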

Coroutines allow you to write much simpler code in some cases, but they can also be used as lightweight threads. Coroutine switches (i.e. DPCoroutineYield) should be much cheaper than OS thread context switches (I haven’t profiled yet). When running in a single thread, they also save you the huge headaches that come with thread synchronization. But in case you really care about speed, DPThreadPool allows you to share coroutines across threads like this:

- (void)incByOne {
        while (OSAtomicIncrement32Barrier(&count) < 5) {
                DPCoroutineYield();
        }
}

- (void)checkCompletion {
        int32_t c = count;
       
        while (c < 5) {
                DPCoroutineYield();
                // Is this really needed?
                OSMemoryBarrier();
                c = count;
        }
       
        [pool release];
        [self setStatus:YES forPendingTest:@"testMultiThreaded"];
}

- (void)generateJobs {
        [pool sendCoroutineMessage:MSG(incByOne)
                                                        to:self];
        [pool spawnThread];
        [pool spawnThread];
       
        [pool sendCoroutineMessage:MSG(checkCompletion)
                                                        to:self];
       
}

- (void)testMultiThreaded {
        count = 0;
        pool = [[DPThreadPool alloc] init];
       
        // Let UITestingKit mark us as a pending test.
        // Yes, it’s a bug but I don’t feel like fixing it right now.
        [self receiveMessage:MSG(generateJobs)
                          afterDelay:0.1];
}
 

When you create a DPThreadPool instance, it sets up a single coroutine queue that’s shared across all of its threads. All coroutines added to the pool are then divided among the threads to share the load. Since you can’t know which thread will execute each coroutine (and often a single coroutine will be executed by many threads before it completes), the coroutine code must be thread-safe, just as if the coroutine were an OS thread.
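
Here is a small sketch of what "thread-safe as if it were an OS thread" means in practice, using plain pthreads and C11 atomics rather than DPThreadPool. Everything in it is an assumption made for illustration: the "jobs" are just counter increments, the names (worker, run_pool, N_JOBS, N_WORKERS) are hypothetical, and atomic_fetch_add stands in for OSAtomicIncrement32Barrier.

```c
#include <pthread.h>
#include <stdatomic.h>

#define N_JOBS    1000
#define N_WORKERS 4

static atomic_int next_job;   /* index of the next unclaimed job */
static atomic_int counter;    /* shared state every job touches */

/* Each worker claims jobs until none are left. Any worker may end up
 * running any job, so the job body may only touch shared state atomically. */
static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        int job = atomic_fetch_add(&next_job, 1);
        if (job >= N_JOBS)
            break;
        atomic_fetch_add(&counter, 1);   /* the "job" itself */
    }
    return NULL;
}

/* Runs all jobs on N_WORKERS threads; returns the final counter, or -1
 * if a thread could not be created. */
int run_pool(void) {
    pthread_t threads[N_WORKERS];
    int started = 0;

    atomic_store(&next_job, 0);
    atomic_store(&counter, 0);

    while (started < N_WORKERS &&
           pthread_create(&threads[started], NULL, worker, NULL) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return started == N_WORKERS ? atomic_load(&counter) : -1;
}
```

With a plain `counter++` instead of the atomic add, increments from different threads could be lost, which is exactly the kind of bug a coroutine hopping between pool threads has to guard against.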

All of this is very nice, but there’s one major gotcha: you can’t yield inside a @try/@catch block. Doing so will completely confuse Foundation’s exception handling implementation, and you probably don’t want that. I think it’s not such a big deal, but it means you must be a little careful with what calls you use in your coroutine.

My VCS

July 27th, 2007

There are many different VCS (version control systems) out there these days. Each has its own strengths and weaknesses, but they all seem to attack the problem from the wrong direction.

Today’s VCS are no longer simple versioning systems. Every self-respecting system supports branches in one way or another, and they are all targeted at groups of developers. They all advertise themselves as some form of content tracker; the most notable is probably Git, which shouts that it’s “a dumb content tracker” at anyone who agrees to listen (don’t get me wrong, I like Git).

But isn’t it ironic that none of these systems really track our real content? Let’s look at Git again for that matter. It tracks the content of files from the machine’s perspective, i.e. the actual bytes of files. There’s nothing wrong with this except it’s not the content I’m interested in. When I add a sentence to my text file in my text editor I’m NOT adding a bunch of bits to the file. I’m adding a sentence and that’s it. I shouldn’t care how my editor writes it, how the underlying OS stores it, and what kind of dark magic my VCS is going to use in order to merge it with some other guy’s sentence he added in his own branch. Bottom line is my content is the sentence I added in the context in which it was written.

So any system which tries to interpret my sentence as a bunch of bits added to or removed from a file is simply wrong. And if it later tries to merge my work with someone else’s, a successful merge is simply good luck as far as I’m concerned. It’s as if I tried to merge two texts written in Japanese, which I can’t read, write, speak or understand. Honestly, when was the last time you tried to merge two texts you can’t read?

Now let’s take a look at what a smart system that tracks only what it understands would look like. Let’s say I’m creating an image in Photoshop with a bunch of layers and commit. Now my co-worker creates a branch with that image and both of us update it independently. I’m changing the text in layer 3 while he adds a shadow to this same layer. Both of us now save and commit, and now we want to merge. Since our system understands our content the merge is simple. The new image will have my text in layer 3 plus his shadow. Simple. But how would you do it if you track only the raw bits of the file? If the image format used a very specific structure you might be able to get away with existing merge algorithms, but I wouldn’t count on it.

Now let’s push the idea a bit further. Let’s say we work on a coding project and we have an IDE that’s aware of our language and supports refactoring. In my branch I rename function1 to functionFoo. My IDE automagically changes all calls to function1 to functionFoo. Yay for refactoring! Meanwhile, my co-worker in his branch adds a bunch of calls to function1, but he’s not aware of my rename. When we merge our work, my VCS will see the changes this way: my branch - renamed function1 to functionFoo. His branch - added function2 and function3, both of which invoke function1 at some point. But as our system knows what it tracks, the merge is again very simple. First add his work to mine. Then apply the rename of function1 to functionFoo on his added work. The opposite is also valid for that matter - rename function1 to functionFoo in his branch and then take function2 and function3 and add them to my branch.

This can also be applied to more complex situations. Take the following C code for example:

void doSomething(int *i) {
  int a = 2;
  *i = a;
}

Again, my co-worker and I work on this same function. I change it to this:

void doSomething(int *i) {
  int b = 3;
  *i = b;
}

And he adds the following:

void doSomething(int *i) {
  int a = 2;
  *i = *i == 1 ? a : foo(a);
}

Luckily, we have my VCS to rescue us. What I did was to change the initial value of a and rename it to b, while he added a ternary operator. Again the merge is simple:
1. Change the initial value of a.
2. Add the ternary operator.
3. Rename a to b.

void doSomething(int *i) {
  int b = 3;
  *i = *i == 1 ? b : foo(b);
}
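
The "rename" step in both merges is the part a byte-level diff can't express, so here is a rough C sketch of replaying a recorded rename on somebody else's code. It is only an illustration: rename_identifier is a hypothetical helper, and it uses a naive notion of "identifier" (letters, digits and underscores) instead of a real parser, so it would happily rename inside strings and comments too.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static int is_ident_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

/* Copies src into dst, replacing every whole identifier equal to `from`
 * with `to`. Returns the number of renames, or -1 if dst is too small. */
int rename_identifier(const char *src, const char *from, const char *to,
                      char *dst, size_t dst_size) {
    size_t from_len = strlen(from), to_len = strlen(to), out = 0;
    int renamed = 0;
    const char *p = src;

    if (dst_size == 0)
        return -1;

    while (*p) {
        if (is_ident_char(*p)) {
            /* Scan one whole token so "function10" never matches "function1". */
            const char *start = p;
            while (is_ident_char(*p))
                p++;
            size_t len = (size_t)(p - start);
            const char *text = start;
            if (len == from_len && memcmp(start, from, len) == 0) {
                text = to;
                len = to_len;
                renamed++;
            }
            if (out + len >= dst_size)
                return -1;
            memcpy(dst + out, text, len);
            out += len;
        } else {
            if (out + 1 >= dst_size)
                return -1;
            dst[out++] = *p++;
        }
    }
    dst[out] = '\0';
    return renamed;
}
```

Replaying "rename a to b" on my co-worker's line `*i = *i == 1 ? a : foo(a);` gives `*i = *i == 1 ? b : foo(b);`, which is step 3 of the merge above. A diff-based tool never gets to do this because it only sees the before and after bytes, not the rename itself.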

The key here is what makes all VCS fundamentally broken. They try to work from the outside rather than from the inside. The above can only be done if the history of the changes is generated in real time while we make them. There’s simply no way of figuring out what happened in these two examples unless we were watching the edits while they happened, or someone told us explicitly what happened.

Fortunately, we already have that mechanism in almost every self-respecting app, but we never really realized it. Remember that undo/redo menu? ;-) So if my editors agree to cooperate with my VCS, the world will be a better place, at least for me.

This idea is anything but new. In OS X the system already has undo support out of the box for CoreData-based applications. So if we can just convince CoreData to export its undo history in a format my VCS can read (and hopefully modify), I believe about half of existing OS X apps will support my VCS out of the box. I don’t use other platforms, but everyone out there does undo/redo somehow, so they’ll just have to output something my VCS can read if they want to work with me.

Now if only someone would step up and develop this…

hdiutil causes kernel panic?!

July 18th, 2007

So this morning I tried to create a bzip2 compressed image with the command hdiutil create -srcfolder /path/to/dir -format UDBZ myImage.dmg. Return. Then to my great surprise a kernel panic. WTF?! I know disk images are powered by the kernel at some level, but a kernel panic with such a simple thing??

Anyway, after a restart the above command worked once or twice, and then started to shout “load_hdi: timed out waiting for driver to load” at me without accepting ^C.
A quick look in System Profiler shows no IOHDIXController extension, which is (according to man) what load_hdi is supposed to load.

So does anyone know what’s going on? BTW, the kernel panic log doesn’t show much, at least for me, but here it goes:
Unresolved kernel trap(cpu 0): 0x400 - Inst access DAR=0x000000000F916014 PC=0x0000000000000000
Latest crash info for cpu 0:
Exception state (sv=0x2772AA00)
PC=0x00000000; MSR=0x40009030; DAR=0x0F916014; DSISR=0x40000000; LR=0x000912A8; R1=0x1242BCD0; XCP=0x00000010 (0x400 - Inst access)
Backtrace:
0x00000003 0x00091514 0x00044C18 0x0002921C 0x000233F8 0x000ABCAC
0xFFFFDF55
Proceeding back via exception chain:
Exception state (sv=0x2772AA00)
previously dumped as "Latest" state. skipping...
Exception state (sv=0x2772A780)
PC=0x9000B348; MSR=0x0200F030; DAR=0x0F916014; DSISR=0x42000000; LR=0x9000B29C; R1=0xBFFFF240; XCP=0x00000030 (0xC00 - System call)

Kernel version:
Darwin Kernel Version 8.10.0: Wed May 23 16:50:59 PDT 2007; root:xnu-792.21.3~1/RELEASE_PPC
panic(cpu 0 caller 0xFFFF0004): 0x400 - Inst access
Latest stack backtrace for cpu 0:
Backtrace:
0x000952D8 0x000957F0 0x00026898 0x000A8004 0x000AB980
Proceeding back via exception chain:
Exception state (sv=0x2772AA00)
PC=0x00000000; MSR=0x40009030; DAR=0x0F916014; DSISR=0x40000000; LR=0x000912A8; R1=0x1242BCD0; XCP=0x00000010 (0x400 - Inst access)
Backtrace:
0x00000003 0x00091514 0x00044C18 0x0002921C 0x000233F8 0x000ABCAC
0xFFFFDF55
Exception state (sv=0x2772A780)
PC=0x9000B348; MSR=0x0200F030; DAR=0x0F916014; DSISR=0x42000000; LR=0x9000B29C; R1=0xBFFFF240; XCP=0x00000030 (0xC00 - System call)

Kernel version:
Darwin Kernel Version 8.10.0: Wed May 23 16:50:59 PDT 2007; root:xnu-792.21.3~1/RELEASE_PPC
*********

A binary distribution of Git

June 29th, 2007

For anyone interested, I’ve built a UB version of expat and git and packaged them in an installer package (cogito is included as well). Please let me know if you find it useful :)

The package can be downloaded here. It includes expat 2.0.1, git 1.5.2.2 and cogito 0.18.2. Don’t forget to add /usr/local/bin to your path after installation is complete.

git-update-infoplist

June 8th, 2007

git-update-infoplist now has its own page.

UITestingKit v0.1RC2

June 7th, 2007

0.1 RC2 is now up. Probably the most noticeable changes are better handling of paths passed to TestsRunner and support for running async tests more than once. The complete change log is available here.

Download

So what else can we do with futures?

May 29th, 2007

After adding an API for a worker thread, let’s see what else can be done with HOM+futures.

First of all, there’s the future method described in Marcel Weiher’s paper that takes a message and invokes it with the receiver in a new thread. When the message returns, the thread simply terminates. And here it is:

@interface NSObject (DPAsyncMessaging)
- (id)future:(DPMessage *)msg;
@end
 

That’s easy, but we can create more complicated systems as well. Like a pool of worker threads. Say hello to DPThreadPool: Read the rest of this entry »

Who likes multi-threading?

May 27th, 2007

I’m in no way a multi-threading expert, but here is my attempt to utilize HOM for easy multi-threaded code.

In the spirit of -[NSObject performSelectorOnMainThread:] I added a bunch of methods to NSThread:

@interface NSThread (DPWorkerThread)
+ (id)detachNewWorkerThread;
- (void)terminate;
- (id)sendMessage:(DPMessage *)msg to:(id)obj;
@end

The first method spawns a new thread and sets up everything a worker thread needs. This includes an autorelease pool, a runloop, and a queue of invocations. The second method simply terminates the worker thread, or does nothing if the thread wasn’t created with +detachNewWorkerThread. And finally, -sendMessage:to: is the really interesting part.

-[NSThread sendMessage:to:] does a bunch of useful things. First, it creates an invocation from the message and the object. Then it adds the invocation to the queue of the thread, and returns a future if appropriate (explained below). The thread’s runloop then empties the invocations queue when it has the time, and our message gets sent in that thread. Easy, right?

Now let’s get back to that future we got in the caller thread. What is a future you ask? Well basically, a future is an object that doesn’t yet exist. That object is usually the result of a long computation that hasn’t yet completed (when talking about an eager future). Check out the page at wikipedia for more info.

So the call to -[NSThread sendMessage:to:] returned us a future if it found it appropriate, or in other words, if the message returns an object. That future is a proxy that stands for the return value of our message. When we first attempt to message the future, it’ll check if the result of our message is available. If not, it’ll block the caller thread until it is. Once the result is ready, it’ll just forward any message to it.
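
Stripped of the proxy magic, the blocking behaviour described above boils down to a flag guarded by a mutex and a condition variable. Here is a minimal C sketch of that, assuming pthreads; it is not the HOM3 code, and future_t, future_start, future_get, future_destroy and square are all names I made up for the example.

```c
#include <pthread.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready_cond;
    int    ready;                 /* set once the result is available */
    void  *value;                 /* the eventual result */
    void *(*fn)(void *);          /* the "message" to run */
    void  *arg;
    pthread_t thread;
} future_t;

static void *future_main(void *p) {
    future_t *f = p;
    void *result = f->fn(f->arg); /* the long computation */
    pthread_mutex_lock(&f->lock);
    f->value = result;
    f->ready = 1;
    pthread_cond_broadcast(&f->ready_cond);
    pthread_mutex_unlock(&f->lock);
    return NULL;
}

/* Starts fn(arg) in a new thread. Returns 0 on success. */
int future_start(future_t *f, void *(*fn)(void *), void *arg) {
    pthread_mutex_init(&f->lock, NULL);
    pthread_cond_init(&f->ready_cond, NULL);
    f->ready = 0;
    f->value = NULL;
    f->fn = fn;
    f->arg = arg;
    return pthread_create(&f->thread, NULL, future_main, f);
}

/* Blocks the caller until the result exists, then returns it. */
void *future_get(future_t *f) {
    pthread_mutex_lock(&f->lock);
    while (!f->ready)
        pthread_cond_wait(&f->ready_cond, &f->lock);
    void *v = f->value;
    pthread_mutex_unlock(&f->lock);
    return v;
}

/* Call only after future_start succeeded. */
void future_destroy(future_t *f) {
    pthread_join(f->thread, NULL);
    pthread_mutex_destroy(&f->lock);
    pthread_cond_destroy(&f->ready_cond);
}

/* Example computation: squares the int it is given, in place. */
static void *square(void *arg) {
    int *n = arg;
    *n = *n * *n;
    return n;
}
```

The caller does `future_start(&f, square, &n)`, carries on with other work, and only pays for the wait when it finally calls `future_get(&f)`. The Objective-C version hides that last call behind message forwarding, which is what makes the future look like the real object.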

But the API is not limited to this use case only. An invocation queue can be shared between any number of threads in order to easily split the work across CPUs. It can also be used to implement a special proxy that’ll execute any message it receives in another thread, and probably many other crazy ideas I didn’t think of ;)

It hasn’t been tested a lot yet, but the code is in the HOM3 repository.

Embedding Git’s commit ID in Info.plist

May 25th, 2007

It has become almost standard to include your Subversion revision number with your built product. It’s usually done by appending the revision number to the CFBundleVersion key, which unfortunately can’t be done with Git (or any other VCS that uses hash codes as commit IDs). But instead, we can add the commit ID as a separate key to our plist, which we can later use to identify the build and/or display to the user. Read the rest of this entry »