Easy Endian-ness

January 25th, 2006

A lot of my readers have noticed that this blog covers extremely geeky programming topics that often fly right over the heads of less technically-submerged people. It’s not an intelligence thing – it’s all just jargon and experience. So every once in a while I’d like to pull my head out of my assumptions about who is reading the blog, and tackle something considerably more “simple.” Topics that you must understand if you’re a developer, and may want to understand as a curious non-developer.

Tim Gaden points out on his Hawk Wings blog that the term “endian-ness” has become widely used in the Mac community, and it’s leaking out of the labs and into common conversation with customers. He astutely observes that many people haven’t got the foggiest idea what it means.

In the context you’ve probably been hearing it lately, it has to do with differences between two computer chip architectures – particularly Intel and PowerPC.

PowerPC is “big-endian” while Intel is “little-endian”.

A hexadecimal (hex, base 16) number uses the numerals 0-9 (like decimal) but adds the letters A-F, so it can represent larger numbers with fewer characters. Other than that, it’s constructed like a normal decimal number, when talked about and written by humans. For instance, here’s a hex number:

0x11ff

(The 0x at the beginning is common shorthand for “this is a hex number”).

On a PowerPC system, that is exactly how the number is stored in memory. The “Big End” is the end farthest to the left. This is what you’re used to with decimal numbers. It’s why 3,000,000 dollars is a lot of money: the characters that “mean a lot” are on the left. On an Intel machine, the number is stored “with its little end in first” – so it looks backwards:

0xff11
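For the developers in the audience, here is a minimal C sketch of how you might peek at that layout yourself. The helper names are my own inventions for illustration, and it assumes a 16-bit value like the one above.

```c
#include <stdint.h>
#include <string.h>

/* Copy the two bytes of a 16-bit value into out[], in the order
   they actually sit in memory on the machine running this code. */
static void bytes_in_memory(uint16_t value, unsigned char out[2])
{
    memcpy(out, &value, sizeof value);
}

/* Returns 1 if the little end is stored first (Intel-style),
   0 if the big end is stored first (PowerPC-style). */
static int host_is_little_endian(void)
{
    unsigned char b[2];
    bytes_in_memory(0x11ff, b);
    return b[0] == 0xff;
}
```

On a big-endian machine the two bytes come back as 0x11 then 0xff. On a little-endian machine they come back as 0xff then 0x11.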

As a very practical example, let’s say I run a program that stores a directory of all my friends. It was written for a PowerPC computer, and as part of its data format, writes the total count of friends as a number:

0x0001

You don’t have to be an expert with hex numbers to guess that the “logical value” of this number is ONE. Now, what happens when this program is run on an Intel machine? If it just reads it verbatim from disk, it ends up looking exactly the same in memory as it did on the PowerPC, but it means something completely different. Since the “little end” goes in first on an Intel machine, the number is interpreted backwards, and becomes equivalent to this PowerPC-based value:

0x0100

So instead of 1 friend, the program running on Intel thinks I have 256 friends. Hey, I like this byte-order problem! But when the program proceeds to try reading those 255 non-existent friends from disk, the data format is hosed, and the application either crashes or behaves very strangely.

The solution for most of these problems is something called “byte swapping.” This makes it the responsibility of the programmer to ensure that bytes that went to disk on whatever architecture come back into memory in the appropriate format for the current chip. A byte on a computer is exactly the amount of data that uses up two characters in a hex number. So using the above example, the bytes in question are “0x00” and “0x01”. If the bytes went to disk in big-endian format (0x0001), they need to “trade places” when they’re read back on Intel, so that they still mean ONE in little-endian (0x0100).
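A byte swap for a two-byte number is about as small as code gets. This is only a sketch, with a function name I made up. Real projects typically lean on whatever swapping routines their platform already provides.

```c
#include <stdint.h>

/* Make the two bytes of a 16-bit value trade places:
   0x0001 becomes 0x0100, and swapping again restores 0x0001. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

Applied to the friend count, the swap turns the misread 256 (0x0100) back into the intended 1 (0x0001).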

Apple has done a lot of the work for developers in this transition. Thanks to the growing use of highly abstracted data formats, Apple was able to handle the grunt work for things like preferences storage automatically. But for developers with custom data types, who have not planned for endian-ness issues, the announcement that Apple would be moving to Intel was a major wake-up call. They’d have to revamp their data storage and retrieval strategy so that they were always capable of “doing the right thing” regardless of the architecture. For most Mac developers, this means “assume it’s always big-endian, and byte swap if necessary.” This assumption might change over time if little-endian processors like Intel’s end up being what the Mac sticks with. But for the time being, assuming big-endian means that the data formats can be passed seamlessly between existing applications and their Intel-savvy counterparts.
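One way to honor the “always big-endian on disk” rule without ever asking which chip you are on is to build the value a byte at a time. Note that this is a different technique from a conditional swap: it never swaps at all. It is a sketch with hypothetical function names, assuming the same two-byte friend count as above.

```c
#include <stdint.h>

/* Write a 16-bit count into buf with the big end first,
   regardless of how the host stores numbers in memory. */
static void store_be16(unsigned char buf[2], uint16_t v)
{
    buf[0] = (unsigned char)(v >> 8);    /* most significant byte */
    buf[1] = (unsigned char)(v & 0xff);  /* least significant byte */
}

/* Read a big-endian 16-bit count back out of buf. */
static uint16_t load_be16(const unsigned char buf[2])
{
    return (uint16_t)((buf[0] << 8) | buf[1]);
}
```

Because the shifts operate on the numeric value rather than on its memory layout, the same source produces the same bytes on PowerPC and Intel.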

The length of even this “easy” overview of endian-ness proves that it’s actually a complex and difficult concept. I have barely scratched the surface but I hope this helps put things into a bit more perspective. The next time you hear geeks yammering on about endian-ness, perhaps you’ll have a quip or two to interject!

Posted in Intel | 23 Comments »

WordPress 2.0

January 24th, 2006

If you’re reading this, then the upgrade to WordPress 2.0 worked.

Of course it worked! Those guys (and gals?) are flippin’ brilliant. I had been putting off doing the upgrade because I’m busy, and it’s always daunting to deal with upgrading web packages. Migrating database formats, making sure plugins still work, etc., etc. Well, I’m not saying that WordPress users never face those problems, but the WP install/upgrade process is so easy and surprisingly error-free … it’s like using a Mac!

I literally just renamed my old blog directory “blog.bak”, dropped in the new WordPress 2.0 folder, copied my tweaker plugins and theme over, and went to the WordPress “upgrade.php” URL.

I especially like how the upgrade.php page says (perhaps as homage to a famous iMac commercial) “There’s actually only one step. So if you see this, you’re done. Have fun!”

In my opinion projects like WordPress represent the best hope for the future of open source, free software. You go, WordPress! I wish I still lived in the Mission so I could buy them a beer at their next party.

You’re probably scratching your head and wondering what the heck the big deal is. Everything on this blog looks exactly the same. That’s the big deal. I totally pulled the rug out from under my blog and replaced it with astroturf, but nothing fell apart. The advantages to me behind the scenes are enormous, though. The “administrative” interface is much improved, with some “ajaxy” stuff, and general polish around the edges. Apparently the plugin format is greatly re-architected for the promise of even better plugins down the road. They even started including a comment-spam filter plugin by default. Let’s hope it works better than my previous solutions. I’m so sick of seeing the word “phentermine” pop up in my comment moderation queue (though I recently figured out I could add it to a WordPress blacklist – so don’t mention it in your comment here or else it will meet an instant death).

One of the things I plan to pursue with my shiny new WordPress installation is some kind of mechanism where I can add “non-essential” feeds to my existing primary RSS. It’s easy for me to expose a feed for just one category, but I don’t know yet how to expose a feed for a single category *and* keep that category out of the mainline feed. For example, I’d like to start something akin to Daring Fireball’s Linked List, but am not interested in pushing those perhaps frivolous and frequent “entries” out to all the kind subscribers of my main blog. Similarly, I often get the urge to post LazyWeb entries, but don’t want to soil my relationship with any borderline subscribers :) It would be great if I could offer a separate feed so people who would love to make my day by answering a question would have something to read!

On the subject of feeds, Eric Albert of Out of Cheese recently asked me if there was a method of subscribing to all of the comments made to the blog. He observed that my increasingly sophisticated readership manages to inject some pretty interesting information “below the fold” on some of my entries. While each entry offers a link for subscribing to that particular entry’s comments, I had never noticed a method for subscribing to the whole kit and caboodle. But I noticed on WordPress’s administrative interface that it showed me a running tally of just such a thing. A little research turned up a variation on the RSS URL that does just that. I was pleased to see that it continues to work as expected in WordPress 2.0. I realized during this examination that I didn’t previously list the basic “subscribe” link anywhere on the page. For those unlucky enough to be running a browser that doesn’t put a big obvious “RSS” button in your address bar, that might come in handy. So I’m adding two links to the main sidebar: one for the blog, and one for the comments. I’ve said it before: my readers make this place worth coming back to. I love the interplay between “really smart people” that takes place after I’ve left the building. Hopefully by subscribing to the comments feed, you’ll be that much more likely to notice when you have something valuable to contribute.

Update: One of the things I’ve been unhappy with for some time is the “permalink format” for my blog entries. It used to be something like:

http://www.red-sweater.com/blog/?p=68

Pretty darned meaningless to any human. But with WordPress’s customizable permalink formatting, I was able to easily change it to the following:

http://www.red-sweater.com/blog/68/kids-in-the-park

I think the new structure does a good job of being guaranteed unique (the numeric part) while also offering the human-readable slug, all without getting too insanely long.

The best news of all? The old format still works, so all the existing links out there resolve as expected to the original post. If any other WordPress users like the format I’ve adopted, it’s easy to set up. Just enter this as your custom permalink structure format: “/%post_id%/%postname%”.

Update 2: The end of the honeymoon? After being perplexed by the permanent loss of significant and important attributes of my <div>, <img> and <ul> tags after uploading from MarsEdit, I learned that a WordPress 2.0 bug was the culprit. Thanks Gus for pointing me to Bill Bumgarner’s writeup, now a few weeks old! I even subscribe to his blog, but must have glossed over it since it wasn’t pertinent to me at that time. Fortunately, it looks like the core problem has been addressed by WP and is being slated for the 2.0.1 update. Until then, I’ll just be really careful what I post through the XMLRPC interface!

Update 3: I just updated to the 2.0.1 release of WordPress and, as promised, it seems to address the XMLRPC problem. I was able to sync and republish the originally problematic post with MarsEdit and didn’t lose any information! I’m glad the space of time between my becoming exposed to this problem and the release of a fix was very short.

Posted in General | 9 Comments »

Building CFNetwork

January 22nd, 2006

Update. Well I am a complete idiot. Of course I should have assumed that if I wanted to build something out of Darwin, other people in the past have wanted the same thing. I’m a fool for not noticing this page at opendarwin.org. The “darwinbuild” tool basically “follows the dark dependencies” for you. I haven’t confirmed that it can build CFNetwork yet, but I’ve started it on its way, and I have little reason to believe that it will fail. It sure is downloading a bunch of the same junk I (painfully) did.

Perhaps my only criticism is that it appears to be downloading way more than necessary. Naturally, it must be assuming a raw system with no developer tools installed. It’s downloading things like gcc and gnumake. Maybe there’s some way to coerce it into only downloading the “not installed by Xcode” bits.

Update 2: It didn’t succeed. Probably just fine-tuning, but I suppose for the time being there may be some value in my crazy efforts shown below.

Update 3: After some tweaking, I got CFNetwork to build with darwinbuild. Basically, the dependency tree they have for CFNetwork is good but not perfect. I will get in touch with them to see if I can get the correct dependencies added to the tree.

Before discovering darwinbuild, I managed to get CFNetwork built on my own. But it was a painful experience. The long and miserable story below is enclosed for historical snickering and contempt:

I mentioned recently how Apple’s open source resources can be quite helpful in “cutting to the chase” in some debugging scenarios. A major obstacle for most of us, however, is the fact that most of these open source projects are set up with peculiar Apple-internal build variables, conventions, etc. I have a hard time building these things, and I used to work in that environment! (At Apple, you just get used to having a system with “internal headers” installed across the board. Sigh. Because most Apple employees work in this type of environment, it’s no wonder the open source projects are largely unbuildable by us normal folks.)

But I don’t like being limited in this way, so I’m resolving to take matters into my own hands. I want to have these projects at my easy disposal, so I’m going to figure them out. For every project I need to build that “takes significant work,” I’ll add a blog entry here detailing what was required. Hopefully this will serve as a convenient “google destination” for others who could benefit from getting something built, but don’t have the time to follow the endless tunnels needed to reach the destination.

Because I’m debugging some funny behavior in CFNetwork, a project I worked on briefly while I was at Apple, I’ll start there.

CFNetwork is quite a small project, but it’s much trickier to build than you might guess. It includes a number of dependencies which are not part of your standard “Xcode install” development environment, as well as some build peculiarities that we’ll have to work around. We’ll coax the project into thinking our dingy little bachelor pad is every bit as luxurious (?!) as the environment it’s gotten used to at Apple. Many of the dependencies are just on private header files that need to be collected from various open source projects, but CFNetwork also links against some funny security frameworks that need to be installed into /usr/local.

CFNetwork – Step by Step:

CoreFoundation framework with PrivateHeaders. The open-source flavor of CoreFoundation (368.25) includes much, but not all, of the CoreFoundation binary that shipped in 10.4.4. It is therefore unsuitable as a “drop-in” replacement for the copy of CoreFoundation that came with your copy of Mac OS X. Nonetheless, it can be useful to build if, for example, you want to use parts of CoreFoundation in an open source application on another platform. Or perhaps you want to play around with tweaking the sources for some common collection classes, to see if the performance can be improved. In our case, building it gives us a convenient copy of the framework with private headers intact, and it’s actually easier to do this than to figure out how to get the private headers out in any other way.
CoreFoundation does not, thankfully, have dependencies on other Apple-internal sources or headers. There is just one minor issue to contend with if you’ve got gcc 4.0 set up as your default compiler (you probably do unless you changed it with gcc_select). CoreFoundation’s makefile passes the -Wno-precomp option to the compiler, which is no longer recognized by gcc 4.0. To work around this, you could probably remove the flag from the build options, but I just “went with the flow” and changed the compiler to 3.3. I did this by finding the following text in framework.make:
```
ifeq "$(PLATFORM)" "Darwin"
CC = /usr/bin/cc
else
```
And changing “cc” to “gcc-3.3”. After making this change, you should be able to type “make”, cringe at a few dozen build warnings, and eventually see the reassuring “Done!” feedback. We don’t really care if the binary is perfect – so I’m not going to fret about the warnings or whether I should have used gcc 4.0 for better code. The built-framework resides, by default, in the following directory:
```
/tmp/CoreFoundation.sym/CoreFoundation.framework
```
You’ll see that inside this framework exists the usual Headers directory as well as a PrivateHeaders directory containing “top secret” stuff!

The easiest way to accommodate this dependency is to point CFNetwork at the same “SYMROOT” (where built objects go) as CoreFoundation. Since the CFNetwork build includes its own SYMROOT as part of the framework search path, building to the same directory as CoreFoundation will cause it to find the custom CoreFoundation you built above. The SYMROOT can be specified on the command line, so for starters, instead of simply typing make in the CFNetwork source directory, type the following:
```
SYMROOT="/tmp/CoreFoundation.sym" make
```
If this “define a variable on the command line” trick doesn’t work with your shell you’ll have to find another way of getting the definition into the Make context. If it does work, you’ll find that your build is still failing, but at least it’s finding the required CFPriv.h and associated headers.
Asynchronous NetDB. CFNetwork takes advantage of the asynchronous getaddrinfo calls that are linked into the libSystem library on OS X. Unfortunately, the header file needed to compile against these, netdb_async.h, is not included. I discovered, with the help of Google and Ian Lister, that this header can be obtained by downloading LibInfo (222.1) and doing a “make installhdrs”. I didn’t really want to soil my pristine system headers, so I just copied the pertinent header file over to the CFNetwork source directory:
```
cp lookup.subproj/netdb_async.h ../CFNetwork-129.9
```
Don’t feel like downloading the whole project just to grab the header? Here’s a direct link.

In the longer term, I could see the value of putting all this junk into a centralized “Darwin SDK” (does it already exist? if so, I’m wasting a lot of time here!). It would be cool to have an SDK among the others for 10.3.9, 10.4, etc., that specifically included all the extra junk you use to build common Darwin projects.
CDSA Security Pieces. Yet more crummy downloads you’ll have to make from the Darwin source repositories. CFNetwork relies on headers from libsecurity_utilities (25) and libsecurity_cdsa_utilities (16). The latter requires the former to be built and installed first. It also requires some private headers from the Security framework (25966), which in turn requires that you essentially download, build, and install all the various libsecurity_* projects. Ugh. Are you noticing a trend here? I just want to get the minimum I need to build this thing. So I’m going to take shortcuts left and right.

To the Security folks’ credit, they include Xcode projects for their sources. Unfortunately, they are no more “self contained” than the other projects I’ve attempted to compile. Another rat’s hole! Let’s start with libsecurity_utilities:

To build this project, I am going to tweak a few things in the project, and then build from the command-line with the xcodebuild command. I’m using Xcode 2.2, so I had to upgrade the project to the “.xcodeproj” format before proceeding.

This project shared similarities with CFNetwork. It relies on CoreFoundation’s private headers, and it also seems pre-determined to use gcc 3.3. I’m going to point it at my existing “CoreFoundation.sym” magic directory, since it happens to be there and conveniently serving CFNetwork at the same time. I’ll also specify a gcc version on the command line using the Xcode build option GCC_VERSION:
```
xcodebuild -target libsecurity_utilities GCC_VERSION=3.3 FRAMEWORK_SEARCH_PATHS=/tmp/CoreFoundation.sym/
```
This gets me “almost there”. For some reason when CFPriv is brought in here, it objects to the “non-umbrella” form of including headers directly from CarbonCore. I can’t remember how to fix this correctly, so I just take the easy way out. I add a symbolic link to my magic directory:
```
cd /tmp/CoreFoundation.sym
ln -s /System/Library/Frameworks/CoreServices.framework/Frameworks/CarbonCore.framework
```
The only problem now is the post-install script phase from Apple that pops the build products into the correct system path. I’m using “per-deployment build folders” in my Xcode preferences, but this script isn’t expecting that. So I open up both Script phases and replace all instances of “SYMROOT” with “BUILT_PRODUCTS_DIR”. Now it will always point at the correct directory, regardless of how the preference is set. With that, the build works, and I’m ready to install the resulting libraries:
```
sudo xcodebuild -target libsecurity_utilities GCC_VERSION=3.3 FRAMEWORK_SEARCH_PATHS=/tmp/CoreFoundation.sym/ BUILD_VARIANTS=normal DSTROOT=/ install
```
Yes, I’m taking a whiz on my pristine installed system, but in this case I don’t feel so concerned by it since the stuff is all neatly confined to /usr/local/SecurityPieces.

Are you still with me? We’re almost there! If you made it this far you deserve a medal. Now we’ve got to compile and install the libsecurity_cdsa_utilities project. Remember the aforementioned Security framework private headers? This is where those come into play. Specifically we need “SecTrustPriv.h” and “SecKeychainItemPriv.h”. These both come from the libsecurity_keychain (25886) project. I’m getting pretty tired of playing this game, so I’m just going to make a copy of the entire Security framework as it shipped on my machine, put it in my “magic” CoreFoundation.sym folder, and drop everything that looks like a private header in:
```
cd /tmp/CoreFoundation.sym
cp -r /System/Library/Frameworks/Security.framework ./
mkdir Security.framework/PrivateHeaders
cp ~/Sources/ExternalSources/darwin/libsecurity_keychain-25886/lib/*Priv.h Security.framework/PrivateHeaders
```
I’m sorry to report you’ll also need checkpw.h, from a little rinky dink project called libsecurity_checkpw. Just grab it directly and drop it into your magic Security framework.

Aside from the header dependencies, most things about this project look similar to the previous, so I try a slightly modified xcodebuild line, being careful to point it at the magic directory, where my “private headers” security framework now lives:
```
sudo xcodebuild -target libsecurity_cdsa_utilities GCC_VERSION=3.3 FRAMEWORK_SEARCH_PATHS=/tmp/CoreFoundation.sym/ BUILD_VARIANTS=normal DSTROOT=/ install
```
Success! Well, in a really pathetic, sort of almost dead from frustration kind of way. It turns out that the particular headers required by CFNetwork are not included in the libsecurity_cdsa_utilities project. They’re included in libsecurity_cdsa_utils (13) instead (no “ities”)! Oops, what a misread. But the “utils” project depends on the “utilities” framework being built and installed, so our time is not wasted. Just go through the same rigamarole as above one more time:
```
sudo xcodebuild -target libsecurity_cdsa_utils GCC_VERSION=3.3 FRAMEWORK_SEARCH_PATHS=/tmp/CoreFoundation.sym/ BUILD_VARIANTS=normal DSTROOT=/ install
```
It ended with no obvious error but a “Build Failed” message. Whatever. Ask me if I care? It got far enough to put the required header where CFNetwork wants it. At this point I’m starting to think I should have just downloaded all the “libsecurity” items and figured out how to build them kosher. But the important thing is, it doesn’t matter anymore. I’ve satisfied CFNetwork, and it’s time to move on to the next hurdle.
JavaScriptGlue (417). CFNetwork’s ProxySupport.c depends on something called JavaScriptGlue. I’m assuming this is so CFNetwork can support the javascript-based custom proxy specification files. Cool! I’m not really sure I need it for my testing, but heck I’ve come this far I might as well cover all the bases. Besides, it might be easier to placate it than to try to cut out functionality.

It turns out that JavaScriptGlue relies in turn on JavaScriptCore and who knows what else. Fortunately I’m getting a little more desperate and a little less curious. Maybe it means I’m also getting smarter. I don’t have to BUILD the damn thing, I just have to get its headers installed. I pull a similar move to the Security framework trick I did earlier, copying JavaScriptGlue.framework from /System/Library/PrivateFrameworks to my “magic” CoreFoundation.sym directory. Then I just pop a Headers folder in there containing every file from the JavaScriptGlue project that ends in “.h”:
```
cp -r /System/Library/PrivateFrameworks/JavaScriptGlue.framework /tmp/CoreFoundation.sym/
cd /tmp/CoreFoundation.sym/JavaScriptGlue.framework
mkdir Headers
cp ~/Sources/ExternalSources/darwin/JavaScriptGlue-417/*.h ./Headers/
```
Security Framework PrivateHeaders. Oh! Security again! Didn’t we already deal with that about 15 pages ago? We only dealt with the parts needed to get the funky cdsa frameworks building. The problem with the Security framework is it’s a weird amalgamation of code from all those libsecurity* projects. The headers come from all across those projects, too. We grabbed some of the headers but not all of them. So our magic Security.framework isn’t quite magic enough. CFNetwork wants SecureTransportPriv.h, which I found in the libsecurity_ssl project. Grab it directly and drop it in your magic Security.framework. You’ll also need ASN1 headers from libsecurity_asn1 (9). It’s brute-force time – just grab all the *.h you can find and drop them into Security.framework.

A Short Kiss Goodbye. When all is said and almost done, you’ll see a familiar set of errors in CFNetwork, having to do with that pesky -Wno-precomp flag. Either obliterate it from the makefile, or change your compiler to 3.3. You’ll want to be sure to change both the “cc” definition and the “g++” one.

Summary

Well, there you have it. How to build CFNetwork in just a few easy steps. In retrospect, I could probably do some things a bit faster and more logically. But the problem is, when you’re looking at a source project from outside the company, you don’t know what you’re going to need ahead of time. You start building and just keep hoping that every tweak or accommodation will be the last one you need to make.

My experience getting CFNetwork to build informs an opinion that people interested in building it regularly should bite the bullet, download and figure out how to cleanly build their own copy of all the Security stuff. Just do it once and start building your own private “Darwin SDK.” I don’t have the time or energy to do that now, but you can bet I made a copy of my “CoreFoundation.sym” folder. Next time I need to build something like this, I’ll be starting with magic in hand. If somebody does start putting something together, maybe it will be shareable under the same license as all of the projects from which it was composed (probably APSL).

Since many of the “private headers” are available via the variety of open source projects, it’s a shame Apple doesn’t just make such an SDK available. They have an internal installer that puts Apple engineers in good shape for this stuff, so they would have something to start with in building such a thing.

(If, by some chance, they already do this, please don’t tell me about it. Give me a week or so to feel like a badass for compiling a CFNetwork of my very own.)

Now, about that bug…

Posted in Apple, Debugging, Programming | 1 Comment »

HIDeous Adventures with Open Source

January 21st, 2006

I’ve been pretty busy lately, and spread thin over a wide range of projects. One of the “new to me” experiences involves supporting a custom HID device via Apple’s IOHIDLib API.

The IOHIDLib API allows plain ol’ folks like me to interact with HID-conformant USB devices without installing any kernel-level drivers or extensions. Let me say, this is great for both developers and users. Fewer cooks in the kernel kitchen means fewer crashes. (Except when Apple’s in the kitchen – crashing!)

I haven’t spent so much time testing the “reboot” feature of my machine since I was a Mac OS 8 System File engineer. Without saying too much about the device I’m supporting, I’ll say that it has LED outputs, like the caps-lock light on your keyboard. I’m in the early stages, so my program basically consists of a test routine that sprays eternally at the LEDs, making them do fun yet useless things.

It was a moment of triumph and joy when I finally got my head wrapped around the HID libraries such that I could make my device “dance.” Thanks, Apple DTS, for the excellent sample code and support. However, my joy was short-lived, as I noticed that after some number of dancing iterations, the LEDs just stopped. My program had frozen. And when I say stopped, I mean stopped beyond any conceivable point of revival. Force-quit can’t quit it. GDB can’t attach to it (!). Sample can’t sample it. This bad boy is just gonna hang out in my process list and dock until I restart the machine.

After much scrutiny of my own code, I did what I should have done from the beginning. I turned on Guard Malloc. I’ve sung the praises of this tool on the mailing lists but I don’t think I’ve mentioned it here yet. This debugging aid slows your application to an absolute crawl as it intercepts all memory allocations and sticks protected memory pages between every malloc’d block. The end result? The vast majority of your “overrun the array” type bugs are caught dead in their tracks.

I turned on Guard Malloc and launched my app. I started to get up for coffee, because as I said, Guard Malloc slows your app to molasses. To my surprise, the application had crashed almost instantly. Hmm – I wondered if this was the same bug that caused the freeze or just something else to look into (joy of joys!). In any case, I wasn’t going to be able to get to the bottom of the problem until I cleared all obstacles in the path. I sat back in my seat and started looking for clues.

Hmm, I’m crashing inside IOHIDLib. Surely I must be screwing something up. But what? A HID device exposes its individual inputs and outputs, like my LEDs, and in HID-ese these are called “elements.” You basically ask IOHIDLib to fetch your device, then you open up all the elements you want to write to. In HID-ese, writing to the device is called a “transaction.” After you’ve configured a HID device and are talking with it, the rest of your application’s life consists of setting element values (e.g. LED on/off), committing the transaction (send it all over to the device), and then clearing the transaction (so you start out with default values next time).
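To make that set, commit, clear rhythm concrete, here is a toy model in C. Everything in it is hypothetical: the type and function names are mine, it talks to no hardware, and it is emphatically not the IOHIDLib API. It only mimics the life cycle described above.

```c
#include <string.h>

#define MOCK_MAX_ELEMENTS 8  /* arbitrary capacity for this sketch */

/* Hypothetical stand-in for an output transaction. NOT an IOHIDLib type. */
typedef struct {
    int pending[MOCK_MAX_ELEMENTS]; /* values staged for the next commit */
    int device[MOCK_MAX_ELEMENTS];  /* what the pretend device last received */
    int count;                      /* number of elements in the transaction */
} MockTransaction;

/* Set up a transaction with `count` elements, all at the default value 0. */
static void mock_init(MockTransaction *t, int count)
{
    memset(t, 0, sizeof *t);
    if (count < 0) count = 0;
    if (count > MOCK_MAX_ELEMENTS) count = MOCK_MAX_ELEMENTS;
    t->count = count;
}

/* Stage a value for one element. Returns -1 if the element is out of range. */
static int mock_set(MockTransaction *t, int element, int value)
{
    if (element < 0 || element >= t->count) return -1;
    t->pending[element] = value;
    return 0;
}

/* "Send" every staged value over to the pretend device. */
static void mock_commit(MockTransaction *t)
{
    for (int i = 0; i < t->count; i++) t->device[i] = t->pending[i];
}

/* Reset the staged values to defaults; the device keeps what it was sent. */
static void mock_clear(MockTransaction *t)
{
    for (int i = 0; i < t->count; i++) t->pending[i] = 0;
}
```

A test loop like mine would call mock_set for each LED, then mock_commit, then mock_clear, forever.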

I whittled my test app down to a simpler case:

Open a device.
Configure a one-output transaction on the device.
Clear the transaction.

Sure enough, my simple little application crashes in the IOHIDLib with just these simple steps (when Guard Malloc is on – with Guard Malloc off you crash at some undetermined time down the road).

Crap. Crashing in Apple’s code. I’m going to have to get help from Apple, or else do some serious disassembly hacking to figure out what’s going on. Then I remembered that IOHIDLib is open source!

I downloaded IOHIDFamily-172.8. After a few tweaks (Apple’s open source projects often rely on Apple-Internal build paths and/or header files – but often you can work around it and get a working build), I was able to build a copy of the IOHIDLib.plugin bundle. With debugging symbols enabled, debugging this was going to be a breeze! I took a deep breath and copied my debugging binary over the Apple-supplied production version (after making a backup):

sudo cp ./IOHIDLib /System/Library/Extensions/IOHIDFamily.kext/Contents/PlugIns/IOHIDLib.plugin/Contents/MacOS/

Phew! I can still use my mouse! Whenever I tweak system-level stuff I’m always a tiny-bit scared that I’ll end up making some foggy mistake that leaves me “bringing my system back” via SSH or single-user mode.

With the debugging IOHIDLib installed, all I had to do was re-run my application (Guard Malloc still on!). Sure enough, it dynamically picked up the new version of the IOHIDLib, and crashed with full source code at the offending line. The source file is IOHIDOutputTransactionClass.cpp and the crashing code is:

for (int i=0; elementDataRefs[i] && i<numElements; i++)

What’s wrong with this code? If you look carefully you’ll probably figure it out. The problem is in that continuation test. “While elementDataRefs[i] is not zero, and i is less than numElements, keep doing stuff.” This is a case of mixed-up order of evaluation. In C, the “&&” operator evaluates left to right, and skips the right side entirely when the left side is false. So if you’ve got a dangerous test that’s only safe when a benign test is true, you have to write it “benign && dangerous”. If benign is never true, then dangerous never gets run. But here we always do the dangerous act, array indexing, before testing the index value! This code occurs in three separate places, each of which corresponds to a common IOHIDLib API for handling transactions with HID devices. No matter how many “output elements” I configure on my HID device, the IOHIDLib is always going to read beyond its legal limit.

Now, in the vast majority of cases, the 4 bytes (indexing a long int) that immediately follow this array will probably not be part of a protected page. Heck, the fact that countless numbers of HID-interacting programs apparently deal with these APIs every day and don’t crash basically proves that. But it scares me. And if I ever see a random crash on my system with a similar stack crawl, I’ll know just who to blame!
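If you want to watch the two orderings behave differently without actually crashing anything, a small experiment along these lines should do it. This is a sketch, not Apple’s code: the function names and the long element type are my assumptions, and the caller has to hand in an array with one spare slot past numElements so the over-read stays inside memory we own.

```c
/* Highest array index either loop has touched; -1 means "none yet". */
static int highest_index_read = -1;

/* Read refs[i], remembering the largest index requested so far. */
static long read_ref(const long *refs, int i)
{
    if (i > highest_index_read) {
        highest_index_read = i;
    }
    return refs[i];
}

/* Dangerous test first: indexes the array, then checks the bound.
   The array must have at least numElements + 1 readable slots. */
static int walk_dangerous_first(const long *refs, int numElements)
{
    int i;
    for (i = 0; read_ref(refs, i) && i < numElements; i++) {
        /* loop body omitted */
    }
    return i;
}

/* Benign test first: a failed bound check short-circuits the read. */
static int walk_benign_first(const long *refs, int numElements)
{
    int i;
    for (i = 0; i < numElements && read_ref(refs, i); i++) {
        /* loop body omitted */
    }
    return i;
}
```

With four non-zero slots and numElements set to 3, both walks stop at i == 3, but the dangerous-first version has read index 3 along the way while the benign-first version never goes past index 2.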

It is precisely to avoid these types of bugs that I am not a huge fan of these “complicated for loops.” I would probably write this function in an elaborated, completely “too uncool for C school” form where the for loop only tests for “i<numElements” and the secondary (dangerous) test gets its own if-block inside the for loop. This removes all doubt. But for the purposes of quickly working around this crash, I replaced the three instances of the above code with the following:

// DCJ - Splashes beyond array bounds when (i == numElements)
// for (int i=0; elementDataRefs[i] && i<numElements; i++)
for (int i=0; i<numElements && elementDataRefs[i]; i++)
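For the record, the “too uncool for C school” version I have in mind would look roughly like this. It is a sketch only: the function name, the long element type, and the counting body are placeholders I made up, since what matters is where the two tests live.

```c
/* Count the leading non-zero entries without ever indexing
   at or beyond numElements. */
static int count_leading_refs(const long *elementDataRefs, int numElements)
{
    int count = 0;
    for (int i = 0; i < numElements; i++) {
        /* The dangerous test gets its own block, and it can only
           run after the loop condition has vetted the index. */
        if (elementDataRefs[i] == 0) {
            break;
        }
        count++;
    }
    return count;
}
```

It takes a few more lines, but nobody has to remember which side of the “&&” runs first.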

With the fix in place, I was able to continue debugging my application. I fired up the debugger again with Guard Malloc enabled. I waited anxiously to see if my LEDs would keep flashing, or whether another crash would be uncovered.

Unfortunately, it just froze again. Yep, the nasty “can’t kill the application process in any way except rebooting” freezing. In this case, the bug I found in Apple’s open source didn’t turn out to be my bug. I’ll have to keep looking for the root cause of that. But the availability of this open source allowed me to fairly quickly get past it and move on to the next step of debugging. And that’s always a good thing. Thanks, Apple!

(IOHIDLib reading past array bounds reported as Radar 4417524)

Posted in Debugging | 5 Comments »
