RegExKitLite v. Clang
July 23rd, 2010
I finally updated to Xcode 3.2.3. I was a little late because I had other priorities, but I wanted to get my iPhone projects building and installing onto my iPhone OS 4.0 devices, so I decided to download and install the latest SDK.
Unfortunately, this seemingly minor update presented a major failure in my debug builds, which I’m adventurously compiling using the LLVM clang compiler. They built fine with Xcode 3.2.2, but starting in 3.2.3, I got a linkage failure, tracing back to the popular open source RegExKitLite:
Undefined symbols:
  "___gxx_personality_v0", referenced from:
      _rkl_debugCacheSpinLock in RegexKitLite.o
      ...
  "__ZSt9terminatev", referenced from:
      +[NSString(RegexKitLiteAdditions) clearStringCache] in RegexKitLite.o
What? Those symbols smack of C++, but there isn’t any C++ code in my project. At least, I don’t think there is. RegExKitLite is so famously advanced in its design that I couldn’t easily tell you whether there was stuff in there intended as C++ or not. On a whim, I tried switching the file type to C++ to see if that made clang any happier. No, it screamed bloody murder. It agrees with me: RegExKitLite is not C++ code.
So why the link errors? I examined the compile line carefully for RegExKitLite and confirmed that it’s using clang in a manner that should only be generating plain C linkage:
/Developer/usr/bin/clang -x objective-c -arch x86_64 ...
But when I look at the resulting RegExKitLite.o with the “nm” command-line tool, it shows that the culprit symbols are listed as undefined references in the binary object file:
% nm RegexKitLite.o | grep -e terminate\\\|personality
    U __ZSt9terminatev
    U ___gxx_personality_v0
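Running those names through a demangler makes the C++ connection explicit. What follows is a minimal sketch, assuming c++filt (from binutils or the Xcode toolchain) is on the PATH; the helper name demangle_darwin_symbols is made up for illustration. The -_ flag strips the extra leading underscore that Mach-O puts on every symbol before demangling.

```shell
# Hypothetical helper: demangle Mach-O style symbol names read from stdin.
# -_ tells c++filt to strip one leading underscore before demangling.
demangle_darwin_symbols() {
  c++filt -_
}

# Feed it the two symbols from the link error.
printf '%s\n' '__ZSt9terminatev' '___gxx_personality_v0' | demangle_darwin_symbols
```

The first symbol demangles to std::terminate(), the C++ runtime's abort-on-unhandled-exception routine. The second is not a mangled name at all: __gxx_personality_v0 is the exception-handling personality routine from the GNU C++ runtime. Both normally appear only when the compiler emits C++-style exception handling code.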
Normally when I run into an issue like this, I just type the problematic symbols into Google and usually find that somebody else has already found and solved the problem. No such luck this time. There are lots of false positives for the typical reason somebody runs into this link error: they are compiling legitimate C++ code but neglecting to link with the stdc++ library. In our case, though, we are not compiling C++ code, yet we’re ending up with C++ dependencies nonetheless.
I hopped on to the #clang IRC channel on irc.oftc.net, where a couple of extremely helpful clang engineers (thanks to dgregor and rjmccall) talked me through the problem and helped determine that it does seem to be a clang 1.5 bug. They confirmed that the bug seems to be fixed in their latest and greatest clang, but I was curious to confirm it myself, so I checked out and built the latest clang, too. Sure enough, all is well for future generations!
But what do we do now, stuck with the clang 1.5 that ships with Xcode? There is a crude workaround: simply link to stdc++. But yuck! All that C++ linkage dirtying up my pristine Objective-C project? Since I only use clang (for now) to build debug builds, I don’t have too much of a problem doing this. But if I were shipping these binaries I would be very hesitant to make that concession.
Radar #8230225: clang 1.5 forces linkage to C++ for non-C++ source file
(Open Radar Link)
The best we can do is report the issue to Apple and let them decide whether it’s worth their time to backport whatever fix in later clang is missing from clang 1.5. If you have other suggestions for how to work around the problem in the meantime, please do chime in with a comment below.
Update: The author of RegExKitLite kindly chimed in below with a pointer to the revision in the clang source repository that fixes the issue. He also suggests a simple and seemingly safe workaround involving a slight tweak to the RegExKitLite source code.