I ran into a search problem recently in one of my Xcode projects. Every time I conducted a particular search (in this case, the search term was “CFString”), my screen was littered with a series of cascading Xcode exception dialogs that look like this:
On the one hand, it’s great that Xcode goes out of its way to convey internal exceptions to the user. It means that users are less likely to just “let slide” exceptional conditions that would otherwise show up only in an archaeological dig through the system’s console log. On the other hand, when Xcode sprays dozens of these dialogs at me while I’m just trying to search my project, it gets to be a real hassle to close them all after every search.
When I run into these kinds of errors from Xcode, I immediately start thinking of where my latest project backup is. It’s probably some kind of internal project format corruption. The last thing any of us wants to do is trash a project file and start from scratch, so it’s a good idea to keep backups of the project files (probably along with all your other sources, in the Subversion repository).
In this case I thought I’d be curious and try to get to the bottom of the problem. Since each of these dialogs corresponds to a raised NSException inside Xcode, I thought I’d attach with gdb and see what’s going on at the point of exception. This is what the top of the stack looks like while one of the annoying dialogs is being generated:
Something to do with bookmarks! Well, I don’t even have any bookmarks in my project. Did I screw something up one of those times I hit Cmd-D instead of Cmd-Shift-D to “Open Quickly”? I decided to set a breakpoint on the call to initWithBookmark:textBookmarkResolvable:, along with some exploratory breakpoint commands. I asked gdb to report what the bookmark argument looks like (“po” is short for “print-object”, useful for displaying Cocoa objects). I left my “raise” breakpoint in place, set to automatically continue, so I could tell what kind of circumstances immediately precede it. This is a “trap the thief” technique for pinpointing the misbehaving code: let Xcode run wild, then look at the logged breakpoint information for clues as to what led to the assertion.
With the breakpoints in place, I return to Xcode and invoke my troublesome search. Looking back at the logged output in gdb, I see the following snippet (newlines added for readability):
So, Xcode doesn’t like “HackCFSocketStream.h”? Aside from its obviously questionable name, what has it done so wrong? I take a look at my search results and sure enough, no HackCFSocketStream.h. Not only are the dialogs I’m getting annoying, they represent missing search results in Xcode!
I decide to try to narrow the problem down by doing a limited search on only the “bad boy.” I select the HackCFSocketStream.h file in Xcode’s files and groups tree, and do another search, this time selecting the “In Selected Project Items” scope-limiting option.
Son of a gun! No error dialog. It must be particular to doing a global search. I switch back to “In Project” and search again. No more bug. Gah! I seem to have “fixed” the bug by asking Xcode to limit its search scope to only the problematic file. I tried a few more problematic files, repeating the same identification method with gdb. Same results! Explicitly searching a selected file fixes the bug, apparently forever.
I tried to mix up the procedure a little bit. Instead of selecting the problematic file itself, I select the group that contains it. The bug still exists at this scope, but again, as soon as I select only the problematic file and search, the situation is resolved.
What I’m left with is a situation where I have a painstaking, yet effective method of ridding the project of this bug. Unfortunately, I have to wait until I encounter a search term that finds a problematic file. Then I have to identify the file and go in and do the magic cleansing search.
I suppose I might try to write a script or something that goes through and explicitly searches each file in the project. Will that clean things up? I should probably just revert to a backup project and hope for the best.
Update – More data points:
- It appears that it’s not searching the “problem file” but actually just viewing it in the Xcode editor that causes it to be “fixed.” I guess Xcode is probably reindexing or something when this happens. Rebuilding the index for the entire project does not fix the problem. So I guess I could work around this by writing a script to go through and open an editor window for every file in my project. We’ll see.
- A “problem file” is only problematic for a particular search term (or set of search terms, perhaps). For instance, a search of “text” in my source base causes a failure on a given file, but searching for another term present in that file brings it up just fine.
Update 2: The “fix” achieved by selecting an item is not permanent. It turns out that the error occurs only if the “bad file” is not currently open in Xcode. By selecting the file, it becomes open and stays open until explicitly closed. By explicitly closing the file (Cmd-Shift-W), I am able to reproduce the error again.
Update 3: Case Closed!
My curiosity wouldn’t let me leave this alone, and I finally tracked down the root cause of the problem. I have produced a very simple test project that exhibits the problem. If you’re curious, you can download the project here. Look at the single source file “TestSource.cp” for instructions on how to reproduce. I know enough about this problem now to summarize it in fairly brief form:
I tried to get it shorter than that, but I can’t! So what’s the bottom line? If your source file contains a multi-byte UTF8 character, Xcode mistakenly tallies the bytes instead of the characters when computing the offset into that line. In most cases, this just means that the contextual highlighting of the search result is slightly off, but if the “overshoot” caused by these multi-byte characters causes a search result range to exceed the end of the line, then it causes the very unsavory runtime exception I’ve described in this article.
This is better demonstrated by example. For instance, in the test project I’ve linked to, the UTF8 encoded source file contains a line like this:
Each of those bullets surrounding my initials is encoded in the file by the UTF8 bytes “e2 80 a2”. So what happens if I do a search for “Bullets” in this project? If the file is open, I get the expected result:
But if the file is closed, Xcode is not so savvy and interprets the file apparently on a “byte == character” basis:
See how the emboldened portion of the result is exactly 4 characters “too far”? That’s because the two bullets occupy 6 bytes in the file but are only 2 characters, and Xcode counts each byte as a character. The difference of 4 is the “slop” that gets inappropriately added to the start of the range.
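To make the arithmetic concrete, here is a minimal Python sketch of the miscount. The sample line is a hypothetical stand-in, not the actual line from the test project, and `byte_offset_slop` is a name made up for illustration. It assumes only that the closed-file search uses a UTF8 byte offset where a character offset belongs.

```python
def byte_offset_slop(line: str, term: str) -> int:
    """Extra offset picked up by counting UTF-8 bytes instead of characters."""
    char_start = line.index(term)                         # correct, character-based offset
    byte_start = len(line[:char_start].encode("utf-8"))   # offset if bytes are tallied
    return byte_start - char_start


# Hypothetical sample line: two bullets (U+2022, bytes e2 80 a2) precede the match.
sample = "// \u2022XY\u2022 Bullets can be dangerous."
print(byte_offset_slop(sample, "Bullets"))  # 4
```

Each bullet contributes 2 bytes of slop (3 bytes for 1 character), so two bullets push the highlight 4 characters to the right.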
When the slop in the range takes you beyond the scope of the line, you get an exception. If I search the project for “danger”, I observe the edge case, beyond which no further range can be safely returned for the given line:
Searching for “dangerous” is just that. You’ll end up witnessing one of Xcode’s friendly error dialogs.
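The edge case can be sketched the same way. This is only a model of the observed behavior, not Xcode’s actual code: the sample line and the function names are hypothetical, and it assumes the result range is "byte offset of the match, plus the length of the search term," compared against the line length in characters.

```python
def reported_range(line: str, term: str) -> tuple[int, int]:
    """(start, end) of the match if the start is computed in UTF-8 bytes."""
    char_start = line.index(term)
    start = len(line[:char_start].encode("utf-8"))
    return start, start + len(term)


def overshoots(line: str, term: str) -> bool:
    """True when the miscounted range runs past the end of the line."""
    _, end = reported_range(line, term)
    return end > len(line)


# Hypothetical line with 4 bytes of slop from its two bullets.
sample = "// \u2022XY\u2022 Bullets can be dangerous."
print(overshoots(sample, "danger"))     # False: the range ends exactly at the line end
print(overshoots(sample, "dangerous"))  # True: the range runs 3 characters past the end
```

In this model, "danger" is the longest prefix of "dangerous" that still fits, which matches the edge case described above.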
In summary: any multi-byte UTF8 character in your source file steals “safely searchable” characters from the end of its line. If a search in your project yields results near the ends of lines that contain multi-byte characters, you will likely witness an exception and fail to see those results in the search results pane.
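If you want to know which lines in a file are at risk, a short Python sketch along these lines could list them. `stolen_per_line` is a hypothetical helper name, and the number it reports is an upper bound: only the multi-byte characters that sit before a match actually shift it.

```python
def stolen_per_line(text: str) -> list[tuple[int, int]]:
    """Return (line_number, extra_bytes) for every line with multi-byte characters."""
    report = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Bytes beyond one per character are what a byte-counting search over-counts.
        extra = len(line.encode("utf-8")) - len(line)
        if extra:
            report.append((number, extra))
    return report


# Hypothetical three-line source file; only line 2 contains bullets.
source = "int a;\n// \u2022XY\u2022 note\nint b;\n"
print(stolen_per_line(source))  # [(2, 4)]
```

To check a real file, read it as UTF8 first, for example with `pathlib.Path("TestSource.cp").read_text(encoding="utf-8")`, and pass the resulting string in.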
Workaround: Save your source files in some encoding other than UTF8. I don’t really know which encoding is best; I guess I need to figure this out. For this particular project, which had become a mish-mash of UTF8 and MacRoman encoded files, I decided to switch all the UTF8 files back to MacRoman. I won’t be able to put characters outside the MacRoman repertoire into string constants, but at least I’ll be able to search my project in peace!
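For anyone who would rather script the conversion than re-save each file by hand, here is a rough Python sketch. The function names are made up for illustration, and it only converts bytes in memory. Writing the result back to disk and telling Xcode about the new encoding are left out. Python’s `mac_roman` codec does the real work.

```python
def utf8_to_macroman(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as MacRoman; raises UnicodeEncodeError if impossible."""
    return data.decode("utf-8").encode("mac_roman")


def convertible(data: bytes) -> bool:
    """True when every character in the UTF-8 data exists in MacRoman."""
    try:
        utf8_to_macroman(data)
    except UnicodeEncodeError:
        return False
    return True


bullet_line = "// \u2022XY\u2022".encode("utf-8")   # 3-byte bullets in UTF-8
converted = utf8_to_macroman(bullet_line)
print(len(bullet_line), len(converted))             # 11 7
print(convertible("\u4e2d".encode("utf-8")))        # False: not in MacRoman
```

After conversion each bullet is the single byte 0xA5, so bytes and characters line up again and there is no slop left to push a range past the end of the line.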