The latest ridiculous App Store power-play to make it into the public limelight is Apple’s alleged censoring of Ninjawords, an iPhone interface to a community-edited dictionary called Wiktionary. Before being approved, even as a 17+ rated title, the app’s developers were asked to remove specific words from the dictionary’s index.
(Edit 8/6/2009: Since I wrote this article, John Gruber received a response from Apple’s Phil Schiller. He paints a slightly different picture of the alleged censoring, and defends Apple’s intentions as being noble. I still maintain my theory below helps explain the capriciousness of App Store review policies.)
John Gruber excoriates Apple for censoring a reference book. Gruber also discovered through an interview with a Ninjawords developer that Apple must have gone out of its way to locate words it could find fault with. Apparently the developers had been careful to keep casual users from stumbling upon an offensive word, by preventing auto-completion for common vulgarities:
“In other words, the App Store reviewer(s) explicitly searched for curse words they already knew, and found them.”
I’ve been thinking about the capriciousness of the App Store review process. It’s ridiculous the kinds of rejections and hoop-jumping we’ve observed in the past year, and one has to assume that the issues making their way into the public eye are only the tip of the iceberg.
Then I remembered something from my own experience that might shed light on the situation. I started as a Quality Assurance tester back in 1995, in a small engineering group. Our group was diligent in the pursuit of finding issues that would embarrass the company or hurt customers. But we worked with larger groups whose motives seemed more oriented to the systematic evaluation they were receiving from their bosses.
These testers didn’t care how good their bug reports were. It didn’t matter if the software gaffe they discovered would save the company a million dollars, or a metric shit-ton of public grief. All that mattered was that the bug was “valid” and that the reporter was “first.”
I learned about the subtleties of this system through the ways that those testers interacted with me. Sometimes a bug that I submitted was determined to be a duplicate of an earlier report one of these testers submitted. If mine had more detailed information, it might be marked as the “original” bug, while the less informative bug was designated a duplicate. This worked great for those of us trying to ship a great product, but not so good for people who were fighting for their reputations in the metric-oriented testing groups.
Because our group was committed to shipping a great product, we were always convinced that bug reports with more information were superior. But the testers who were under the gun to produce new, unique issues wanted credit for having uncovered these issues first.
As you can imagine, the “thirst for first” led to a significant number of ridiculous bug reports. If a tester could reasonably defend a bug report as valid, then it counted in their statistics, and made them look like a useful member of their team. My impression was that promotions and raises were directly linked to these statistics.
Many of the mercenary testers I encountered were motivated to scrape the system for bugs, as ridiculous as they may be. They logged them into the bug system and then defended them at all costs, as if their lives depended on it. And it turned out, they did. At least, their paychecks did.
I would not be surprised to learn that App Store reviewers are working under a similar structure. A system that rewards “unique, valid rejections” would certainly explain the behavior we have seen coming to light in the past year.
Why would somebody waste time typing profane words into a dictionary, gathering screen captures, and sending them to developers, except to defend their prize “catch”? If perfecting the product was the goal, we’d see a lot more nuance and thoughtfulness. But excellence is one goal, and collecting proof of “doing one’s job” is quite another. I think I know what many App Store reviewers aspire to.
Afterthought: It occurred to me shortly after publishing the above that App Store reviewers can’t be working purely under a “catch all violations” directive, because if they were, there would be numerous rejections based on UI guideline violations, and we’re not seeing as many of those (or are we?). I’m sticking to my thesis, but I suspect that we’re seeing so many rejections on contrived issues like “you can find ‘cock’ in the dictionary” because these are the easiest for reviewers to defend with Apple’s published guidelines. It is a lot harder for a reviewer to challenge whether a text field is aligned properly than whether “cock” can be interpreted as profanity.