Don’t Speak Twice, It’s All Right

July 17th, 2006

Andy Lee sent me a bunch of excellent feedback about FlexTime, and let me know about a strange, 100% reproducible crashing bug. If you configure FlexTime such that both the ending cue of one activity and the starting cue of the one that follows are “Speak Text” cues, then the application crashes.

First thought: damn I’m glad I put a beta out. Second thought: good lord, what have I done!?

Unfortunately, the bug is not in my code. I was able to reproduce the problem quite easily with the simplest of command line tools:

#import <Carbon/Carbon.h>

int main(void)
{
    Str255 string1 = "\pHello";        // Pascal strings: build with -fpascal-strings
    Str255 string2 = "\pHello Again";
    SpeakString(string1);
    SpeakString(string2);
    return 0;
}

You may not have realized that it was quite so simple to accomplish spoken text on a Mac. Unfortunately, the simplicity is deceptive, since compiling and running the above tool results in a nasty crash:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
0x0020da94 in MTBEAudioUnitSoundOutput::BufferComplete ()
(gdb) bt
#0  0x0020da94 in MTBEAudioUnitSoundOutput::BufferComplete ()
#1  0x7006e9bc in AUScheduledSoundPlayerEntry ()
#2  0x700097c8 in DefaultOutputAUEntry ()
#3  0x700049d0 in DefaultOutputAUEntry ()
#4  0x700da1a8 in dyld_stub__keymgr_get_and_lock_processwide_ptr ()
#5  0x90bd9d24 in CallComponent ()
#6  0x942647a0 in AudioUnitUninitialize ()
#7  0x94178368 in AudioUnitNodeInfo::Uninitialize ()
#8  0x94178300 in AudioUnitGraph::Uninitialize ()
#9  0x941786a4 in AudioUnitGraph::Dispose ()
#10 0x941785f0 in DisposeAUGraph ()
#11 0x0020e0a8 in MTBEAudioUnitSoundOutput::~MTBEAudioUnitSoundOutput ()
#12 0x0020a78c in SpeechChannelManager::~SpeechChannelManager ()
#13 0x002215f4 in SECloseSpeechChannel ()
#14 0x91997974 in KillSpeechChannel ()
#15 0x91995088 in KillPrivateChannels ()
#16 0x919969d8 in SpeakString ()
#17 0x00002cec in main ()

SpeakString has been around for a long time. Long before Mac OS X and long before CoreAudio, where the crash appears to be happening. I would guess it didn’t use to crash, but when it was ported to Mac OS X, something got overlooked and now it leads to whammy land.

OK, so how do I work around the problem? It is clearly related to attempting to speak text while some text is already speaking. Maybe if I could coddle the Speech Manager a little bit, I could prevent it from crashing.

From the Speech Synthesis Manager Reference documentation for SpeakString, we see that the behavior for overlapping speech is (supposed to be) very well defined:

“If SpeakString is called while a prior string is still being spoken, the sound currently being synthesized is interrupted immediately. Conversion of the new text into speech is then begun. If you pass a zero-length string (or, in C, a null pointer) to SpeakString, the Speech Synthesis Manager stops any speech previously being synthesized by SpeakString without generating additional speech. If your application uses SpeakString, it is often a good idea to stop any speech in progress whenever your application receives a suspend event. Calling SpeakString with a zero-length string has no effect on speech channels other than the one managed internally by the Speech Synthesis Manager for the SpeakString function.”

Translation: what we’re doing is supposed to work. But maybe by overdoing it we can achieve the desired goal. If Mac OS X falls down on the “interrupting immediately” behavior, perhaps we can manually stop any previous sound to help it keep its bearings. According to the documentation, calling “SpeakString(NULL)” should effectively cancel playback. Unfortunately, injecting it into my simple crash case changes nothing. Worse, when I add it to my live application, I observe a new failure path. The text “pure virtual method called” is printed to the console, with the following backtrace:

(gdb) x/s $r4
0x52c1ad8 <_zn24mtmbraisedsinecrossfader7scoeffse +5916>: "pure virtual method called\n"
(gdb) bt
#0  0x90014ac0 in write ()
#1  0x052b2814 in MTFEClone::VisitCommand ()
#2  0x05287604 in MTBEWorker::WorkLoop ()
#3  0x052866f4 in MTBEWorkerStartMPTask ()
#4  0x90bc1900 in PrivateMPEntryPoint ()
#5  0x9002bc28 in _pthread_body ()

Well, this can is getting wormier and wormier. It is starting to look like I won’t be able to take advantage of the ease and simplicity of SpeakString. Ten years ago, sure. But in 2006 SpeakString es muy sucky. It’s probably time to start looking at the more advanced speech API, where I’m responsible for managing my own speech channels. With responsibility also comes (we hope) the ability to save ourselves from certain doom.

But let’s say I just need to stick with SpeakString, because I have a demo in 5 minutes, or users are just screaming bloody murder about this bug. There is a crude workaround that takes all the asynchronous fun out of speech, but also prevents the crash. By explicitly waiting for the Speech Manager to be done with any previous speech, I can prevent it from maiming itself:

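#import <Carbon/Carbon.h>

int main(void)
{
    Str255 string1 = "\pHello";
    Str255 string2 = "\pHello Again";

    while (SpeechBusy()) { ; }   // wait for any earlier speech to finish
    SpeakString(string1);
    while (SpeechBusy()) { ; }
    SpeakString(string2);
    while (SpeechBusy()) { ; }   // let the last string finish before exiting
    return 0;
}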

This also “works” in FlexTime, for some definition of “working.” But it can cause hideous stalls in the playback UI, since I’m blocking there for an indeterminate length of time. Passable in a beta release, but not acceptable for a finished product.
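If the blocking wait has to stay in for a while, a slightly gentler version sleeps between polls and gives up after a deadline instead of spinning forever. What follows is only a minimal sketch in plain C, not what FlexTime ships. The names wait_until_idle, busy_fn, and fake_speech_busy are made up for illustration, and the busy check is passed in as a function pointer so the sketch compiles without Carbon. In the real tool that callback would be a thin wrapper around SpeechBusy().

```c
#define _POSIX_C_SOURCE 199309L   /* for nanosleep() under strict C modes */

#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical callback type: returns nonzero while speech is in progress.
   In the real tool this would wrap SpeechBusy(). */
typedef int (*busy_fn)(void);

/* Sleep for the given number of milliseconds. */
static void sleep_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Poll is_busy every poll_ms milliseconds until it reports idle, or until
   roughly timeout_ms milliseconds have been spent sleeping.
   Returns true if speech went idle, false if we gave up waiting. */
static bool wait_until_idle(busy_fn is_busy, long timeout_ms, long poll_ms)
{
    long waited = 0;

    assert(is_busy != NULL);
    assert(poll_ms > 0);

    while (is_busy()) {
        if (waited >= timeout_ms)
            return false;          /* still busy after the deadline */
        sleep_ms(poll_ms);
        waited += poll_ms;
    }
    return true;
}

/* Stand-in for SpeechBusy() so the sketch runs anywhere (an assumption,
   not a real API): reports busy for a fixed number of polls, then idle. */
static int fake_polls_left = 3;

static int fake_speech_busy(void)
{
    if (fake_polls_left > 0) {
        fake_polls_left--;
        return 1;
    }
    return 0;
}
```

In the command line tool, each `while (SpeechBusy()) { ; }` would turn into something like `wait_until_idle(my_speech_busy, 5000, 10)`, where my_speech_busy is a hypothetical one-line wrapper. The caller is still blocked, so this does nothing for the UI stalls. It only keeps the loop from pegging the CPU and puts an upper bound on how long a stall can last.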

Sigh. I’m going to have to do real work. But you don’t have to. RSSafeSpeaker is a simple singleton class designed to make worry-free overlapping speech easy for the Cocoa programmer. Instead of trying to manage a number of open speech channels, this class takes the approach that it’s “good enough” to just allocate and deallocate a channel for every speech made. Obviously for some purposes this will not be suitable, and you’ll want to manage a pool of open channels. For the “everyday, get this done easily” use though, I hope you’ll find this class handy. Rewriting our previous example using RSSafeSpeaker:

NSString* string1 = @"Hello";
NSString* string2 = @"Hello Again";

[[RSSafeSpeaker sharedInstance] speakString:string1];
[[RSSafeSpeaker sharedInstance] speakString:string2];

No crashes! And I get to use NSString. Everything is better. This is a good example of a situation where the shortcomings of Apple’s API caused me grief and made me go to a lot of extra work. But it’s also a situation where the extra work won’t be for naught. It’s a good idea for me to use the “deeper” speech APIs, because it’s inevitable that I’ll want to have finer control over the playback effects in my application. It was just a lot easier to choose “SpeakString” as the quickest solution. If anything else persnickety comes up, I’ll be in an excellent position to respond quickly and effectively. All in all, time well spent!

Oh, and in case anybody was worried, I did report the crashing bug to Apple (rdar://problem/4633582).

Update: Oh man, don’t I feel like a dork! I somehow missed the presence of NSSpeechSynthesizer altogether. Thanks to Jim Correia for pointing it out to me via email. It does seem to work, and doesn’t crash. Of course, now that I’ve got my own infrastructure in place, I might as well keep using it, since it will ultimately give me more control over the playback options. But NSSpeechSynthesizer does seem a better choice for most purposes.

It looks like each NSSpeechSynthesizer corresponds to a “speech channel,” so if you actually want to overlap voices (instead of just causing the previous speech to be canceled), you’d need to allocate multiple speech synths (much as my RSSafeSpeaker allocates a speech channel for each request).

Thanks again to Jim for sharing this! I am embarrassed to have overlooked it…

The Quiet Mac

July 14th, 2006

This post is part of my MacBook Pro Complaints series. Instead of (or in addition to) linking directly to this post, consider linking to the series link, which includes a summary of all findings to date and direct links to the pertinent downloads that users may find useful. Thanks for reading!

It is with a great sigh of relief that I announce the likely end of my MacBook Pro saga.

As recently as a few weeks ago, I expected that the end of this saga would take the form of finding a buyer on Craigslist to take the thing off my hands, but events turned in a more positive direction as I patiently worked my way through the Apple support system.

Eventually I ended up in the hands of a thoughtful representative with a lot of discretion over how best to handle my situation. The amount of bad luck I had endured really accumulated over the past few months. The defects themselves. The repeated failures to fix them. The bad Apple store rep. The slightly mutilated MBP case. Oh, I might have forgotten to mention that. On my last repair shipment, the techies must have put it back together in a hurry, as they left a gap in the case where it meets the DVD-ROM area. I mentioned this in passing to one rep, where it was apparently added to my record.

Finally, after apparently reviewing the list of everything wrong with the Mac and my experiences so far with Apple, the representative offered to send me a replacement. I was wary, at first. Just go through this all again? I’d rather have somebody good at Apple take an honest look at it and replace the parts that are busted. But then I heard rumors of the new MacBook Pro logic board, and couldn’t help but hope that this meant a happy outcome was in sight. I asked my Apple representative if a replacement would be “fresh off the line.” I didn’t want some replacement from March or April, that happened to be sitting in some Cupertino stockpile (as if). He could only say that they tended to be shipped out as fast as they could be made. So I agreed – let’s spin the wheel.

The new MacBook Pro arrived this morning and what can I say? The thing is dead silent. You can trust me, I know what the freaking pain sounds like. Yes Virginia, there is a quiet MacBook Pro. Impressively, I could not even hear a noticeable sound difference with my ear pressed up to the machine while tweaking the slider on QuietMBP. Actually, after plugging the power in I can hear the slightest “sizzle” if I put my ear to the plug where it’s plugged in. And this might be only while it’s charging. Anyway I don’t think it’s loud enough for me to hear from any normal distance, and I bet this amount of sound is present in almost any laptop, ahem, notebook. So after months of negativity it feels good to return to my initial state of just being impressed with the MacBook Pro. Just about everything is “amazingly right.”

Even the surface temperature is slightly cooler than my old MBP. I’m transferring over from the old one right now, so both machines have been turned on the same length of time, and roughly (I assume) doing the same amount of activity. The old MBP’s “hottest spot” (above the function keys) is still so hot that I cannot leave my fingers there for more than a few seconds. The new one is still hot there, but I can leave my finger there indefinitely. Major improvement, even if ideally they could cool things down even more. To be honest, the heat is still frustrating, especially in the summer, but if it’s just the heat, then I’ll shut my whining mouth for a while.

I know at least some of you have been holding off on a new MacBook Pro because of things I’ve said about them here. I’m happy to have helped you avoid a costly and possibly frustrating mistake. But I don’t want Apple to lose customers, goodwill, or sales. I’m a stock holder, for crying out loud! Now that I can happily endorse this (please, nothing go wrong in the next few days), I say get out your credit card and join us!

Update: I’ve been using the new MBP for several hours and my initial impressions are holding up. It’s quiet as a mouse. I love it! But just to show that I’m not completely devoid of crankiness, I have two minor complaints. First, the space bar makes a slight squeaking noise every time I press it. This is definitely new, but it’s so much nicer than the whine. Maybe I can fix it with a little well-placed graphite or something. Second, the lid closes in a way such that unlatching it is slightly clunky. I have to sort of press the button in extra far to get it to pop open. Overall, very minor inconveniences. How minor? There’s no way in hell I’ll risk ruining the happy state of my MBP right now by bringing it in for these teeny issues.

FlexTime Nearing 1.0

July 12th, 2006

Over the past several months I have been putting a lot of work into FlexTime, the project I first announced here back in late December.

The product has been massively (and sometimes drastically) improved since that time, and I’m happy to announce that a new public beta is available for your download, critique, and hopefully enjoyment:

Download FlexTime 1.0b5 (expires two weeks from today)

Note: FlexTime requires Mac OS X 10.4 or later. FlexTime is a universal app.

What is FlexTime?

FlexTime is a generic timed routine scheduling application. Can you tell it’s hard for me to figure out how to summarize it in one sentence? Basically, it makes it easy for users to program complex time-sensitive scheduled activities, where it’s useful to be reminded at regular intervals that it’s time to “move along” to the next activity.

FlexTime turns your Mac into a hard-assed training coach for whatever it is that you do.

Examples of things you might use FlexTime for:

  • Manage the work/play/break ratios for the time you spend at the computer.
  • Practice a stretching or martial art regimen such as Yoga or Tai Chi.
  • Set up a metronome for rhythmic exercises such as dance or music.
  • Arrange for scripts to be run at regular intervals throughout the day.
  • Just about anything that follows a schedule!

I’d love to get feedback about all aspects of the application. For the most part the UI is pretty fixed for the 1.0 release, but future enhancements will undoubtedly bring changes.

Most of all I’d love to hear about any uses of FlexTime you come up with that aren’t on my list! I think the success or failure of this product will be in finding specific uses that resonate with the market. It’s possible the market will reject it for its generic-ness. In other words, a customer who might buy “Yoga Stretcher” could just walk right past FlexTime. But I didn’t want to sell a yoga app at the expense of being useful for hundreds or thousands of other people with different interests.

Caveats

This is a beta release and therefore I have a list of caveats. These basically correspond to the “still needs to get done” list in my project. Hopefully mentioning these here will head off criticism of some of these shortcomings:

  • Documentation is not written yet. Yeah – it should have been done incrementally. I’m bad!
  • Scripting support is not complete. Most of FlexTime’s guts are accessible via AppleScript, but I’ve hit a stumbling block on implementing access to setting the cues via scripting. The difficulties lie in the generic, untyped nature of the cue type. It can be just about anything, depending on the type of cue handler.
  • Per-document UI dimensions are not saved with document. This means if you set up a FlexTime routine’s window size and table columns to look just perfect, it will look crappy again when you reopen it.
  • Document format still in flux. I’m still tweaking the document format, but I’m leaving in “upgrade” mechanisms for all beta releases. With the public 1.0 release, I will maintain all of those upgrade mechanisms, but afterward they will be stripped out. This is just advance warning that if you use FlexTime now, be sure to open and save any important documents once 1.0 comes out. That will be the “official format” from that point forward.
  • Document icon is generic. For 1.0 I will make a “branded” document icon.

The Tough Love of 1.0

In whittling down the feature list of this 1.0 release, I had to make a lot of tough choices. Lots of “would be cool” things are not present, though planned for a future release (assuming anybody likes the product). So perhaps to tease you and perhaps to head off another category of feedback, here is a list of where I see the product going post-1.0:

  • Multiple cues at once. I know it’s very frustrating that you can’t, for instance, both play a sound and show a message at the same instant. To some extent this can be “hacked” in 1.0 by using “0 seconds” long activities, but it’s definitely at the top of the list for future improvement. This is mainly blocked now by the disruption to the UI that such a feature would cause.
  • Export to iTunes. I’d really like to be able to take FlexTime’s audio (and perhaps visual) cues “on the road,” by sending the output to a media file that iTunes can understand and pop onto your iPod.
  • More cue types. FlexTime 1.0 supports a number of very useful cue types, but the possibilities here are endless.
  • Growl integration. FlexTime includes a light-weight “show text message” functionality, but I’m sure some users will appreciate a feature that forwards such requests on to Growl.
  • Printing support. By printing a pretty view of the entire routine schedule, FlexTime could be useful in scenarios where not everybody being cued has access to the video screen.

One Last Question

Before I leave you to try out the program, and open the floodgates for criticism, let me ask one question: What do you think of the word “cue?” Should it be something else instead, such as “action” or “event?” This word choice is a very tough one for me and I’m very open to feedback (reasoned, preferably!).

Thanks for trying FlexTime!

You Own It

July 10th, 2006

One of the rumors buzzing around the internet this past week is that Microsoft is working on a tough competitor to the iPod. Oooh! Shiver me timbers! The chances of Microsoft taking Apple down in the portable music arena are so minuscule that even John C. Dvorak thinks it’s impossible.

I heard him say so on today’s episode of This Week in Tech, one of the very best (and most popular) podcasts available. Another purported impossibility had to do with associated rumors that Microsoft was planning some kind of “buyback” plan for iTunes customers. The idea is that as a lure to switch to their service, Microsoft will offer to give you free Windows Media versions of some number of songs from your existing iTunes library. The consensus seemed to be skepticism that Microsoft could even figure out which songs were the ones you had bought.

What immediately came to mind for me was the AppleScript interface to iTunes which, while pretty weak in some regards, still exposes quite a bit of information about the user’s music library. I was surprised the idea didn’t occur to host Leo Laporte, because I’ve heard him express a fairly high level of knowledge about AppleScript in the past. As a proof of concept, I’ve put together a simple script application. You Own It presents a list of all the purchased music from your library. If you’re at all concerned, or just curious, about what it does, just open it with Script Editor and read the script code yourself.

The crux of this functionality is based on a single iTunes AppleScript request:

every track of library playlist 1 whose kind is "Protected AAC audio file"

I’m sure there are some loose ends here, but if Microsoft really wants to do this, it won’t be hard for them to do it right, or at least 95% right. Anyway, if they’re going to be giving out free songs, chances are they don’t really care if the songs are actually ones you bought from iTunes, or not.