Wil Shipley posted a great article on handling a really, really nasty bug reported by a customer. In the middle of an extremely heavy, spiking support load, a customer reported an extreme case that reliably locked up his machine.
One of the nastiest bugs I ever had to deal with came up while I was working on OS/2. During OS/2 Warp’s development, the window manager and related code had to be ported from a 16-bit code base of mixed assembly and C to pure 32-bit C. A friend had done most of the conversion of the kernel-level mouse code, which is responsible for actually moving the pointer around on the screen and telling the application world about mouse movement. If memory serves, I finished up that conversion. Somehow a tiny error crept in: if you left the mouse in just the right position (on every 8th row or something like that) and then moved it up at just the right velocity, the pointer would go haywire for good.
It was common enough that everyone would hit it after a few minutes, but uncommon enough to be really hard to trigger on demand, even if you somehow knew how to reproduce it. The entire OS/2 development group was suffering from this nasty bug, and it was up to me to figure out what was going wrong. At that time, debugging was limited to stepping through assembly, no matter what high-level language your code was written in. And debugging a kernel-mode mouse driver? Forget breakpoints. It took about two days of staring at the code and dreaming up all the possible code paths before I found the tiny logic hole responsible.