In both cases the OS X and Windows specific code does not amount to much (a few hundred lines each, with plenty of spacing). Doing the Windows code was easier in some respects, as I knew what I was trying to achieve. Not being much of a Windows programmer, though, it was harder in other ways.
Anyway, just thought I'd share another bug I ran into when working on the Windows code. It's pretty similar to the garbage collected callback problem I had before, but subtly different:
    overlap_completion = LPOVERLAPPED_COMPLETION_ROUTINE(completion_callback)
    # do async reads until the thread stops
    while self._running and self.is_open():
        result = ReadFileEx(self._device_handle, report_buffer, len(report_buffer), byref(overlapped), overlap_completion)
        if not result:
            raise RuntimeError("ReadFileEx failed")
        Kernel32.SleepEx(100, 1)
Kernel32 is a ctypes object for kernel32.dll, and ReadFileEx is a function from that object that has had its prototype etc. specified in prior code.
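For anyone who has not set up ctypes prototypes before, here is a minimal sketch of what that prior code might look like. It is not the actual code from the project: `declare_prototypes` and `FUNCTYPE` are hypothetical names, the OVERLAPPED layout flattens the union in the Win32 definition into its two DWORD offsets, and the fallback to CFUNCTYPE exists only so the module can be imported on a non-Windows machine.

```python
import ctypes
import sys
from ctypes import wintypes

# WINFUNCTYPE (stdcall) only exists on Windows; the fallback is an
# assumption of this sketch so that it still imports elsewhere.
FUNCTYPE = getattr(ctypes, "WINFUNCTYPE", ctypes.CFUNCTYPE)


class OVERLAPPED(ctypes.Structure):
    # Simplified layout: the Offset/OffsetHigh pair stands in for the union.
    _fields_ = [
        ("Internal", ctypes.c_size_t),
        ("InternalHigh", ctypes.c_size_t),
        ("Offset", wintypes.DWORD),
        ("OffsetHigh", wintypes.DWORD),
        ("hEvent", wintypes.HANDLE),
    ]


# void callback(DWORD error_code, DWORD bytes_transferred, OVERLAPPED *overlapped)
LPOVERLAPPED_COMPLETION_ROUTINE = FUNCTYPE(
    None, wintypes.DWORD, wintypes.DWORD, ctypes.POINTER(OVERLAPPED))


def declare_prototypes(kernel32):
    """Hypothetical helper: attach argument and return types to the functions used."""
    kernel32.ReadFileEx.argtypes = [
        wintypes.HANDLE,                  # file or device handle
        wintypes.LPVOID,                  # buffer to read into
        wintypes.DWORD,                   # number of bytes to read
        ctypes.POINTER(OVERLAPPED),       # overlapped structure
        LPOVERLAPPED_COMPLETION_ROUTINE,  # completion callback
    ]
    kernel32.ReadFileEx.restype = wintypes.BOOL
    kernel32.SleepEx.argtypes = [wintypes.DWORD, wintypes.BOOL]
    kernel32.SleepEx.restype = wintypes.DWORD
    return kernel32


if sys.platform == "win32":
    Kernel32 = declare_prototypes(ctypes.WinDLL("kernel32", use_last_error=True))
    ReadFileEx = Kernel32.ReadFileEx
```

Declaring argtypes up front means ctypes checks and converts the arguments on every call, which catches a wrongly typed handle or callback early instead of letting it corrupt the stack.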
The above code (with a lot of the context removed) basically performs overlapped reads on a separate thread. This is so I can receive data from a device at the same time as writing to it. If the IO is not done in an overlapped style (e.g. using ReadFile instead of ReadFileEx and not specifying FILE_FLAG_OVERLAPPED when opening the file using CreateFile), the read and write calls become serialised and only one can happen at a time - even if they are running on separate threads.
Anyway, that code worked fine for some basic tests. I was able to read fine and my callback (completion_callback, defined elsewhere) was getting called. However, when I ran my unit tests to give the code a good workout I would (usually) end up with either Python freezing, a MemoryError being raised, or else a straight crash with Windows prompting me to send a bug report. Hmmm. That's pretty familiar.
So I checked my code to make sure I wasn't doing anything wrong (not keeping hold of references to objects etc), but as far as I could tell it was all OK. Quite a few tweaks and partial rewrites ensued. Again I was happy that I could just issue an svn revert to get back to where I'd been.
Then I had a brief thought. My unit tests were constantly opening and closing the device, and starting and stopping the thread running that code, but all from within the same process. So each test wasn't necessarily starting from scratch. ReadFileEx may well have been called by an earlier test, with a callback that had since been garbage collected. If Windows then tried to call that garbage-collected callback, it would produce exactly the symptoms I was seeing.
A bit more thought and it occurred to me that I just needed to stop the callback ever being called again once the thread finished. So I simply added:
after the while loop, to make sure that any pending IO requests were cancelled (for the current thread), thus ensuring my callback would not be called again - prior to it getting garbage collected.
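That description (cancel the IO the current thread still has pending on a handle) matches what the Win32 CancelIo function does, so here is a minimal sketch of the pattern assuming that is the call involved. `run_read_loop` and `keep_running` are hypothetical names, and `kernel32` is passed in as an argument so the loop can be exercised without a real device. The final alertable `SleepEx(0, True)` is also an assumption of this sketch: cancelled requests still report completion (with an aborted status), so it gives them a chance to do so while the callback is still referenced.

```python
import ctypes


def run_read_loop(kernel32, device_handle, report_buffer, overlapped,
                  overlap_completion, keep_running):
    """Hypothetical helper: issue overlapped reads, then cancel what is pending.

    overlap_completion is the LPOVERLAPPED_COMPLETION_ROUTINE wrapper; because
    it is an argument here, it stays referenced until this function returns.
    """
    try:
        # do async reads until the thread stops
        while keep_running():
            result = kernel32.ReadFileEx(device_handle, report_buffer,
                                         len(report_buffer),
                                         ctypes.byref(overlapped),
                                         overlap_completion)
            if not result:
                raise RuntimeError("ReadFileEx failed")
            kernel32.SleepEx(100, True)  # alertable wait, lets callbacks run
    finally:
        # Assumed fix: cancel reads this thread still has pending on the handle.
        kernel32.CancelIo(device_handle)
        # Assumption of this sketch: one more alertable wait so cancelled
        # reads can report back while the callback object is still alive.
        kernel32.SleepEx(0, True)
```

Putting the cancel in a `finally` block means it also runs when ReadFileEx fails and the RuntimeError is raised, which is another way the thread can end with requests still outstanding.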