For users of the OPML Editor, the software Doc was using at BloggerCon to take live notes.
I met so many interesting people at BloggerCon this week that it’s impossible to tell the story of every one of them. Further, there were combinations of people who would not likely have met otherwise, who I saw talking and thought “Wow, that’s power.” Anyway, at dinner on Friday night I sat near a couple of guys from a company in Bellingham, WA, who do all their work in Frontier. A 40-person company. I didn’t know such a thing existed. I suggested that we might fund some work on the Frontier kernel to make our servers run more smoothly, and they were immediately receptive. This led me to start thinking of projects I would commission.
It took a nanosecond to know where the focus would be. Tools for debugging performance issues. Right now if you ask the OPML blogging community what their number one priority is, it’s getting the performance problems worked out on blogs.opml.org. Same if you ask me. I’m struggling to figure out which script is causing the flatlining behavior on the server. This has been the problem in Frontier for years. I’ve got very limited tools to figure this out, but with a little cooperation, the kernel could almost certainly dump the information I need, pointing me to the table that’s getting too large, or the script that’s looping infinitely. That we’ve been stuck here for so long is an indication of poor communication in the community, and a lack of incentives. The technology is very simple.
I’d like the kernel to maintain a log, in a text file, on the local hard disk, of exceptional events. I get to define, to some extent, what is exceptional. I’d like it to dump the address of any table containing more than 10,000 elements, at the moment the table is first brought into memory. thread.getStats is nice for stack dumps, but not much else. I really want to know how many cumulative CPU cycles each thread has used. If I dumped those per-thread totals every minute on a server that has performance problems, the log would tell me which thread to look at for an idea of why the server is getting hung up. The key to debugging these servers is to reduce the number of places I have to look.
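To make the request concrete, here is a rough sketch of the two pieces described above: a text log of exceptional events, and a comparison of per-thread CPU totals between two dumps. It is written in Python purely for illustration; it is not Frontier kernel code or UserTalk. Every name in it (log_event, note_table_loaded, busiest_threads, the table address, the thread names) is hypothetical, and it assumes the kernel could hand over each table's element count at load time and a snapshot of cumulative CPU seconds per thread.

```python
import time

# Threshold from the post: tables with more than 10,000 elements are "exceptional".
TABLE_SIZE_THRESHOLD = 10_000


def log_event(log_path, message):
    """Append one timestamped line to a plain-text log on the local disk."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(f"{stamp}\t{message}\n")


def note_table_loaded(log_path, address, element_count,
                      threshold=TABLE_SIZE_THRESHOLD):
    """Hypothetical hook, called when a table is first brought into memory.

    Logs the table's address only if it holds more than `threshold`
    elements. Returns True if a line was written.
    """
    if element_count > threshold:
        log_event(log_path, f"large table\t{address}\t{element_count} elements")
        return True
    return False


def busiest_threads(previous, current):
    """Rank threads by CPU time used between two snapshots.

    Each snapshot maps a thread name to its cumulative CPU seconds
    (assumed to be supplied by the kernel). Threads missing from the
    earlier snapshot are treated as having started at zero.
    """
    deltas = {name: total - previous.get(name, 0.0)
              for name, total in current.items()}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)
```

Taking a snapshot once a minute and passing consecutive pairs to busiest_threads would put the thread that burned the most CPU in the last minute at the top of the list, which is the "fewer places to look" idea in miniature.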