Scripting News for 4/8/2007

Maybe we’re writing for Google?? 

Last month I went to Boston to be part of the Public Media conference, which I described to everyone I saw there as the NPR conference, even though most of the people there didn’t work for NPR and I knew it.

I was actually trying to make a point, one that otherwise would have taken a lot of words to express, but could be said simply if I was willing to look a little inept and uninformed. The point is this: the distinctions between the different parts of the public media ecosystem are lost on people outside the ecosystem. I tend to think of it all as “public radio” (more today than in the past) and eventually, I think they will too.

Before the Internet, I listened to KQED. That meant listening to shows I wasn’t interested in, like Pacific Time or Latino USA. Now, after having lived in Seattle, Boston and Florida, I’m an NPR listener. I found shows on WBUR that KQED doesn’t carry. My favorite show comes from WNYC. I’m a fan of Diane Rehm who does her work at WAMU, but I first heard her on WJCT. I still listen to Fresh Air from WHYY, but I only listen to the podcast, and only when the program interests me.

In a few years, the transition to the Internet will be so complete that the link between the call letters and a local area will be meaningless. The stations won’t even broadcast. Then someone at NPR will swallow the hard truth that the distinctions mean so little to anyone outside their industry that they might as well just collapse it down and call the whole thing NPR.

Which brings me around to the lecture that my friends and colleagues in the blogosphere have tried to deliver in the last 24 hours to Mr Zell, the new owner of a bunch of big important newspapers.

It could be that Zell is brilliant, and is saying something that simplifies the truth to make a bigger point, and he doesn’t mind if you think he’s inept if some people get the bigger picture, which is that he thinks of the Internet and Google as being the same thing. And you know what? I bet a lot of other people do too, and they have a point. Like the public radio stations, maybe we’re fooling ourselves if we think we’re not writing for Google, as they are fooling themselves into thinking they’re not creating for NPR. We want to cling to our theory that each of us is independent of the others, but what if he’s right, and it’s us vs them? What if his friends in the newspaper business decide they want to compete against us directly? What if my pointers into the LA Times and the NY Times stop working? Or what if he offers you a job to come write for his company so your pointers do work?

So stop and think a bit before you stop listening, and try to get beyond your impulse to dismiss him just because he said something that’s technically inaccurate. He could be smart as a fox.

The Sopranos and Entourage 

It was so great to see new episodes of both shows tonight. I missed Tony and Carmela, Bobby and Tony’s sister (esp the story about her boyfriend, heh). I missed Eric, Turtle, Drama, and I hope Vince gets to play Pablo Escobar, and would you guys just forgive Ari! Lovely lovely lovely. I missed the whole thing. Can’t wait for new episodes of Big Love and my absolute fave, The Wire. I’m a total fan. Love, Dave

Apache internals 

One of the most intriguing comments came from Paul Ding, who suggests that the overhead of htaccess files may be too large a burden to bear and says that one could (clever!) use the file system to do what I was trying to do with the htaccess file. That may be true, but I want to know if Apache really reads and parses the htaccess file for every access. Is it not optimized to store the commands in an internal format and then check the mod date before re-loading and parsing the file? Either way, it doesn’t seem to make a difference on my server, whose performance monitor hovers near the baseline even with lots of commands in various htaccess files.
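To make the question concrete, here is a minimal Python sketch of the optimization being asked about: keep the parsed commands in memory and re-parse only when the file's mod date changes. This is an illustration of the idea, not a description of what Apache does. The names `DirectiveCache` and `parse_directives` are hypothetical, and the toy parser just splits each line into words.

```python
import os


def parse_directives(text):
    """Toy parser: one directive per line, skipping blanks and # comments."""
    directives = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        directives.append(line.split())
    return directives


class DirectiveCache:
    """Keeps parsed directive files in memory, keyed by path, and re-parses
    a file only when its modification time differs from the recorded one."""

    def __init__(self):
        self._entries = {}    # path -> (mtime in ns, parsed directives)
        self.parse_count = 0  # how many times a file was actually parsed

    def get(self, path):
        mtime = os.stat(path).st_mtime_ns  # still one stat() per lookup
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # unchanged since last parse: reuse it
        with open(path, encoding="utf-8") as f:
            parsed = parse_directives(f.read())
        self._entries[path] = (mtime, parsed)
        self.parse_count += 1
        return parsed
```

Even in this sketch the cached path is not free: every lookup still costs a `stat()` call to check the mod date, so the saving is only the read and the parse.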

Sunday morning geekfest 

It’s been a really interesting weekend, most of it out of view of blogging. I have been continuing the project that involved Apache. I’m doing a static rendering of the Harvard site that hosts the RSS 2.0 spec. It’s going well, with the help of the community. It occurs to me that one of the things Scripting News could be is an online tech support workgroup for Apache.

I think we must all go through this rite of passage, because the docs for Apache are so cryptic and inadequate. The design of Apache itself is weak. But it is workable. You know that eventually you’ll puzzle it out, and if you can find the right people to help, they can show you how to do what you need to do quickly and surely.

I’m lucky because the techies who read this site really know their stuff. I know how good they are, because when I’m hunting for answers to Apache questions, the best resources are discussion threads scattered around the web, where people like me asked questions of people like you, and got good answers. But here I got more thorough and informed answers than I saw anywhere else, and most important, they explained the theory behind the solutions, so I could in turn pass on my knowledge later.

The cool thing about Scripting News has always been how smart these people are, how good-natured they are, and how they like to show off what they know! This is a very useful combination of skills.

Anyway, as of yesterday I had completed the exporting of the named pages on the site; they’re all linked into the index page on the new static site. These are, generally, the spec itself, the pages that link from the spec, the example files, and various documents announcing the transition of the spec from UserLand to Harvard ownership. This morning I’m working on exporting the blog posts. Then we come to the comments, and I think I’ll stop there, because there has been so much comment spam on this site that, after the technical work is done, there comes the editorial work of deciding what’s spam and what’s not, and I’ve been very carefully avoiding questions that involve editorial judgement. My goal has been to turn over the content of the site, so the new rendering will be as future-safe as we know how to make a site in 2007. It’s been an incredible learning experience!

13 responses to this post.

  1. Hello,

    I found your blog through the Spanish newspaper “El País”:

    I want to congratulate you because your post of April 1, 1997 was the first one. I have written a post on my blog.




  2. .htaccess files are indeed checked for every request – they’re the ‘shared-hosting’ way of letting you do rewrites without giving you the keys to the van.

    Performance-wise, you want to shove that stuff in your <VirtualHost> definition instead, where no such continual re-checking happens (but where you can’t normally edit anything without admin privs on the server).


  3. […”in the main Apache config file” was missing from that last para.]


  4. You know for a fact that it’s not caching the content of the htaccess files?


  5. Seconding John here – I know for a fact that it’s a reload every single time. That’s not that bad, though. Since it’s read often and modified rarely, chances are it lives in your file system’s cache. I.e. Apache gets it directly from RAM.


  6. I agree re: Zell. If Google doesn’t find you, you are effectively invisible. Yet, from Zell’s point of view, Google is distributing his content. (Not literally, but, again, effectively.) If other people pay him for that, why shouldn’t he expect Google to do the same? After all, if it wasn’t for other people’s stuff, Google would be out of business.

    On NPR: Your argument works, I think, if we ignore local content and the overwhelming requirement to feed content to people listening in their cars. The only way I can think of to do the latter is broadcast radio. (Podcasts require planning.)

    As for the former: Where I live (outside Raleigh, NC) we have a major NPR station, WUNC, that produces national content and a lot of local content. I want that local content to survive, and I don’t think they’d find it economical to produce it if they knew it had to compete for a national audience. On the other hand, we also have a number of other NPR-affiliated university stations, two of which are jazz stations. Both are online, and I’d guess their allegiance to broadcast radio might hinge on whether most of their revenue comes from people who listen locally or from people who listen on the web. But, even there, there’s a significant local flavoring to a lot of their stuff.


  7. Vast numbers of people find newspaper articles via Google. So, for a newspaper to say pay us or don’t carry us, well, they may get their wish. Then they will watch their web numbers drop.

    Zell wants to win through litigation. He has no new ideas.

    Newspapers need the web far more than the web needs newspapers.


  8. Re: Zell. Dave, if he had articulated the theory you are now presenting, “Is what we’re all doing really merely ‘working for Google’?”, I would have responded to that. However, he said something else and that’s what I responded to. As you suggest, I will continue to listen to what he says in the future and will respond to your more philosophically challenging theory if he re-casts his comments in your fashion.

    I’m sure (from all I’ve read today in response to this mini-controversy) that he’s a very smart man.

    I think he believes that he recognized value in deciding to put together this deal (or, I guess it could be one of those decisions rich guys make when they purchase professional sports teams)… And I’m sure the role of Google on the Internet was not a big factor in his decision, and I’m sure the quote that we’re making such a big deal about was merely a throw-away comment in response to a question, not the central part of the speech he was making.


  9. The issue for me is that the current profit model for news content on the Web is simply not sustainable. Google, without producing a scintilla of content on its own (and paying very little for any), makes billions of dollars off of content. Meanwhile, newspapers, which provide the seed content for most news on the Web, find their business imperiled. If newspapers, or more accurately news providers (taking paper out of the equation), don’t figure out a way to fund journalism, Google’s own business falters. Because people don’t want to search press releases and promotional material and content spam, they want good info. Good journalism costs money. Who will pay for it?

    Dan Gillmor and J.D. Lasica smacked me down at a media presentation recently (long before Zell’s epiphany) when I asked about this disconnect and suggested that at least some of Google’s billions ought to go to funding journalism. The usual interpretation is that Google is perfectly legit in searching & linking to newspaper content, but that is not my point. Dan did note that Google out of the goodness of its heart could choose to fund journalism, but I see a huge selfish interest in that for Google as well. Maybe Zell does too.


  10. Posted by Jeremy on April 8, 2007 at 7:00 pm

    Re: .htaccess

    The Apache documentation confirms that .htaccess is loaded every time the directory is accessed. Checking the source would confirm this absolutely, but I think the documentation is pretty clear:

    “the .htaccess file is loaded every time a document is requested.”


  11. Whether we like it or not, there will always be a ‘them’ and an ‘us’.

    It’s all down to ‘how you roll’ in many ways.

    ‘Our’ voice is so much stronger these days.

    Worrying about the ‘mainstream’ media is really a waste of your energy. (Though I admire your tenacity)

    You have an extra-ordinary superpower. You’re a hero.

    Use ‘your’ force.

    Clean up OPML 2.0 😉


  12. Posted by heavyboots on April 9, 2007 at 12:16 am

    re: htaccess

    Dave, on today’s machines you’d probably have to rewrite A LOT of stuff to see any effect. I didn’t understand this read-the-whole-access-file-for-every-access rule back in 1999 when I implemented my first “real” website. I had a basic auth file that initially started out with a couple hundred names in it. Then suddenly the site got popular and before you know it poor Apache was dealing with a 10,000-user unsorted list. o_O

    At that point, yes the server load became a serious issue (loads of 6-10 on a circa 1999 machine!), but luckily I was busy rewriting everything in PHP/MySQL at that point anyway after concluding I hated-loathed-despised Perl, so it all worked out in the end…


  13. Agree re: HBO series. Even when they start to disappoint or don’t measure up to themselves it’s still the best TV and better than most movies.

