Scripting News for 3/1/2007

S3 bill for February 

I got the bill for my S3 usage for last month, the first month I used S3 to replace real deployed servers.

It served the archive of all my old DaveNet essays, Scripting News story pages, and all my podcasts, including some new ones.

The total came to less than $40 for 190.530 GB transferred. Seems like a good deal; it’s worth going forward.
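As a rough sanity check on that bill, here is a minimal sketch of the arithmetic. It assumes the whole bill is attributed to transfer and treats $40 as a ceiling, since the post only says "less than $40". A real bill would presumably also include storage and request charges, so the true per-GB transfer cost would be lower. The function and constant names are illustrative only.

```python
# Upper bound on the effective cost per GB, using only the figures in the post.
# Assumption: the whole bill is attributed to transfer; storage and request
# charges are not broken out in the post, so they are ignored here.

BILL_CEILING_USD = 40.0   # "less than $40"
TRANSFERRED_GB = 190.530  # as stated on the bill


def max_cost_per_gb(bill_usd: float, gigabytes: float) -> float:
    """Return the bill divided by the gigabytes transferred."""
    if gigabytes <= 0:
        raise ValueError("gigabytes must be positive")
    return bill_usd / gigabytes


if __name__ == "__main__":
    ceiling = max_cost_per_gb(BILL_CEILING_USD, TRANSFERRED_GB)
    print(f"At most ${ceiling:.2f} per GB")
```

Under those assumptions the effective rate is no more than about 21 cents per gigabyte served.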

Preserving ideas 

No one really likes to think about dying, but it comes for everyone, eventually, and if you’re living a creative life, as so many of us are these days, maybe you’d like your creations to live at least a little bit longer than you do? Look at it another way, suppose there’s a James Thurber, Mark Twain or Truman Capote or George Harrison among us, wouldn’t that person likely be creating on the web, and shouldn’t their work last longer than their own lives?

A few months ago I decided to start learning about this. I realized that if I were to die now, my web presence might last a month or two, but probably not much longer. Part of my life consists of watching the servers, rebooting them as necessary, clearing out folders containing backups, all kinds of maintenance that my heirs wouldn’t know how to do, and probably wouldn’t want to do. If I want these things to last, I realized, I would have to invest to future-proof the content, as best as I can.

Now my work is probably a bit more fragile than most people’s. You may store your blog at Blogger or LiveJournal, where other people are doing the maintenance, but if you read the user agreement covering your site, what responsibility do they have to keep your site running? You might lose everything, even while you’re alive, and have no legal recourse. I store movies at blip.tv and pictures at Flickr. They sure are convenient, but how do I know they’ll exist in two years or ten? It seems a long shot that they’ll be there in 50 or 100 years.

Then there’s archive.org, which is a very great service, and a good backstop against the failure of our frail systems, but it doesn’t do enough, though it’s pretty close. I’d like there to be a way for me to actually map the domains they’re archiving to point into their space, so links into my domain wouldn’t break if we had to rely on their backup. In that case, there might be a part of my will where I leave $100 to archive.org or $1000, to do the domain transfers it would take so that the links into my sites won’t break. As far as I know they don’t now offer such a service, so it would be virtually impossible for me to request it in my will.

Another question — is archive.org permanent enough to trust with the backups? Over what period of time? Am I willing to accept the limit, that after my demise, my work will live as long as archive.org? I’d rather put my faith in a more long-lived entity. At a breakout session at the Beyond Broadcast conference with Harvard professor and mentor Charlie Nesson, he suggested perhaps Mount Auburn cemetery might be permanent enough. Interestingly I had thought of Mount Auburn, but I said that I would prefer if Harvard, a university that’s been around since 1636, were to sponsor this service, perhaps in conjunction with its vast and highly respected library. Harvard, partnering with archive.org, now that’s beginning to sound like something a lot of people would trust with their intellectual and creative legacy.

In fact, I’d propose that this would be a venture that Amazon, with its excellent S3 service that’s become so popular with developers, may wish to lend its good name to as well. And Google, Apple, IBM, Microsoft, Sun, the EFF, Stanford, Larry Lessig, you name it, the more organizations and trustworthy people helping the better. I’d like to encourage archive.org to implement the same API as S3, and I’d like to encourage Amazon to let them. If there’s any stickiness, let’s get everyone in a room with Charlie, who’s such a pleasure to listen to, it’s hard to imagine anyone saying no to him! :-)

At breakfast this morning, Jeff Ubois, who has made this area his life’s work, noted that it seems a lot of people are thinking about this now. Indeed, they are. There are some huge ideas here. Why now? Well, after almost ten years of blogging, there might be something worth preserving. Being 51, and having survived a life-threatening illness at 47 makes me aware that there’s no time like now. I’m already caring for the archive of my uncle who passed in 2003. What will become of his blog when I pass? To have it disappear then is simply not acceptable. And as a software developer, I want to be sure I have answers for less technical people, for the Dostoyevsky or Huxley among us, for the Picasso or Chagall, for the Ives, Copland and Berlioz. We believe digital is better, but how will people know how we lived two or three generations from now? That’s a problem I want to work on.

25 responses to this post.

  1. Dave: on the note of archiving, I’m wondering if you’d have any objection to me creating and distributing Palm Plucker files of the archive of Scripting News and/or DaveNet. Basically, they’re downloadable PDB files containing a set of offline HTML files, with the relevant images that go with them. A Palm Pilot user can then download them, store them on a SD memory card and read them offline.

    I’ve been reading them month at a time in spare minutes – on the bus, train, while waiting in line etc. It’s certainly a great experience to read. I actually started using the Internet in 1996, so reading the archive of Scripting News from 1997 onwards has been most interesting. It’s amazing how many of the posts are still as relevant today, mostly because the same damn silly things keep happening, albeit with different names and different faces.

    With podcast archiving, I keep a detailed backup of almost everything I listen to. I’ve just popped open my archiving application (CDFinder), and found that I’ve got a lot of MCN, a few appearances of you on other shows (Rocketboom, CalacanisCast) and all the audio from BloggerCon III and IV. All can be found in my personal archives just by using “winer”, “scripting”, “mcn” and “bloggercon”, along with masses of other podcasts.

    As Cory Doctorow likes to say “bits are only going to get easier to copy”. That’s something that makes me happy, but scares a lot of people absolutely shitless.

    Reply

  2. Posted by mark wilson on March 1, 2007 at 2:16 pm

    I like how you compare your footprint on this earth with Mark Twain’s and George Harrison’s. Does your ego know no bounds?

    Reply

  3. Tom, I’d prefer if you didn’t distribute them, or send me an example of what they look like so maybe I can make my CMP spit them out.

    Reply

  4. One problem, for someone as spread out on the web as you, would be to keep paying for all the domain names.

    One way would be to grant $300 per domain to a trust, so the interest would pay the hosting indefinitely (although inflation would bring the price over $9/year after a while).

    Another option would be to forget about keeping everything alive on its original domain, and instead build a big static HTML archive of all your stuff, with an index pointing to each page, then put a big note at the top of each page saying “This page was originally published at http://archive.scripting.com/2007/02/28/whatever”. Then you could store multiple copies of the archive around the place, and people could find your posts with Google, or whatever is the dominant search engine at the time. Your domains would disappear, but your content would remain as long as *someone* hosted an archive of it.

    Reply

  5. Mark, it’s cool that you like it, but where do I compare my “footprint on this earth” with that of the people you mention?

    Reply

  6. That’s a good point Phil, and one that hasn’t yet made it into the queue of issues to look at. Is there some amount of money ICANN will accept for a perpetual license to a domain name. Or does that even make sense.

    Reply

  7. Dave,
    I’m interested in what sort of bandwidth that amounts to in total/average. That’s cheaper than I rent my server for, which hosts many things/domains/systems/scriptdoodles.

    I really like what Amazon are doing, and once I get the issue of organising podcast feeds and evangelising opml out of the way, and on a roll, then the hosting issue will be my next issue.

    There are huge opportunities for partnerships at this point for so many systems. Suddenly it could all *work*.

    Check out Libsyn; they just finalised their acquisition. ;)

    I think that Usenet binaries could be used for 30 day limit (usually) distribution. That would at least provide plenty of seeds out there to start things off like bittorrent. You can connect to usenet using IMAP.

    Maybe then Tom could move the content from Usenet to Archives – as long as he can change the url on the original enclosures. ;)

    I think when all the issues of organisation, discovery, navigation, distribution (which opml and rss do) are sorted and ‘standardised’ as much as they can be, then we might be well on the way towards a truly semantic ‘web’ of content – of whatever type.

    rockin’. ;)

    Reply

  8. btw: alt.binaries.podcast does exist on Usenet

    Not many people post to it. But some have. I’m pretty sure PHP could handle the IMAP transfer after a file upload, so it should be straight forward for someone with the time to write it.

    What’s needed are clients which support nntp/imap, which have been around for ages.

    In fact, isn’t that what Guba.com is based on?

    Eventually I’d like to set up a ‘lab.podcast.com’ to attempt to deal with these issues. All with a discussion forum system I’m devising which will have opml at the heart of it.

    I call it ‘phreadz’ – phat threads – opml plus rss to the power of users

    thx Dave

    Reply

  9. It seems like at least part of the answer is a distributed system. Harvard and other institutions are impressive in their longevity, and are certainly well qualified for preserving physical artifacts. In the virtual realm though, why should we have to rely on any one institution? Bits can be duplicated easily, after all.

    Reply

  10. Check out Mission Eternity, the new etoy.com project.

    Reply

  11. Posted by Jim Posner on March 1, 2007 at 4:28 pm

    Books are always a good way to store data and have been for a thousand years. Chances are they will be for another thousand.

    Reply

  12. I dunno if I like the idea of ICANN letting people pay for domains in perpetuity. Seems like it would be a great way for domain squatters to fence off large portions of the domain namespace. Certainly there’d be more of those types of people doing it than people legitimately interested in long-term archiving of data…

    Reply

  13. and one more thing, I agree with Jim Posner’s comment above, at least to some degree. I think your definition of “perpetuity” is too limited. So far there’s only one method that’s ever been conclusively proven to work for archiving messages in perpetuity: chiseling them onto stones and burying the stones in the desert — and even with that, we’ve probably lost all but a tiny fraction of the work that was ever archived that way.

    Maybe I’m just a pessimist :-)

    Reply

  14. A very Dave Winer idea from BusinessWeek, saying Apple should use $1b of its cash to fund Mac startup companies:

    http://www.businessweek.com/technology/content/mar2007/tc20070301_402290.htm

    I have no connection to the story, saw it on the Fake Steve Jobs blog.

    Reply

  15. Posted by tim on March 1, 2007 at 5:18 pm

    “Look at it another way, suppose there’s a James Thurber, Mark Twain or Truman Capote or George Harrison among us, wouldn’t that person likely be creating on the web”

    No. Take the example of William Blake. His greatest works were largely unknown in their time and it took many decades to develop a strong following.

    Part of his ability to last was handcrafting his own work and providing it only in limited (sometimes 10s sometimes 100s of copies) editions.

    Producing high quality, truly crafted work can last longer if it gets discovered by just one person than smearing crap all over the web and hoping that by archiving and duplicating it gets to people.

    Reply

  16. Posted by Jim Posner on March 1, 2007 at 6:47 pm

    Books give you a proven voice to speak to the future. Blogs not so much.

    Reply

  17. Dave,

    Thanks for a nice, thoughtful post.

    Maybe 12 years ago I organized a symposium on digital photography for the Ansel Adams Gallery in Yosemite and Carmel. One of the speakers was the curator for photography at the Los Angeles County Museum, Robert A. Sobieszek. He asked at the time if it was the responsibility of the museum to collect monitors as well as the works, so that the work would be represented as originally created.

    If you get Amazon, Harvard and archive.org on board, you may need to get AOL to replicate their dial up service of 1995 and apple to give some old systems so that the experience is truly as it was.

    Jeff Tidwell
    Midpines, CA

    Reply

  18. Dave, I’m not sure of the technical side of how Plucker files are produced. I use Sunrise Desktop – a Java application that I have installed on Linux – to basically pull a webpage (I use the ‘year’ page on htmlarchive.scripting.com). It then pulls in that page and each of the monthly pages, renders them up and turns them into a PDB file. Browsing pages in Plucker is very similar to browsing pages in a web browser, except they’re stored on the memory card (or other storage) on the PDA.

    I’m not sure that there’s a particularly easy way of showing this. I pulled one of my experimental files – Scripting News for 1999 – loaded it on my Palm and snapped away. The quality isn’t great, but you can sort of see the user experience (which isn’t that different from a web browser).

    The picture I forgot to shoot was one depicting the way that the Palm can be used landscape as well as portrait.

    Even so, here is the set of photos of SN 1999 on Plucker on my Palm TX. Start at the first one and browse through (they have descriptions of some of the UI, so the Slideshow is best avoided):
    http://flickr.com/photos/tommorris/sets/72157594564744329/

    plkr.org has some screenshots:
    http://www.plkr.org/gal

    Reply

  19. Good idea.

    You might try something like http://www.ljbook.com/ to export your blog to a book in pdf format. Might be a piece of the puzzle or short term solution. It uses XML-RPC to export your blog entries.

    If a standard emerges among service providers perhaps they could sort of “RAID” the content among themselves so if one dries up the others could still persist the content. Or more grassroots you could do it p2p style. Just thinking out loud, maybe storing a torrent like seed on a provider and only storing more info there if there wasn’t a sufficient number of peers housing the content…

    Reply

  20. Posted by billg on March 2, 2007 at 7:40 am

    Thinking about the really long term, isn’t it true that ink on quality paper is liable to last longer than any digitized medium? I’m under the impression that data on things like CD’s and hard drives will become increasingly unreliable over time. True?

    There’s also this: While the net seems certain to survive, who knows if it will continue using the same kind of hardware architecture it does today? If our descendants deploy, say, a quantum web using zero-energy storage fields, will they bother migrating all that stuff sitting on those quaint metal servers?

    Reply

  21. Books can burn or get wet just as easily as servers can crash and CDs can deteriorate. Probably the best tool to make sure your stuff lives on longer than you do is people, people who are dedicated to preserving your memory. How many unknown authors were there 100 years ago whose stuff has been lost forever, because nobody cared enough to save it? Publishing on paper didn’t help them. Whereas the Mark Twains, et al, had people who dedicated themselves to preserving their work, and still do. Even John Kennedy Toole had his mother.

    I think Dave has enough friends and fans who would busy themselves archiving his writings and setting up “Dave mirrors” if he were to leave us. As for the rest of us, I guess we just have to make enough of an impact that somebody out there wants to do the same for us.

    Reply

  22. Scott: there is an easier way to get perpetual fame than through blogging. Becoming a serial killer, for one. :)

    Reply

  23. Posted by Kevin Kirwin on March 3, 2007 at 9:06 am

    Take a look at what Franklin set up to run for 200 years.
    The results of his bequest are different than he envisioned but still active.

    http://www.bfit.edu/aboutus/history.php
    The Codicil Bequest: 1789

    Kevin

    Reply

  24. > I’m already caring for the archive of my uncle who passed in 2003. What will become of his blog when I pass? To have it disappear then is simply not acceptable. And as a software developer, I want to be sure I have answers for less technical people, for the Dostoyevsky or Huxley among us, for the Picasso or Chagall, for the Ives, Copland and Berlioz. We believe digital is better, but how will people know how we lived two or three generations from now? That’s a problem I want to work on.

    I think that these are questions a lot of people eventually ask themselves and have also been addressed by many philosophers.
    Perhaps, regardless of what actions we as individuals take to preserve our work, things are structured in such a way so that if a piece of work is worth preserving, it is so automatically (say a symphony, a novel etc..etc..) and thus withstands the test of time.
    Of course one can also add that many things that may have been worth preserving have been and are continually being destroyed and we never find out about them therefore we may never know what was really worth preserving.
    Anyway, good luck on your quest and know that our thoughts are with you.

    Reply

  25. Dave –

    Saw this video of some beta Apollo/S3 integration and thought you might enjoy it — some cool desktop/S3 integration there via some libraries that will be opensourced.

    http://video.onflex.org/2007/05/08/christian-cantrell-apollo-beta-sneak-peak-amazon-s3-client/

    Cheers –

    Bryan

    Reply
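The trust idea floated in comment 4 (grant $300 per domain and pay the fee out of the interest) can be checked with a little arithmetic. What follows is only a sketch under stated assumptions: the $300 principal and $9/year fee come from the comment, while the 5% interest rate and 3% inflation rate in the usage note are hypothetical numbers chosen for illustration. It also assumes only the interest is ever spent, so the principal never grows or shrinks.

```python
# Sketch of the endowment idea from comment 4: put a lump sum in a trust and
# pay the yearly domain fee out of the interest alone.
# The $300 principal and $9/year fee come from the comment; any interest or
# inflation rate passed in is a hypothetical input, not a figure from the post.

from typing import Optional


def break_even_rate(principal: float, yearly_fee: float) -> float:
    """Interest rate at which the yearly interest exactly covers the fee."""
    return yearly_fee / principal


def first_shortfall_year(principal: float, interest_rate: float,
                         yearly_fee: float, inflation: float,
                         horizon: int = 500) -> Optional[int]:
    """First year in which the inflated fee exceeds the yearly interest.

    Only interest is spent, so the principal stays constant. Returns None
    if no shortfall occurs within `horizon` years.
    """
    interest = principal * interest_rate
    for year in range(horizon + 1):
        fee = yearly_fee * (1 + inflation) ** year
        if fee > interest:
            return year
    return None


if __name__ == "__main__":
    print(f"Break-even rate: {break_even_rate(300, 9):.1%}")
    print("Shortfall year:", first_shortfall_year(300, 0.05, 9, 0.03))
```

So $300 covers a $9 fee only if the trust earns at least 3% a year, which is the caveat the commenter raises about inflation. With the made-up 5% return and 3% inflation, the fee overtakes the interest in year 18; a trust meant to last generations would need a larger principal or would have to reinvest part of the interest.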
