The Growing Web

I started working on notes from the NDIIPP Web at Risk meeting last week, but it will probably be a few days before I finish. I've been distracted by training a student to do some metadata quality-assurance for our A to Z Digitization Project (and isn't that a gorgeous collection?). However, the meeting was great, so rest assured I'll report soon.

Related to the NDIIPP meeting, this morning I pulled some CyberCemetery web stats (to help the NDIIPP team predict collection sizes). It's amazing how little space the earliest websites in the CyberCemetery take up, and what mammoths the websites I'm archiving now are. Currently, the CyberCemetery consists of 45 websites that take up 27 GB; I'm predicting that by fall this year, the CC will have 55 or more websites taking up closer to 45 GB.
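To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the prediction means exactly 55 websites and 45 GB, and that the 45 sites already archived stay the same size; the function name `growth_summary` is just something made up for illustration.

```python
def growth_summary(sites_now, gb_now, sites_later, gb_later):
    """Compare average site size now with the average implied for new sites.

    Assumes existing sites do not change size, so all added storage
    is attributed to the newly archived sites.
    """
    new_sites = sites_later - sites_now
    new_gb = gb_later - gb_now
    avg_now = gb_now / sites_now      # average GB per site today
    avg_new = new_gb / new_sites      # implied average GB per new site
    return {
        "avg_now_gb": avg_now,
        "avg_new_gb": avg_new,
        "ratio": avg_new / avg_now,   # how many times larger a new site is
    }


if __name__ == "__main__":
    # Figures from the post: 45 sites / 27 GB now, roughly 55 sites / 45 GB by fall.
    summary = growth_summary(45, 27, 55, 45)
    print(f"Average now:      {summary['avg_now_gb']:.1f} GB per site")
    print(f"Implied for new:  {summary['avg_new_gb']:.1f} GB per site")
    print(f"New sites are about {summary['ratio']:.0f}x larger")
```

Under those assumptions, the current average works out to 0.6 GB per site, while each of the ten new sites would average about 1.8 GB, roughly three times as much.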

Factors that appear to be playing a part in website growth:
  • content (number of pages/links)
  • design (images, CSS files, and other information used for layout/navigation purposes)
  • multimedia (more images, audio, and video per website)
    • I guess you could even narrow this last point further and say that multimedia files themselves are growing larger as images and videos improve in quality (resolution, number of megapixels).

I hope to have some time this weekend to edit/post photos from last week's trip, and will post images from Oakland and from San Jose's amazing Dr. Martin Luther King, Jr. Library as soon as they're up.
