laze.net - Tiny Archives

After writing about digital decluttering, Alex Chan wrote a pair of posts discussing how she uses static HTML and CSS with a touch of JavaScript to maintain local “tiny archives.” She is currently writing a series specifically about her local bookmark archive. I love the bookmark implementation, but the original posts inspired me most because of their simplicity and how much sense they made.

She starts:

Over the last year or so, I’ve been creating static websites to browse my local archives. I’ve done this for a variety of collections, including:

paperwork I’ve scanned

documents I’ve created

screenshots I’ve taken

web pages I’ve bookmarked

video and audio files I’ve saved

I create one website per collection, each with a different design, suited to the files it describes. For example, my collection of screenshots is shown as a grid of images, my bookmarks are a series of text links, and my videos are a list with a mixture of thumbnails and text.

One of the issues I’ve had with my ~~data hoarding~~ personal archiving is that I have absolutely not been organized in my approach. Or, perhaps it’s more accurate to say that I’ve been organized in too many different ways. Some folders in my collection are nicely sorted, named, and titled, but in others I’ve taken a different approach. And even in the well-sorted folders, the files are a mess without any context. As I start to think about how someone else might go through my collection after I’m gone, I’m not sure they’d know what they were looking at or why I considered it valuable.

Alex writes about trying various tools to organize her archives, eventually settling on “turning folders into mini-websites”:

I could create an HTML file in the top-level folder, which could be an index – a list of all the files, displayed with all the custom metadata and tags I wanted.

This allowed me to radically simplify the folder structure, and stop chasing the perfect hierarchy. In these mini-websites, I use very basic folders – files are either grouped by year or by first letter of their filename. I only look at the folders when I’m adding new files, and never for browsing. When I’m looking for files, I always use the website. The website can use keyword tags to let me find files in multiple ways, and abstract away the details of the underlying folders.

HTML is low maintenance, it’s flexible, and it’s not going anywhere. It’s the foundation of the entire web, and pretty much every modern computer has a web browser that can render HTML pages. These files will be usable for a very long time – probably decades, if not more.

She quotes Jessamyn, who writes that “no one is ever starting at the beginning” with regard to organizing their data. The result is a lot of work, and there’s no way around it. But it’s enjoyable work: digging through sloppily organized old collections, unearthing gems, and figuring out how to organize and present each of them (or, perhaps, explicitly deciding not to do so).

This was one of those pieces of writing that made so much sense to me, but I had to sit with it for a few days to figure out how I was going to act on it. I had a few questions I rolled over in my mind during that time:

Exactly how tiny/simple would I go?
Would each archive be its own project with its own distinctive approach?
What collections could I immediately think of that were worth organizing this way?
Were there ways I could leverage other tools or previous work to speed along the process at all?
How could I start with a basic design that could be easily built on or changed with time?

Thinking Through My Tiny Archive

Before answering any of these questions, a brief word on how I envision the purpose of my archive. This is the archive I’d like someone to be able to pick up after I’m gone, plug into a computer, and then be able to browse what I found useful to save and share. It’ll be a nice mixture of my own preservation projects, things I’ve written or produced on the web over my lifetime, and other personal ephemera I care about. Sure, there will probably be other drives full of stuff to go through, but this is what I hope will be the clearest part of my digital legacy. Shoot, maybe someone will even attach the drive to a laptop at my funeral and let people look through it.

To answer the first question above: I’m taking the simplest possible approach. So far, I’m sticking with HTML and CSS. I may use a bit of JavaScript in the future, but right now, I’m not.

For question #2, I’m also going simple. There is a base stylesheet and index.html template I’m using for all projects. Individual projects may get bits of custom JS functionality or slightly different design approaches, but by and large, the archive will be relatively uniform in design.

Question #3 - The Collections

I started by brainstorming what types of things I might want to include in my own archive. Here’s a sample:

Family history-related projects (which would include the books I’m writing, the research I’ve done, photos I’ve digitized)
My social media content over the years
Audio files (like the podcasts I’ve produced, the podcasts I’ve been on, music I produced and recorded, interviews I’ve done)
Video files (VHS rips, things I’ve posted to YouTube)
Web projects (if not browsable local versions, at least a collection of the “final archive” versions in zip files)
Email (might be interesting to build a searchable archive of all the email I’ve received)

Question #4 - Leveraging Other Tools

Some social media exports provide your data with a locally viewable HTML interface. As Alex notes, Twitter does this (or, alternately, you could use Darius Kazemi’s great Twitter archive viewing tool). There’s definitely no harm in extracting that archive and linking to it as part of your Tiny Archive.

Or maybe you want to take a large JSON file and make it more easily viewable and browsable. Leverage existing scripts if you don’t feel like writing your own.
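If you do end up writing your own, it doesn’t have to be much. Here’s a minimal Python sketch of the idea, not something taken from Alex’s posts or from any particular export format. The field names `title` and `url` are made up for illustration, so swap in whatever your JSON actually contains.

```python
import html
import json


def json_to_html_list(json_text, title_key="title", url_key="url"):
    """Turn a JSON array of objects into an HTML <ul> of links.

    title_key and url_key are hypothetical field names; adjust them
    to match whatever the export actually contains.
    """
    records = json.loads(json_text)
    items = []
    for record in records:
        # Escape values so stray <, > or & in the data cannot break the page.
        title = html.escape(str(record.get(title_key, "Untitled")))
        url = html.escape(str(record.get(url_key, "#")), quote=True)
        items.append(f'    <li><a href="{url}">{title}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


sample = '[{"title": "Tom & Jerry", "url": "https://example.com/a"}]'
print(json_to_html_list(sample))
```

The output is plain markup that can be pasted into a page by hand, so the archive itself stays free of any dependency on the script.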

Even on a base level, I found myself looking for a simple CSS file I could start with. I settled on readable.css as a base. I call it from my own style.css, where I store my own styles. This also answers question #5, allowing me to start with a basic design that I can change over time with relative ease.

How I Implemented It

I should note that I only just started this project and I suspect it will take many twists and turns in the coming years. But, it’s underway.

First, I grabbed a fresh 5TB drive to start with.

Then, I planned out my basic file structure. Here’s what I came up with:

Home Directory/
├── index.html
├── readable.css
├── style.css
└── My Special Podcast/
    ├── index.html
    ├── My Special Podcast 001.html
    └── Archive/
        └── My Special Podcast 001/
            └── [assorted files]

My index.html template looks like this:

<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="utf-8">
  <meta name="description" content="TK">
  <title>TK (Archive/Page Name)</title>
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <meta name="author" content="Me!">
  <link rel="stylesheet" href="/style.css">
</head>

<body>

    <div id="breadcrumb"><a href="/index.html">Home</a> > TK</div>
  
    <h1>TK (Archive Name)</h1>

    <div class="archive-description">
        <p>TK (Archive Description)</p>	
    </div>
    
    <div class="archive-details">
        <h2>Archive Files</h2>
    
        <ul>
            <li>TK (lists, links, etc.)</li>
        </ul>
    </div>

</body>
</html>

I copy it and replace the TKs with the relevant info. (Pro tip: since this is a local archive with no web server, you’ll always need to link to directory/index.html rather than just directory/.)
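I do the copy-and-replace by hand, but it could be scripted. The Python sketch below is one hypothetical way to do it, using a shortened stand-in for the template. It assumes the bare TK in the breadcrumb should get the archive name, and it leaves the “lists, links, etc.” placeholder alone for hand editing. The function and variable names are made up for illustration.

```python
import html
import re

# Shortened stand-in for the template above; only the TK lines matter here.
TEMPLATE = """<title>TK (Archive/Page Name)</title>
<div id="breadcrumb"><a href="/index.html">Home</a> > TK</div>
<h1>TK (Archive Name)</h1>
<p>TK (Archive Description)</p>
<li>TK (lists, links, etc.)</li>
"""

# Matches a bare TK, or a TK followed by one of the labels used in the template.
PLACEHOLDER = re.compile(
    r"\bTK\b(?: \((Archive/Page Name|Archive Name|Archive Description"
    r"|lists, links, etc\.)\))?"
)


def fill_template(template, name, description):
    """Fill the name and description placeholders in a single pass."""
    safe_name = html.escape(name)
    safe_description = html.escape(description)

    def substitute(match):
        label = match.group(1)
        if label == "lists, links, etc.":
            return match.group(0)  # left untouched for hand editing
        if label == "Archive Description":
            return safe_description
        return safe_name  # bare TK and both name labels

    return PLACEHOLDER.sub(substitute, template)


print(fill_template(TEMPLATE, "My Special Podcast", "Every episode in one place."))
```

Doing the substitution in one pass means an archive name that happens to contain “TK” is not replaced a second time.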

Getting down to the individual archive item page can get pretty interesting. For instance, on an individual podcast episode, I can embed audio, add in the original description it included, and include a transcript, if I want. Now, I may not be able to do this right off the bat with a 20-episode podcast, so maybe initially I just link to the raw directories with the files and then go back and add individual episode archive pages over time.
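For that first “just link to the raw files” pass, a small script can write the list items. This is only a sketch under a few assumptions: the files live somewhere beneath the folder that holds the index.html, links should be relative to that folder, and the function name `link_list` is made up for illustration.

```python
import html
from pathlib import Path
from urllib.parse import quote


def link_list(archive_dir, page_dir):
    """Build <li> links to every file under archive_dir.

    Links are relative to page_dir (the folder holding the index.html
    that will contain them), so they keep working when the drive is
    mounted at a different path. archive_dir must be inside page_dir.
    """
    archive_dir = Path(archive_dir)
    page_dir = Path(page_dir)
    items = []
    for path in sorted(archive_dir.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(page_dir).as_posix()
        # Percent-encode spaces and other special characters for the href.
        href = quote(relative)
        label = html.escape(relative)
        items.append(f'<li><a href="{href}">{label}</a></li>')
    return "\n".join(items)
```

Calling `link_list("My Special Podcast/Archive", "My Special Podcast")` from the home directory would return one `<li>` per file, ready to paste into the Archive Files list of that podcast’s index.html.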

This is one point where I’ll depart a bit from Alex’s approach. She noted “I only look at the folders when I’m adding new files, and never for browsing. When I’m looking for files, I always use the website.” While I’m still thinking of the web interface as the primary interface, I may not link to every single file that exists on the drive. For instance, with a podcast episode, the folder could also contain a WAV version of the file and even the working project files from Audacity. On the web version, the most important stuff will be linked directly, and ideally the page will also give a brief overview of what other content lives in the directory. I haven’t decided yet whether I’ll link to the directory and let people browse it in their web browser or encourage them to open the path in Explorer/Finder.

Conclusion

After years in “the game,” I’m tired of trying to deal with changing trends, new frameworks, dependencies, and any reliance on outside providers. So, when the opportunity presents itself to approach a project in an extraordinarily simple way that is both forward- and backward-compatible, I’m jumping on it. These bits may not have the longevity of lignin-free paper stored in a temperature-controlled archival situation, but I do feel a lot better knowing that I’m building a tiny archive that will be readable a good way into the future no matter what other technologies may jump to the front of the line.

And, oh yeah… backups

Hey: remember to back stuff up. I’m doing this manually by periodically making a zip file and uploading it to a secure cloud storage service. I’ll likely also make another local copy of it and perhaps once a year rotate that drive to a friend or family member to keep at their house.
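The zip step is manual for me, but it could be scripted. Here’s a hedged Python sketch of one way to do it. The file name pattern and the `make_backup` function are made up for illustration, and it assumes the backup folder sits outside the archive so the zip never tries to include itself.

```python
import shutil
from datetime import date
from pathlib import Path


def make_backup(archive_root, backup_dir, today=None):
    """Zip archive_root into backup_dir as tiny-archive-YYYY-MM-DD.zip.

    backup_dir is assumed to live outside archive_root.
    """
    archive_root = Path(archive_root)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = (today or date.today()).isoformat()
    # make_archive appends ".zip" to the base name itself.
    base_name = backup_dir / f"tiny-archive-{stamp}"
    zip_path = shutil.make_archive(str(base_name), "zip", root_dir=str(archive_root))
    return Path(zip_path)
```

The dated file name keeps older backups from being overwritten. Uploading the result to cloud storage is still a separate step.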