Over Thanksgiving vacation I decided I wanted to see how various subreddits were connected and what their relative sizes were. Some projects seemed to tackle this goal, but I didn’t like the interface or how old their datasets were so I started to build my own. I began by building a scraper that would look at a specific subreddit, find the related subreddits section (for this I used PRAW), parse the section for the subreddit links, and build a map of the connections.  The application would visit each subreddit referenced until all known subreddits were visited.  The code for the crawler is here:

Since crawling the entire site (with rate-limiting) took a couple of days, I eventually updated the crawler to insert the additions into MongoDB. This ensured progress would not be lost if the application crashed or the internet connection was interrupted. Once the dataset was generated I wanted to make an interactive graph anyone could access on the internet. So first I needed a simple web server that would accept a few parameters: subreddit, graph depth, and nsfw. Without much trouble I got a Flask server to return a static graph image using Python’s graphviz library. I had a little experience with Heroku, so I decided to put my current work up there.
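The traversal described above boils down to a breadth-first walk. The sketch below is not the crawler's actual code; it assumes a hypothetical fetch_related(name) callable that returns the names found in a subreddit's related section, and it uses made-up stand-in data instead of live requests.

```python
from collections import deque


def crawl(start, fetch_related):
    """Breadth-first crawl; returns {subreddit: [related subreddits]}."""
    graph = {}
    queue = deque([start])
    while queue:
        name = queue.popleft()
        if name in graph:
            continue  # already visited
        related = fetch_related(name)  # hypothetical lookup, one call per subreddit
        graph[name] = related          # a real crawler would persist this step
        queue.extend(r for r in related if r not in graph)
    return graph


# Made-up data standing in for the site
FAKE_SITE = {
    "python": ["learnpython", "programming"],
    "learnpython": ["python"],
    "programming": ["python", "compsci"],
    "compsci": [],
}

graph = crawl("python", lambda name: FAKE_SITE.get(name, []))
print(sorted(graph))
```

Writing each subreddit's entry to the database at the point where graph[name] is assigned is what would let an interrupted crawl resume instead of starting over.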

Having a public interface to my project, I was emboldened to improve the usability and wanted to try out D3.js.  From the D3.js homepage I found an interactive graph example that would suit my needs. After altering the graph data to match the D3.js format I was able to get what I wanted working in JavaScript. This opened a lot of options for me to make the nodes draggable, turn the nodes into links, change the size of the nodes to represent the subscriber counts, and dynamically color the nodes and links to make the graph look more attractive.
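As a rough illustration of the data reshaping, here is a sketch that turns an adjacency map into a nodes array plus a links array whose source and target are indexes into nodes, which is the shape I understand the force-directed graph examples to consume. The field names "name" and "size" and the sample numbers are assumptions made for the example.

```python
import json


def to_d3(graph, sizes):
    """Convert {name: [related names]} into a nodes/links structure."""
    names = sorted(set(graph) | {r for rel in graph.values() for r in rel})
    index = {name: i for i, name in enumerate(names)}
    nodes = [{"name": n, "size": sizes.get(n, 0)} for n in names]
    links = [
        {"source": index[src], "target": index[dst]}
        for src in sorted(graph)
        for dst in graph[src]
    ]
    return {"nodes": nodes, "links": links}


# Made-up sample: two subreddits and one subscriber count
sample = to_d3({"a": ["b"], "b": []}, {"a": 10})
print(json.dumps(sample, indent=2))
```

Carrying the subscriber count along on each node is what makes it possible to scale the node radius on the JavaScript side.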

subredditor sample graph

I still have a lot of changes I want to make to the project when I have time. The database currently uses MongoLab’s free tier, which makes the deployment a lot slower than my development environment. I eventually want to update the crawler to use PostgreSQL’s hstore so I can leverage Heroku’s PostgreSQL support. Likewise, while deploying on Heroku is very convenient, it also imposes many constraints. Migrating to a VPS would force me to work at all levels of the deployment.

The code:

Live instance:

Time and Time Again

My brother likes to write a lot of fiction, so I was challenged to write some too.

Story time:

It’s cold down here, how far down I may never know.  Yet depth is inconsequential after all, it’s not like my body is sturdy enough to break through steel.  All that matters is that I get out of this metal tomb before time itself is over.  If I had known that this would be my fate, I would have never been so reckless.  Now I must wait until the Sun swallows this planet, or someone digs me out by folly.  If I am really lucky this putrid swamp, my subterranean home,  will be the future home of a suburban mega-mall.  That way at least excavators will surface me in the process of making space for a sub-level.  Of course with my luck, they’ll pave right over me to make the parking lot while all the real construction happens a hundred yards away.

Who would have thought there’d be such an intractable downside to immortality?  After what I’ve been through my body could survive a sub-orbital fall or even a nuclear blast, but something simple like getting buried leaves me impotent.  I guess  it’s such a simple solution to the problem of ‘How do you hurt a man who can’t die?’, you make him wish he could.

So I might have ruined my research partner’s life, but this is by far an overreaction.  We had been working together for a decade on real ground breaking genetics research, but couldn’t get government approval on human testing.  All of our animal subjects had recently been responding to the treatment brilliantly.  Their bodies became literally invulnerable to anything we could throw at them, their need for food and water became purely recreational.  We were onto a new form of life altogether!  However, our funding dried up overnight, an ethics panel at the University demanded the entire project be scrapped, and I really had no other options.  I had to demonstrate my life’s work no matter the cost, I had to live my research!

When the Provost interceded I was in no mood to negotiate, I stole everything: samples, access cards, lab documents, specimens–I cleaned the place out.  We were under strict orders to comply with University directives or both of us were going down, and neither of us could be considered to be on the administration’s good side.  Naturally when they realized what I had done, both of us were blackballed.  It didn’t matter that I worked alone, they were happy to have a reason to ruin his career, they had been looking for an excuse.  I should have cared more, but all that mattered was knowing…knowing what it felt like to be a super human.

After giving myself the treatment I became a daredevil, running through the street picking fights with anyone knowing they couldn’t harm me.  At one point I ran into traffic on the freeway just to cause a thirteen car pile-up.  Once the firefighters pried the dead metal off my body I bolted off crazed, refusing to answer for my insanity, pleased by my invulnerability.  Of course it didn’t take much time for me to become a wanted man, a thief, a menace to society, and an enemy to my academic community and the only man I would call my equal.

Unfortunately, the man I pissed off knew me better than I thought.  I suppose I had mused on my possible hideouts around town, not realizing he would remember any.  Who would have imagined I would actually try to find shelter in one?  While I rested in an abandoned brewery, my temporary home until I figured out an alias and fourth on my list, my partner intercepted me.  Knowing full well what I had become and losing his own grip on reality he hand-cuffed me in my sleep and secured me in a metal drum.  With enough struggle I would be able to free myself from the clumsy cuffs, but he managed to weld the drum shut.  No amount of struggle or contortion would get me out of this one.  It didn’t take long for him to roll the enclosure onto his truck and dump it in a swamp outside town.

I know the swamp will eventually dry up, but I will still be down here covered by layers of solid sediment.  In the pitch black of this drum I can do nothing more than wait.  I have found unlimited time, and need of nothing, but have lost all mobility.  I am helpless to do all but wait.  With hope my genes will revert to their natural state, and I will die an unremarkable death.

Drawing Lambdas

The other night I was watching a video on the lambda operator in Ruby 1.9.1, which can be spelled out or represented as ‘->’.  Coming from a C++ background and having dabbled in Scheme and Haskell ‘\’, I felt a little uncomfortable with this syntax.  So my first thought was that it’d be nice if you could knock off all of this ASCII tomfoolery and just use Unicode lambdas in code.  This leads to two problems though:

  • Unicode friendly keyboards don’t really exist
  • Source code tends to be strictly in ASCII

Typing Unicode

Let’s deal with the Unicode keyboard problem first…

One method to get lambdas into your code is to use digraphs in Vim.  To get a full list in the editor use the command ‘:digraph’, this will draw a matrix of every combination available.  As depicted in the screenshot below (scrolled down to Greek symbols) each column has the two-key combination, the symbol, and the Unicode decimal representation.  To insert a symbol use Ctrl-k and the two-key combination.  For example,  lambda is Ctrl-k l*.

What’s really nice about the digraphs table is that many combinations are easy to figure out.  For example, all Greek symbols are the Ctrl-k prefix and the letter followed by an asterisk.

Some are even obvious:

Diacritic Example Digraph
macron ē (letter + minus)
trema/umlaut ü (letter + colon)
cédille ç (letter + comma)
circumflex û (letter + caret)
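The decimal numbers in the digraph table are plain Unicode code points, so they can be cross-checked outside of Vim. A minimal sketch using Python 3's standard unicodedata module:

```python
import unicodedata

# Lambda plus the diacritic examples from the table above
for ch in "λēüçû":
    print(ch, ord(ch), "U+%04X" % ord(ch), unicodedata.name(ch))
```

The first column of numbers should match what :digraph shows next to each symbol.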

Unicode in Action

Now all this is fun to play with a bit, but your compiler/interpreter will likely be upset with your latest additions. The few exceptions will be languages that are very friendly to Unicode strings like Go and Python.


package main

import "fmt"

func main() {
    fmt.Printf("Γεια σας κόσμο!\n")
}

#!/usr/bin/env python

def main():
    print("Γεια σας κόσμο!\n")

if __name__ == '__main__':
    main()

However, when I said I wanted to see lambdas in my code I didn’t mean inside of string literals, I meant actual lambdas as keywords.  If I try to put a lambda symbol in place of the keyword ‘lambda’:

#!/usr/bin/env python

def main():
    # λ == lambda
    square = λ x : x * x
    nine = square(3)

if __name__ == '__main__':
    main()

I will get the following error:

File "", line 5
square = λ x : x * x
SyntaxError: invalid syntax
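The distinction is easy to check without saving a file. This sketch assumes Python 3, where λ is accepted as an ordinary identifier but still cannot stand in for the lambda keyword; the helper name compiles is made up for the example.

```python
def compiles(source):
    """Return True if the source string parses as Python, False otherwise."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles("λ = 3"))                      # λ used as a variable name
print(compiles("square = λ x : x * x"))       # λ used in place of the keyword
print(compiles("square = lambda x : x * x"))  # the real keyword
```

Only the middle line fails to parse: the symbol is treated as a name, never as a keyword.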

Faking It

This leads me to my second point; you don’t really want lambdas (or any non-ASCII) in your code, but you may want to see lambdas in your editor.  In Vim, you could apply the conceal patch, which is standard in recent versions of Vim 7.3.  Conceal allows you to create syntax rules for drawing specific keywords and string matches as any symbol you like.

For instance, the Vim command:

:syn keyword Operator lambda conceal cchar=λ

will swap out the keyword lambda with the symbol λ, but not modify the text.  Moving the cursor over the lambda symbol will reveal the letters for editing.  Your compiler never has to know you were experimenting with Unicode.
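To make the idea concrete, here is a rough Python imitation of what conceal does on screen: the displayed line is derived from the text, and the text itself is left alone. It only approximates Vim's keyword matching with a regex word boundary, and the function name conceal is just borrowed for the example.

```python
import re


def conceal(line, rules):
    """Return a display version of line with whole-word keywords replaced."""
    for keyword, symbol in rules.items():
        line = re.sub(r"\b%s\b" % re.escape(keyword), symbol, line)
    return line


source = "square = lambda x: x * x"
print(conceal(source, {"lambda": "λ"}))  # what the editor would draw
print(source)                            # what is actually in the file
```

Because only whole words match, an identifier such as lambdas is left untouched.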

.vimrc rules

There are already many script implementations that do this kind of thing, but I really like this solution for its ease of implementation.  I’ve got the following rules in my .vimrc, which I’m sure to tweak until I’m satisfied.

if has('conceal')
    if has('autocmd')
        autocmd Syntax * syn keyword Operator not conceal cchar=¬
        autocmd Syntax * syn keyword Operator lambda conceal cchar=λ
        autocmd Syntax ruby syn match rubyKeyword "->" conceal cchar=λ
        autocmd Syntax haskell syn match hsKeyword "\\" conceal cchar=λ
    endif
    hi! link Conceal Operator
    set conceallevel=2
endif

Essentially, there are four Vim syntax rules that don’t live in individual syntax files:

  • Draw not as ¬ for every file type.
  • Draw lambda as λ for every file type.
  • Draw -> as λ for all Ruby files.
  • Draw \ as λ for all Haskell files.

These rules are typically placed in syntax files, but I’m still test driving them and I don’t want to repeat the first two rules for every syntax file.  For the time being they can stay in my .vimrc where I can keep an eye on them.

Now Stretch

The whole point of going through these steps is attaining more flexibility.  Hopefully now you feel less restricted by your keyboard and your languages’ syntax.

If you feel Pascal’s not equal, <>, makes your skin crawl, you can change it with:

:syn match pascalSymbolOperator '<>' conceal cchar=≠

or to add to your .vimrc

autocmd Syntax pascal syn match pascalSymbolOperator '<>' conceal cchar=≠

For example:

while (a <> b) do WriteLn('Waiting');

is shown as

while (a ≠ b) do WriteLn('Waiting');

Everyone has completely different preferences; so if you don’t like something, change it.

The Wonderful World of tmux

tmux in my opinion is the best tool for interacting with the shell.  Having been a daily user of tmux for the last 2 years, I’ve put a lot of thought into customizing it to suit my needs.  As an ArchLinux user I found the tmux ArchWiki to be an excellent resource for picking up tips on using and customizing tmux.   My own .tmux.conf is available on GitHub, and for the remainder of this post I will be breaking my configuration down.

Preliminary (ditch ‘Caps Lock’)

This step isn’t necessary, but considering I use the Control key almost as often as the spacebar, I find it cuts down on the hand contortion.  I learned from using Emacs that if I’m going to tap the Control key all day long it needs to be on the home row.  There’s usually an option in the keyboard settings for your desktop environment to change the behavior of Caps Lock to be another Control key.  I have the following in my .Xmodmap file which loads when X Windows starts:

remove Lock = Caps_Lock
keycode 0x42 = Control_L
add Control = Control_L


The first thing that had to go was using Ctrl-b as the command key, the key combo that precedes every tmux command.  Even with the Control key on the home row, I find my hand a bit stretched for a gesture I will have to make every time I invoke tmux.  I suppose Ctrl-a and Ctrl-x were avoided because of GNU Screen and Emacs respectively, but I chose Ctrl-f since my left hand on the keyboard can make that combo nicely from a resting position.

# Change prefix key to Ctrl-f
unbind C-b
set -g prefix C-f

Next, prefix-d detaches the session by default. Considering my previous change I accidentally hit this combo too often, so I unbind(ed) it and type out the ‘detach’ command when I want to detach a session (which isn’t often).

# Remove shortcut for detach session
unbind d


As you may know one of the features of tmux is to create numbered windows with prefix-c, which can be navigated forward and backward with prefix-n and prefix-p respectively.  By default tmux starts numbering windows at zero, which is inconvenient because zero and one are far away from one another on the keyboard.  So cycling between windows 0, 1, 2, and 3 feels unnatural.  I resolve this by setting the base index to one instead of zero.

# Start numbering at 1 instead of 0
set -g base-index 1

I also find myself switching back and forth between two windows frequently, which prefix-l can do nicely.  However, using another key seems unnecessary when I can get the same effect by double-tapping the prefix command (for me Ctrl-f).

# Last active window
unbind l
bind C-f last-window

Likewise, prefix-& [Ctrl-f Shift-7] to kill a window seems very uncomfortable, which may be intentional to prevent accidentally closing windows.   Instead, I decided to use prefix-Ctrl-k for ergonomics.

# Kill window
bind C-k kill-window

Pane Management

To me the panes in tmux are the most fun to work with, especially because they don’t require finicking with the mouse to line up code in windowed terminals.  The default mappings could have been much simpler, though.  My thinking is that a vertical bar means split vertically, and a horizontal bar means split horizontally.  So I cheat a little by using the ‘-‘ and ‘\’ keys to represent ‘_’ and ‘|’, which become my new horizontal/vertical split commands.

# More straight forward key bindings for splitting panes
unbind %
bind \ split-window -h
unbind '"'
bind - split-window -v

To move around in tmux you can use:

  • prefix-o, to cycle through panes
  • prefix-[up, down, left, right], to select by direction
  • prefix-q and enter the pane number, to go directly to the numbered pane

As for resizing the panes I went along with vim’s navigational keys to indicate directions to ‘push’ the panes.  Therefore, left (prefix-h), right (prefix-l), up (prefix-k), down (prefix-j).  To move 5 times the distance of the previous commands hold shift for each.

# Pane
# Resize panes with vi-style direction keys
set-option -g mouse-select-pane off
bind h resize-pane -L
bind l resize-pane -R
bind k resize-pane -U
bind j resize-pane -D

# Shift plus the same keys resizes in steps of 5
bind H resize-pane -L 5
bind L resize-pane -R 5
bind K resize-pane -U 5
bind J resize-pane -D 5

Copy and Paste

One of the things I found completely baffling in GNU Screen was how they decided to map the copy/paste functions.  tmux did a better job with open/close square brackets, but I decided to give up and use prefix-Ctrl-c and prefix-Ctrl-v.  Otherwise, I feel like copy/paste in tmux and screen are neglected because the defaults  don’t make much sense.  Also adding in a binding for xclip makes copy and pasting in tmux more practical.

# Copy mode
unbind [
bind C-c copy-mode

# Paste mode
unbind ]
bind C-v paste-buffer

# Move tmux copy buffer into x clipboard
bind-key C-y save-buffer /tmp/tmux-buffer \; run-shell "cat /tmp/tmux-buffer | xclip"

As a vim user, I opted to have the mode keys follow the vi/vim conventions.  I suggest looking at the short table in the man page for tmux to see exactly what selecting vi or emacs for this option entails.  As for the mouse mode, tmux can acknowledge when the mouse buttons are depressed and allow for pane selection and resizing with the mouse.  I have this disabled though because I prefer keyboard navigation and it interferes with my selection/copy/paste operations too much.

# Use Vi mode
setw -g mode-keys vi
# Keep the mouse out of copy mode
setw -g mode-mouse off


The following configurations determine what your tmux environment will look like, specifically conventions for text colors and window titles.

From my screenshot you can see the decisions I’ve made with regard to colors.  There are three panes open in the active window and there are four windows (each running a different application).  The active window is red and has an asterisk, the previous window’s title has a hyphen at the end.  tmux has the option to indicate activity on each of the windows from the toolbar, but I have this disabled because some applications constantly write to the shell.

# Status Bar
set -g status-bg black
set -g status-fg white
set -g status-interval 1
set -g status-left '#[fg=green]#H#[default]'
set -g status-right '#[fg=blue,bold]%m-%d-%y #[fg=red,bold]--#[fg=white,bold]%I:%M:%S#[fg=red,bold]--#[default]'

# Notify if other windows have activity (disabled)
setw -g monitor-activity off
set -g visual-activity off

# Highlighting the active window in status bar
setw -g window-status-current-bg red

Obviously, if you ever intend to use the clock in tmux you can set the color and mode.

# Clock
setw -g clock-mode-colour red
setw -g clock-mode-style 12

I always make sure my history has plenty of lines, since you can usually spare the memory and you can search through the history by entering copy mode and pressing ‘/’ (if mode-keys is set to vi).

# History
set -g history-limit 100000

Lastly, remember to reload your .tmux.conf file after making changes.

# Reload the config file
bind r source-file ~/.tmux.conf


Every change I have made is a response to an itch I’ve experienced when using a default tmux setting.   Being able to go into the config and get some relief is a beautiful thing.  Of course every change I outlined works great for me and not necessarily anyone else, so keep tweaking your conf until you find inner peace.

Once again, my .tmux.conf file can be viewed and downloaded here.

Just Do It

I have a terrible habit of grabbing a technical book, reading the first two chapters, and then putting it down to read the first two chapters of another technical book. I’ve tried to keep myself on task with several tools; tasks on Google calendar,, and now

My latest book to finish is MongoDB in Action.  I’m a few pages into chapter 3 and I’m up to my old tricks again.  The only thing I can think to do is just keep going, and maybe write about it a bit to remind myself not to give in.  So I don’t forget, here are some reasons why I need to finish this book:

  • I need to do a better job finishing what I start, and now is when I need to do it.
  • It’s about time I learn the virtues of NoSQL.
  • MongoDB has an excellent Python driver, it’s a shame not to use it.
  • SQLite (my go-to small application database) isn’t designed for the web domain.
  • The book uses Ruby and JavaScript in examples, which I want to give more attention.
  • I am starting a web inventory project and these topics need to be learned.

Well unless I can train a capuchin monkey to bite me when I try to start reading something else, I will have to be accountable for myself.  Time to get back to reading…

Hurricane Sandy

Of course I don’t have much to say about the storm, other than I’m pleased that my utilities have stayed on and I haven’t seen any trees obliterated…so far.  However, it is worth noting that where I live is notoriously susceptible to flooding.  As of 9:30am the tidal flooding hit its peak, causing my apartment building to be almost completely surrounded by water.

I’m on the third floor, so the flood water isn’t doing me much harm, other than keeping me in place.  Meanwhile, as I watch the rain come down, trucks, jeeps, and vans drown out one by one in full view of my balcony.  It’s very unfortunate because it’s comparable to driving through a lake, and that is obvious from behind the wheel. Because I really think they are worth sharing, here are a couple of short videos of the flooding in my immediate vicinity.

Back parking lot:

Front of building, looking down on Haven Creek Boat Ramp:

Lastly, I’ve aggregated all of my photos into a public album on Google+ if anyone would like to see more pictures of my waterfront (and back) view.  Stay safe everyone.

Tiny Code

I just found out about this neat little subreddit,, and thought I should spread the word.  The idea is to share neat snippets of code that do impressive things with a focus on minimalism.  One thing I especially like is that the site isn’t merely code golf and obfuscation, but instead powerful solutions in relatively few lines of code.  The major benefit of keeping things small is that you’ve boiled a problem down to as few operations as possible.

As a matter of finding learning material, you usually only want the part of the solution you don’t yet understand and not a full-blown application with features currently over your head.  After all, when trying to work out the tiny but complicated and time-consuming portions of a project I’ll search through blog posts and Stack Overflow submissions until I piece together a solution suitable for my implementation.  If a snippet is too long it’s not likely it will work well in a general case.  If it’s too complicated to be readily understood it’ll get put on the back-burner until I find a better solution or give up and start digesting it.

Here’s a submission that won’t take long to wrap your head around: Four lines of Python for a Spellchecker

import sys
a = sys.argv[1:]
s = [x.lower()[:-1] for x in open("/usr/share/dict/words")]
print " ".join([("\x1b[31m"+w+"\x1b[0m" if w in [_ for _ in a if _ not in s] else w) for w in a])

$./ hello this woard is wrong
hello this woard is wrong

I will admit code golf can be pretty cool, this little gem emerged in the comments:

print(' '.join('! '[w+'\n'in open("/usr/share/dict/words")]+w for w in __import__('sys').argv[1:]))

$./ hello this woard is wrong
hello this !woard is wrong
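For comparison, here is the same idea unrolled into a few more lines of Python 3. It is only a sketch: the words go into a set so each lookup is cheap, and a tiny made-up word list stands in for /usr/share/dict/words so the example runs anywhere. The function names are my own.

```python
def misspelled(words, dictionary):
    """Return the words that do not appear in the dictionary lines."""
    known = {entry.strip().lower() for entry in dictionary}
    return [w for w in words if w.lower() not in known]


def highlight(words, dictionary, mark="!"):
    """Join the words back together, prefixing unknown ones with mark."""
    bad = set(misspelled(words, dictionary))
    return " ".join(mark + w if w in bad else w for w in words)


# Made-up stand-in for the lines of a dictionary file
WORDS = ["hello\n", "this\n", "is\n", "wrong\n"]

print(highlight("hello this woard is wrong".split(), WORDS))
```

Passing an open file handle for the dictionary works the same way, since the function only iterates over lines.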

As for actually creating a more sophisticated spell checker, I recommend Peter Norvig’s write-up, and‘s NLP class. As an added benefit I learned about the “words” package, which is nice because I always forget where I store my one time use dictionary files.

Venus in the Sky

After walking around for 20 minutes trying to make a pinhole camera with my index finger I decided to stop ruining my eyes and enjoy the sunset. After all, the really impressive imagery is being streamed live over the internet. On my way back to my apartment I noticed a small group of Jehovah’s Witnesses enjoying the transit outside of their Kingdom Hall. Since they had a couple of fancy reflector telescopes set up, I decided it would be worth it to introduce myself and get a look at their setup. Oddly enough no one responded to my introduction, so I just admired the projection of the transit as they spoke amongst themselves. Below is the photograph I took with my cellphone.

Transit reflected on projector screen.

The transit through a reflector telescope.

Overall, I’m really happy they were out observing so that I could get a view of the event directly. I also decided to take a photo of the sunset for reference.

Sunset during the transit.

Sunset during the transit with slight cloud cover.

Look out for the next transit of Venus in 2117!

PyGTK TreeView Rubber Banding


Enable multiple selections in a GTK TreeView using the mouse (click and drag) to select the desired nodes/rows.


For the most part this is a very simple and straightforward task; however, the first time I tried to do it I found the number of examples to be a bit sparse.  First off I’ll be using the TreeView widget example from  If you are new to TreeViews you should read the official documentation first.

The following file can be downloaded from here.

#!/usr/bin/env python

# example

import pygtk
pygtk.require('2.0')
import gtk

class BasicTreeViewExample:

    # close the window and quit
    def delete_event(self, widget, event, data=None):
        gtk.main_quit()
        return False

    def __init__(self):
        # Create a new window
        self.window = gtk.Window(gtk.WINDOW_TOPLEVEL)

        self.window.set_title("Basic TreeView Example")

        self.window.set_size_request(200, 200)

        self.window.connect("delete_event", self.delete_event)

        # create a TreeStore with one string column to use as the model
        self.treestore = gtk.TreeStore(str)

        # we'll add some data now - 4 rows with 3 child rows each
        for parent in range(4):
            piter = self.treestore.append(None, ['parent %i' % parent])
            for child in range(3):
                self.treestore.append(piter, ['child %i of parent %i' %
                                              (child, parent)])

        # create the TreeView using treestore
        self.treeview = gtk.TreeView(self.treestore)

        # create the TreeViewColumn to display the data
        self.tvcolumn = gtk.TreeViewColumn('Column 0')

        # add tvcolumn to treeview
        self.treeview.append_column(self.tvcolumn)

        # create a CellRendererText to render the data
        self.cell = gtk.CellRendererText()

        # add the cell to the tvcolumn and allow it to expand
        self.tvcolumn.pack_start(self.cell, True)

        # set the cell "text" attribute to column 0 - retrieve text
        # from that column in treestore
        self.tvcolumn.add_attribute(self.cell, 'text', 0)

        # make it searchable
        self.treeview.set_search_column(0)

        # Allow sorting on the column
        self.tvcolumn.set_sort_column_id(0)

        # Allow drag and drop reordering of rows
        self.treeview.set_reorderable(True)

        self.window.add(self.treeview)

        self.window.show_all()

def main():
    gtk.main()

if __name__ == "__main__":
    tvexample = BasicTreeViewExample()
    main()
When run, the result will look like this:

At this point clicking and dragging your mouse will only move nodes around and will not allow multiple selection.

In order to do rubber band selection you must:

1) Set the “rubber_banding” attribute to be true for the TreeView.

2) Get the TreeView’s TreeSelection object, and set the mode to gtk.SELECTION_MULTIPLE.

self.treeview.set_rubber_banding(True)
self.treeview_selection = self.treeview.get_selection()
self.treeview_selection.set_mode(gtk.SELECTION_MULTIPLE)

Adding the 3 lines above to the example code at line 39 will allow rubber band selection of a TreeStore:

The selection works just as well if the TreeView’s model is a GTK ListStore instead of a TreeStore.  The changes necessary to use a ListStore have been highlighted.

#!/usr/bin/env python

# example

import pygtk
pygtk.require('2.0')
import gtk

class BasicTreeViewExample:

    # close the window and quit
    def delete_event(self, widget, event, data=None):
        gtk.main_quit()
        return False

    def __init__(self):
        # Create a new window
        self.window = gtk.Window(gtk.WINDOW_TOPLEVEL)

        self.window.set_title("Basic TreeView Example")

        self.window.set_size_request(200, 200)

        self.window.connect("delete_event", self.delete_event)

        # create a ListStore with one string column to use as the model
        self.liststore = gtk.ListStore(str)

        # create the TreeView using liststore
        self.treeview = gtk.TreeView(self.liststore)

        # add rows to the liststore
        for i in range(8):
            iter = self.liststore.append()
            self.liststore.set(iter, 0, "item " + str(i))

        # add rubber-banding
        self.treeview.set_rubber_banding(True)
        self.treeview_selection = self.treeview.get_selection()
        self.treeview_selection.set_mode(gtk.SELECTION_MULTIPLE)

        # create the TreeViewColumn to display the data
        self.tvcolumn = gtk.TreeViewColumn('Column 0')

        # add tvcolumn to treeview
        self.treeview.append_column(self.tvcolumn)

        # create a CellRendererText to render the data
        self.cell = gtk.CellRendererText()

        # add the cell to the tvcolumn and allow it to expand
        self.tvcolumn.pack_start(self.cell, True)

        # set the cell "text" attribute to column 0 - retrieve text
        # from that column in liststore
        self.tvcolumn.add_attribute(self.cell, 'text', 0)

        # make it searchable
        self.treeview.set_search_column(0)

        # Allow sorting on the column
        self.tvcolumn.set_sort_column_id(0)

        # Allow drag and drop reordering of rows
        self.treeview.set_reorderable(True)

        self.window.add(self.treeview)

        self.window.show_all()

def main():
    gtk.main()

if __name__ == "__main__":
    tvexample = BasicTreeViewExample()
    main()

I hope this helps, enjoy your newfound functionality of drawing a shaded rectangle over your TreeView!
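Once several rows are selected, the natural next step is reading them back. A minimal sketch: the helper below relies on the TreeSelection method get_selected_rows(), which returns the model and a list of paths, and on indexing the model by path. The FakeSelection class is purely a stand-in so the snippet runs without GTK; with the example above you would pass self.treeview_selection instead.

```python
def selected_values(selection, column=0):
    """Return the values in the given column for every selected row."""
    model, paths = selection.get_selected_rows()
    return [model[path][column] for path in paths]


class FakeSelection:
    """Stand-in for a TreeSelection, only for trying the helper without GTK."""

    def __init__(self, model, paths):
        self.model = model
        self.paths = paths

    def get_selected_rows(self):
        return self.model, self.paths


# Made-up model keyed by path tuples, mimicking rows of a one-column ListStore
fake_model = {(0,): ["item 0"], (1,): ["item 1"], (2,): ["item 2"]}
print(selected_values(FakeSelection(fake_model, [(0,), (2,)])))
```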

13 Years of Reading (Stats)

My Grandmother, when she was alive, was a voracious reader. No matter how thick the tome she’d be through with it in a matter of hours.  Having struggled with reading in grade school I was skeptical of her pace; I still won’t finish a short story without at least spending some time to stop and digest the content. She was a happy speed reader, but I knew there was no way I would be too.  Out of curiosity I wanted to know how it affected her retention, so I quizzed her on the authors, titles, and subjects of past readings, but her recollection was very limited. Not only had she forgotten plots, but authors and titles as well. While it should have been no surprise to me that books she read over 40 years ago were long gone, it was also horrifying to think how a book I would literally spend months trying to finish could be forgotten to the point where the entire experience is lost.  Billy Collins, in his poem Forgetfulness, eloquently described how what I had just realized was an inevitability:

The name of the author is the first to go followed obediently by the title, the plot, the heartbreaking conclusion, the entire novel which suddenly becomes one you have never read, never even heard of, as if, one by one, the memories you used to harbor decided to retire to the southern hemisphere of the brain, to a little fishing village where there are no phones. Long ago you kissed the names of the nine Muses goodbye and watched the quadratic equation pack its bag, and even now as you memorize the order of the planets, something else is slipping away, a state flower perhaps, the address of an uncle, the capital of Paraguay. Whatever it is you are struggling to remember, it is not poised on the tip of your tongue, not even lurking in some obscure corner of your spleen. It has floated away down a dark mythological river whose name begins with an L as far as you can recall, well on your own way to oblivion where you will join those who have even forgotten how to swim and how to ride a bicycle. No wonder you rise in the middle of the night to look up the date of a famous battle in a book on war. No wonder the moon in the window seems to have drifted out of a love poem that you used to know by heart.

After the conversation with my Grandmother, I became paranoid about losing what I would invest so heavily in and started keeping records of the books I finished.  I made sure that just after reading the last page I would append some information about the book to a spreadsheet. So far I have been adding to it over the last 13 years, making it one of the few habits I keep.

I decided it’d be fun to visualize this small amount of data I’ve been slowly compiling, so I synced up my spreadsheet with Goodreads and exported a CSV to play with. Considering I wanted to delete most of the columns with no valuable information, I needed something quick and dirty to edit the sheet.  Out of curiosity I checked if there was a way to get Vim to parse a spreadsheet well enough, which led me to the csv.vim plugin.  After a quick install I browsed the sheet and decided I only wanted the dates I read each book, the page counts, and the publication dates.

In favor of doing things the “hard-way” I wrote this one-liner:

for year in {1999..2012}; do cat goodreads_truncated.csv | grep "$year/" | cut -f2 -d'"' | awk -v year=$year '{S += $1} {count += 1} END {print year "\t" count "\t" S}'; done

to give me this table:

1999 1  272
2000 6  1184
2001 15 3874
2002 25 7470
2003 19 4277
2004 31 6917
2005 7  2157
2006 6  1243
2007 3  1143
2008 5  1362
2009 1  450
2010 1  247
2011 14 3644
2012 5  1133

That is, I used the for loop to grep for each of the years from 1999 to 2012 (YYYY/MM/DD), used cut to return the field with the page numbers, passed the year variable into awk and had awk add all the pages by year, count the number of books per year, and format it into a table.
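The same aggregation reads a little more plainly in Python. This is only a sketch under assumptions: each row is taken to be a (date read, page count) pair with the date in YYYY/MM/DD form, and the sample rows are invented rather than taken from my export.

```python
from collections import defaultdict


def pages_by_year(rows):
    """Return {year: (book count, total pages)} from (date_read, pages) rows."""
    totals = defaultdict(lambda: [0, 0])
    for date_read, pages in rows:
        year = int(date_read.split("/")[0])
        totals[year][0] += 1      # one more book finished that year
        totals[year][1] += pages  # running page total for that year
    return {year: tuple(pair) for year, pair in sorted(totals.items())}


# Invented sample rows
rows = [("2001/03/14", 300), ("2001/11/02", 150), ("2002/06/30", 412)]
for year, (count, total) in pages_by_year(rows).items():
    print("%d\t%d\t%d" % (year, count, total))
```

Unlike the shell loop, this makes a single pass over the data and does not need the range of years up front.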

Lastly, I put the data onto a Google Docs spreadsheet (without 2012) to look at and got the following chart:

I used pages despite their inconsistency because they still reveal more than book count; consider Moby Dick versus The Importance of Being Earnest.  The red bars are the average pages per book, so the number of red bars that can be stacked up against a blue bar is the number of books read that year.  I also graphed out the publication dates against the read dates, but all it showed was I read mostly contemporary literature.
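The average pages per book can be recomputed straight from the table above. A short sketch using those figures, with 2012 left out to match the chart:

```python
# year: (books read, total pages), copied from the table above
TABLE = {
    1999: (1, 272), 2000: (6, 1184), 2001: (15, 3874), 2002: (25, 7470),
    2003: (19, 4277), 2004: (31, 6917), 2005: (7, 2157), 2006: (6, 1243),
    2007: (3, 1143), 2008: (5, 1362), 2009: (1, 450), 2010: (1, 247),
    2011: (14, 3644),
}


def average_pages(year):
    """Average pages per book for a given year in TABLE."""
    books, pages = TABLE[year]
    return pages / books


for year in sorted(TABLE):
    print(year, round(average_pages(year)))
```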

Being a slow and picky reader, I think my data set is rather small but it still reveals some interesting details over the last 13 years of my life.  I attended high school between 2000 and 2004 so I had books I needed to read as part of my course work, I was beginning to read seriously for the first time, and I spent a fair amount of time waiting on buses. I took AP English between 2003 and 2004, and college was the following year, so a lot was crammed in before I had to worry about college courses.  Between 2004 and 2008 my reading (free time) dropped off substantially.  From 2008 on I have been working full-time as a Software Engineer and spare time is more often spent on projects and technical books that don’t meet my cover to cover criteria for being added.  In November 2010 I purchased a Kindle and have been trying to read like I used to.

Hopefully, in another 13 years I’ll have more data to play with and a few better ideas of what to do with it.