10 days of personal analytics data

Last week I wrote about how impressed I was by Stephen Wolfram’s article on personal analytics, and I provided some stats on my work-related email. Wolfram’s article impressed me so much that I have started collecting additional personal analytics data: steps, using a FitBit activity monitor, and keystrokes, using some keylogging software. Both are physical activities (typing is physical) that I do throughout the day, so yesterday, having accumulated ten days’ worth of this data, I created a timeplot of both to begin to illustrate my physical patterns throughout the day.

steps and strokes.PNG

Blue dots represent steps (as measured by my FitBit device). Red dots represent my keystrokes at work only. With only 10 days of data, it is difficult to discern real patterns, other than the fact that I don’t tend to walk and type at the same time. But there are some obvious things that show up. For instance, every workday at 10am, I take a 15-minute walk around the block to give myself a break from the keyboard and clear my head. I didn’t take my walk on Monday, March 12, but I did the rest of the week, and you can see the small string of blue dots sandwiched between the red dots at 10am. Then, too, since the time has changed, we’ve resumed our evening walks around 6pm. You can also see, at 5pm, my walk over to the Little Man’s school to pick him up.

I am also typing at the keyboard for much of the workday. The gaps are either things like lunch breaks or meetings where I can’t really be typing. Once I’ve accumulated more data (several months at least) the patterns should become much more discernible.
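For anyone who wants to try something similar, here is a minimal Python sketch of the data preparation behind a timeplot like this one. It is an illustration rather than the exact code behind the chart: the function names `to_point` and `overlapping_minutes` and the sample timestamps are made up, and it assumes each series is a list of minute-resolution timestamps.

```python
from datetime import datetime


def to_point(timestamp):
    """Map a timestamp to (date, hour of day as a decimal) for a timeplot."""
    return timestamp.date(), timestamp.hour + timestamp.minute / 60


def overlapping_minutes(step_times, key_times):
    """Minutes that appear in both series, i.e. walking and typing at once."""
    return sorted(set(step_times) & set(key_times))


# Made-up sample data: two minutes of walking, two minutes of typing.
steps = [datetime(2012, 3, 13, 10, 0), datetime(2012, 3, 13, 10, 1)]
keys = [datetime(2012, 3, 13, 9, 58), datetime(2012, 3, 13, 10, 0)]

print([to_point(t) for t in steps])
print(overlapping_minutes(steps, keys))
```

Plotting the date on one axis and the decimal hour on the other gives one dot per active minute, and the overlap list is a quick check on whether walking and typing ever coincide.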

Why collect this data? Three reasons:

  1. It’s easy to do. It takes no additional effort on my part. My FitBit device collects and uploads my activity data without needing me to do anything. The same is true for my keylogger. So I’m not spending any additional effort, with the exception of preparing some data collection mechanisms, which was a one-time, early-on activity.
  2. I’m a data person. Much of my job deals with helping others answer questions through the use of data and data mashups, and I’ve learned that you can learn incredibly useful things from data if you pay attention to it.
  3. I find it interesting, looking at the patterns and seeing what’s there.

I noted that many of the comments on Stephen Wolfram’s post were from people asking how he collected all of his data. In his post, he mentioned that all of this data is collected through automated systems, and that is key. If you had to manually enter this information, it would be too cumbersome. But being able to collect it without thinking about it makes it much easier and more useful. Here is how I collect the data shown in the chart above.

Steps (Activity data)

As I mentioned, I use the FitBit Ultra device. I clip it on in the morning and forget about it. It syncs wirelessly and uploads its data to a website whenever I am home. The device captures my steps, floors climbed, calories burned, miles walked, etc. FitBit provides, for premium users, the ability to download your data to Excel. But the data download was not granular enough for me. They also provide an API through which you can programmatically access your data. One of the API function calls gives you access to your minute-by-minute data. I created a Google Spreadsheet to store the data, and a Google Code script to access the FitBit data through the API. Each morning, the script runs automatically and pulls down the previous day’s minute-by-minute step data. This data is added to the spreadsheet. I only pull down records where there were recorded steps, because the times that I wasn’t walking can be inferred. This saves space, and that is important. A Google Spreadsheet, at present, can contain 400,000 cells. It takes 3 cells to store a record, and I average about 200 minutes/day that I am walking, so that makes for 600 cells/day. If you do the math (400,000 / 600 is about 666 days), you can see that I can store a little under 2 years’ worth of step data in a Google Spreadsheet. Good enough for now, but I want to keep the data as compact as possible.
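To make the space-saving step and the capacity math concrete, here is a small Python sketch. It is not the script that runs against the spreadsheet: the `(time, steps)` pairs stand in for whatever the API returns, the three-column layout (date, time, steps) is an assumption, and the function names are hypothetical.

```python
from datetime import date

CELL_LIMIT = 400_000      # spreadsheet cell limit quoted above
CELLS_PER_RECORD = 3      # assumed layout: date, time, steps


def nonzero_rows(day, minute_data):
    """Keep only the minutes that have at least one recorded step."""
    return [(day.isoformat(), t, s) for t, s in minute_data if s > 0]


def days_of_capacity(active_minutes_per_day):
    """Whole days of data that fit before the cell limit is reached."""
    cells_per_day = CELLS_PER_RECORD * active_minutes_per_day
    return CELL_LIMIT // cells_per_day


# Made-up sample: four minutes of data, two of them with steps.
sample = [("09:59", 0), ("10:00", 87), ("10:01", 102), ("10:02", 0)]
rows = nonzero_rows(date(2012, 3, 13), sample)

print(rows)
print(days_of_capacity(200))
```

Storing all 1,440 minutes of every day instead would use 4,320 cells/day and fill the spreadsheet in about three months, which is why dropping the zero-step minutes matters.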


Keystrokes

On my Windows machine, I found some software that is not “spy” software. Much of the keylogging software out there is geared toward finding out what your spouse or kids are doing on the computer. I have no interest in that. What I wanted was something that counted keystrokes. keycounter on Windows does this. It produces one file for each day. The file contains 1,440 rows (1 for each minute), each with a count of the number of keys typed in that minute. There is no log of what was typed, just those counts, aggregated by minute.
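Here is a rough Python sketch of how a per-minute file like this could be read and scanned for gaps in typing. The `HH:MM,count` row format is an assumption (the real file layout may differ), and `read_minute_counts`, `typing_gaps`, and `fake_day` are names invented for the example.

```python
import csv


def read_minute_counts(lines):
    """Parse 'HH:MM,count' rows into a list of (minute_of_day, count)."""
    rows = []
    for time_text, count_text in csv.reader(lines):
        hours, minutes = time_text.split(":")
        rows.append((int(hours) * 60 + int(minutes), int(count_text)))
    return rows


def typing_gaps(rows, min_gap=30):
    """Return (start, end) minute ranges with no keystrokes for min_gap+ minutes."""
    gaps, start = [], None
    for minute, count in rows:
        if count == 0 and start is None:
            start = minute                      # a quiet stretch begins
        elif count > 0 and start is not None:
            if minute - start >= min_gap:       # quiet stretch was long enough
                gaps.append((start, minute - 1))
            start = None
    if start is not None and rows[-1][0] - start + 1 >= min_gap:
        gaps.append((start, rows[-1][0]))       # quiet stretch runs to end of day
    return gaps


def fake_day():
    """Made-up day: typing 9:00-12:00 and 13:00-17:00, silent otherwise."""
    lines = []
    for m in range(1440):
        typing = 540 <= m < 720 or 780 <= m < 1020
        lines.append(f"{m // 60:02d}:{m % 60:02d},{40 if typing else 0}")
    return lines


rows = read_minute_counts(fake_day())
print(len(rows))
print(typing_gaps(rows))
```

On the made-up day, the gaps that come back are the hours before work, the lunch hour, and the evening, which is the kind of pattern the red dots show in the chart.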

I have not yet found anything that does this on the Mac. I did find some software that counts keystrokes, and even tells you how frequently you use certain keys. But the counts are aggregated by day, not by minute, so it’s virtually impossible to tell when you were typing during the day. I found some source code for a keylogger that might do the trick, but it will require some modification, and I’m not sure I feel like spending the time jumping into Objective-C to do the work myself. I may keep looking around.

I’m considering what other data is out there that can be captured automatically and added to my collection of personal analytics data. There’s a lot of it, but you have to consider what is useful. It might be interesting, for instance, to grab and plot the timestamp for all of the pictures I’ve taken and see how they cluster. Better yet, grabbing the GPS data from the pictures will allow me to plot where I’ve been going back quite a long way.

In any case, I find this stuff fun and interesting, even insightful. I’m hoping Stephen Wolfram writes a follow-up to his article.


  1. You definitely are a data person, so I am not surprised you took up this challenge and looked at the data.

    I’ve been curious about these FitBit devices, since I’ve seen a number of people in my circles use them.

    1. Paul, what I really like about the FitBit is how unobtrusive it is. I really don’t have to think about it other than clipping it on in the morning. Getting the data is a bit more involved, but certainly doable if you have a programmer background. (That is, if you want the hardcore data like I want.)

  2. Hi!

    Is there any chance that you could publish the Google Code thing?
    I’d like to access my data as well, but have never worked with Google Code before.



