# Tag: personal analytics

I got my FitBit Ultra pedometer back on March 9, 2012. I’m interested in data and was inspired in part by Stephen Wolfram’s post on personal analytics. Since then, I’ve used my pedometer every day. It has become part of my routine, and I don’t even think about it. The pedometer collects minute-by-minute data, which can be accessed through FitBit’s API, and I have a Google Spreadsheet that captures this minute-by-minute data each day.

Not long ago, there was a kind of “how-to” post on the Wolfram blog that described how to analyze this kind of data using Mathematica. I followed their instructions and produced some interesting results for six months’ worth of pedometer data.

First, some basic totals for the six months from March 9, 2012 through September 12, 2012 (188 days, counting both endpoints):

• Total steps: 1,638,416
• Avg. steps/day: 8,715
• Total miles: 812
• Avg. miles/day: 4.3
• Total floors climbed: 3,277
• Avg. floors/day: 17.4
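As a quick check on the per-day figures, here is a minimal Python sketch that derives the day count from the two dates instead of hard-coding it, then recomputes the averages from the totals listed above. The variable and function names are my own for illustration; they are not part of the original Mathematica analysis.

```python
from datetime import date

# Totals as reported in the list above.
total_steps = 1_638_416
total_miles = 812
total_floors = 3_277

# Date range from the post; both endpoints are counted as full days.
start = date(2012, 3, 9)
end = date(2012, 9, 12)
days = (end - start).days + 1


def per_day(total, day_count):
    """Average of a running total over a number of days."""
    return total / day_count


avg_steps = round(per_day(total_steps, days))
avg_miles = round(per_day(total_miles, days), 1)
avg_floors = round(per_day(total_floors, days), 1)

print(days, avg_steps, avg_miles, avg_floors)
```

With these inputs the date arithmetic gives 188 days, and the three averages come out to 8,715 steps, 4.3 miles, and 17.4 floors per day, matching the list.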

Next, some bests and worsts:

• Most steps in a day: 17,944 on August 30, 2012 (while I was at Worldcon in Chicago)
• Fewest steps in a day: 2,023 on April 23, 2012

My FitBit Ultra captures steps in 1-minute intervals. The following charts are based on the 30,543 one-minute intervals over the last 6 months in which I took at least 1 step.

First up, here is what my cumulative daily profile looks like:

These are cumulative totals plotted over the intervals during the day in which they fell. From this, you can get an idea of my daily walking patterns. When I got my pedometer, I started a new habit of taking a 20-minute walk every weekday at 10am. It helps clear my head and ensures I get some fresh air during the day. I take this walk regardless of the weather and I think I’ve only skipped the walk two or three times in the last six months. That walk is what produces the spike at 10am. The spike at about 5pm is when I walk to the Little Man’s school after work to pick him up. His school is within walking distance and so it gives me a little more exercise throughout the day. You can also see from this data that I do almost no walking after about 10pm.
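For anyone who wants to build a similar profile outside Mathematica, here is a rough Python sketch of the idea: sum the steps from every day into time-of-day bins, so that a habit like a 10am walk piles up in the same bin. The sample readings, the 15-minute bin width, and the `daily_profile` name are all assumptions for illustration, not my actual data or the code from the Wolfram post.

```python
from collections import defaultdict
from datetime import datetime


def daily_profile(samples, bin_minutes=15):
    """Sum steps across all days into time-of-day bins.

    samples: iterable of (timestamp, steps) pairs, one per minute.
    Returns {minutes_since_midnight_of_bin_start: total_steps}.
    """
    profile = defaultdict(int)
    for ts, steps in samples:
        if steps <= 0:
            continue  # only minutes with at least one step count
        minute_of_day = ts.hour * 60 + ts.minute
        bin_start = (minute_of_day // bin_minutes) * bin_minutes
        profile[bin_start] += steps
    return dict(sorted(profile.items()))


# Made-up minute readings from two different days.
samples = [
    (datetime(2012, 3, 12, 10, 5), 110),
    (datetime(2012, 3, 12, 10, 6), 105),
    (datetime(2012, 3, 13, 10, 10), 98),
    (datetime(2012, 3, 13, 17, 2), 90),
    (datetime(2012, 3, 13, 23, 30), 0),
]

profile = daily_profile(samples)
for bin_start, total in profile.items():
    print(f"{bin_start // 60:02d}:{bin_start % 60:02d}  {total}")
```

Readings from different days that fall in the same 15-minute window are added together, which is why a repeated habit shows up as a spike.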

Next are my walking habits plotted over time. There are two charts that make up this data. The top chart shows the daily totals plotted over the last 6 months. The bottom chart shows the monthly averages. The September average is blank because the month is not yet complete:

The numbers on the y-axis are steps. There are places where the minute-by-minute data is missing because I didn’t sync up at night (for instance, when I was in Chicago for Chicon or on vacation in Maine), but the data is generally consistent. The monthly averages were trending down through the end of June and then started picking back up in July. Right now, they appear to be fairly level.
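The monthly averages can be sketched in a few lines of Python. This version averages over the days that actually have data (so a missed sync does not count as a zero) and leaves out any month that has not finished yet, which is the same reason September is blank in the chart. Apart from the August 30 figure, the daily totals below are placeholders, and `monthly_averages` is a name I made up for this example.

```python
import calendar
from collections import defaultdict
from datetime import date


def monthly_averages(daily_totals, as_of):
    """Average steps per recorded day for each completed month.

    daily_totals: {date: steps} for days that have data.
    as_of: the date of the analysis; months ending after it are skipped.
    """
    by_month = defaultdict(list)
    for day, steps in daily_totals.items():
        by_month[(day.year, day.month)].append(steps)

    averages = {}
    for (year, month), values in sorted(by_month.items()):
        last_day = date(year, month, calendar.monthrange(year, month)[1])
        if last_day > as_of:
            continue  # month still in progress, leave it blank
        averages[(year, month)] = sum(values) / len(values)
    return averages


daily_totals = {
    date(2012, 8, 30): 17_944,  # from the post
    date(2012, 8, 31): 8_000,   # placeholder value
    date(2012, 9, 1): 9_000,    # placeholder value
}

averages = monthly_averages(daily_totals, as_of=date(2012, 9, 12))
print(averages)
```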

I find this data interesting. I have similar data for keystrokes and some other activities that are easy to automate, but this is a fairly practical look at six months’ worth of physical activity. It will be interesting to see what the next six months look like.

So as a follow-up to my post yesterday, I took a look at the total number of keystrokes I’ve made on my work computer since I started keeping track of such things in early March. It turns out that I just passed (this morning) one million keystrokes since March 9. Just before I started writing this post, the total number of keystrokes on my work machine stood at 1,025,742. As I said yesterday, my key counter at work doesn’t keep track of the keys that I press, only that I pressed a key. So I can’t tell you what key is the most frequently pressed, or how often I use the backspace key. But a million keystrokes seems about right, especially when you consider this data is only captured five days a week. (I’m not typically typing on my work computer on the weekends.)

One thing I can do with my work keystroke data that I can’t do with the data from my home machine is produce a diurnal plot of my typing. That’s because my key counter at work keeps cumulative tallies of keystroke counts by minute. So since I started counting, here is my typing behavior. Each blue dot represents a time during which I was typing (how many characters I typed in that minute is not represented on this chart).

There are a number of cool things about a chart like this:

1. You can clearly see the weekend breaks: large vertical stripes after every 5 working days.
2. At about 10am each day, you can see a white horizontal stripe where I am not typing. This is when I am out taking my morning walk.
3. Other gaps represent either my lunch time, or times that I am in meetings and not typing on my work computer (I will often use my iPad in meetings).
4. I’ve been getting into the office earlier lately. You can see my typing for the day has been starting earlier and earlier since about mid-May.
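The data behind a diurnal plot like this is simple to prepare. Below is a minimal Python sketch, assuming the counter's log can be read as (timestamp, keystroke count) pairs per minute; that input format and the function names are my assumptions, since the actual export format of the work counter is not described here. Each active minute becomes one point (date on the x-axis, minutes since midnight on the y-axis), and a second helper finds the first active minute per day, which is one way to check the "starting earlier" observation.

```python
from datetime import datetime


def diurnal_points(minute_counts):
    """One (date, minutes_since_midnight) point per minute with typing."""
    return [
        (ts.date(), ts.hour * 60 + ts.minute)
        for ts, count in minute_counts
        if count > 0
    ]


def first_active_minute(points):
    """Earliest active minute for each day that has any points."""
    first = {}
    for day, minute in points:
        if day not in first or minute < first[day]:
            first[day] = minute
    return first


# Made-up log entries for two workdays.
minute_counts = [
    (datetime(2012, 5, 14, 8, 45), 120),
    (datetime(2012, 5, 14, 10, 5), 0),    # out on the morning walk
    (datetime(2012, 5, 14, 13, 30), 210),
    (datetime(2012, 5, 15, 8, 20), 95),
]

points = diurnal_points(minute_counts)
starts = first_active_minute(points)
for day, minute in sorted(starts.items()):
    print(day, f"{minute // 60:02d}:{minute % 60:02d}")
```

Feeding `points` to any scatter-plot tool gives the same kind of picture: idle minutes and weekends simply have no dots, which is where the white stripes come from.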

The nice thing about this data, like the complementary data at home, is that I don’t have to think about it or do anything to collect it. It happens behind the scenes without my having to do anything (well, except for the typing that I’d normally be doing anyway).

Today I passed the half-million keystroke mark on my home computer. I started using a keystroke counter back on March 9, when I became interested in personal analytics. I have a keystroke counter on my work computer also, but this half-million keystroke milestone represents just my typing at home and just on my laptop. I don’t have a keystroke counter on my iPad, where I do a fair amount of my writing when away from my computer.

Still, a half-million keystrokes in just under 3 months is about 2 million keystrokes a year on my home computer.
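The back-of-the-envelope math looks like this in Python. The 90-day span is an assumption standing in for "just under 3 months"; the exact number of days is not given above.

```python
def annualized(total, days_elapsed, days_per_year=365):
    """Scale a running total to a yearly rate."""
    return total / days_elapsed * days_per_year


keystrokes_so_far = 500_000
days_elapsed = 90  # assumption: roughly "just under 3 months"

yearly_estimate = annualized(keystrokes_so_far, days_elapsed)
print(f"{yearly_estimate:,.0f} keystrokes per year")
```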

If you eliminate the spacebar, my most used key is the E key, which makes sense. I’m a little surprised my typing speed is so low, but then I remember that this counts every single keystroke no matter what I am doing.

When I checked after my first month or so of using the logger, I discovered that I used the backspace key about 7% of the time. This has improved somewhat to a mere 2.7% of the time.
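Figures like "most used key" and "backspace share" fall out of a per-key tally. Here is a small Python sketch, assuming the logger's data can be exported as a key-to-count mapping; the counts below are invented (chosen so backspace lands at 2.7%), and I am assuming the percentage is taken over all keystrokes, spacebar included.

```python
def key_shares(counts, exclude=()):
    """Fraction of keystrokes per key, ignoring any excluded keys."""
    kept = {key: n for key, n in counts.items() if key not in exclude}
    total = sum(kept.values())
    return {key: n / total for key, n in kept.items()}


# Invented tallies for illustration only.
counts = {
    "space": 3000,
    "e": 2500,
    "t": 2230,
    "a": 2000,
    "backspace": 270,
}

# Most used key once the spacebar is set aside.
without_space = key_shares(counts, exclude=("space",))
most_used = max(without_space, key=without_space.get)

# Backspace as a share of every keystroke.
backspace_share = key_shares(counts)["backspace"]

print(most_used, f"{backspace_share:.1%}")
```

The same `without_space` shares could drive a keyboard heatmap by mapping each fraction to a color.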

Here is a “heatmap” of my keystrokes on my home computer (the redder the key, the more used it is). I’ve eliminated the spacebar from the results.

Finally, if anyone is curious what a “typical” month looks like, here is a chart for the month of May. Keep in mind that I haven’t been doing much fiction-writing lately, and that is reflected in the relatively low numbers:

I use a different keystroke logger at work because I use a Windows machine. The one at work does not tell me which keys I press; it keeps a cumulative total. It does, however, give me minute-by-minute data so that I can produce a diurnal plot of when I tend to be typing.

And since I imagine I’ll get asked the question, the keystroke counter I use on the Mac is the only one I could find in the App Store: TypingStats. It has worked out fine for me since I started using it.

Last month, I posted some analytics on my behavior over the course of 12 years of work emails that I’ve sent. At the time, I had data for just my work email and just for sent messages. I wanted to look at my personal email activity as well, but there were two things preventing me:

1. I didn’t have the time.
2. I wanted to do it all in Mathematica, which I am slowly teaching myself from scratch.

Well, I found a little bit of time, and I only required a little time because Paul-Jean Letourneau, a lead developer at WolframAlpha, wrote a post on how to use Mathematica to do the kind of email analytics that Stephen Wolfram posted about last month. This was great because the post contained all of the code needed to do this kind of analysis, and all I need to quickly learn a new system is some good code examples. The code provided worked almost without change on my own MacBook instance of Mathematica. I had to make a few minor changes to get the mailboxes I wanted. And I had to add the following line to increase the heap space for Java:

`ReinstallJava[CommandLine -> "java", JVMArguments -> "-Xmx3024m"]`

Without that line, the code executed fine for sent mail, but ultimately resulted in an out-of-memory error for incoming mail.

The resulting data is a pretty good look at my personal email use over the last 7 years. We’ll start with email that I’ve sent. This goes back only to 2009 because that is when I switched from Panix to Gmail. The code looked at my Sent Mail folder in Gmail and looked for email sent from my Gmail address. I had years of imported mail from Panix, but the sent messages are from a different email address and I decided not to move things around or change the code to include these. It’s still 3 years of sent mail data, which is good enough for some analysis. Here is the diurnal plot of my sent email, a total of 4,382 messages:
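For readers without Mathematica, the same kind of plot data can be approximated with Python's standard library. This is only a sketch of the idea, not the code from the Wolfram post: it parses raw messages, keeps the ones sent from a given address, and turns each `Date` header into a (date, minutes since midnight) point. The addresses and messages below are made up, and the times are left in whatever timezone each header carries.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def sent_points(raw_messages, sender):
    """(date, minutes_since_midnight) for each message sent by `sender`."""
    points = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        if sender not in msg.get("From", ""):
            continue  # skip mail sent from other addresses
        date_header = msg.get("Date")
        if date_header is None:
            continue
        try:
            sent = parsedate_to_datetime(date_header)
        except (TypeError, ValueError):
            continue  # unparseable Date header
        points.append((sent.date(), sent.hour * 60 + sent.minute))
    return points


# Made-up messages standing in for a Sent Mail folder.
raw_messages = [
    "From: me@example.com\nDate: Mon, 02 Apr 2012 21:15:00 -0400\n"
    "Subject: hello\n\nbody",
    "From: someone.else@example.com\nDate: Tue, 03 Apr 2012 09:00:00 -0400\n"
    "Subject: not mine\n\nbody",
]

email_points = sent_points(raw_messages, sender="me@example.com")
print(email_points)
```

For a real mailbox, the standard `mailbox` module can iterate over an exported mbox file and hand each message to the same logic.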