Website outage and recovery (and kudos to IDrive)

So if you have been following along in the last 24 hours, you’ll know that this site was down for most of yesterday. It went down just before 11am. As of 2:30am EDT on July 1, the site is back up and by my estimates, 99.9% restored.

The specifics of what happened are not completely clear. What I do know is that the problem was with my host's MySQL servers. This is a self-installed, WordPress-based blog, and it uses MySQL as its database. Something in MySQL failed. My first indication was when I jumped to the site just before 11am and found that the most recent post was from June 15, some 15 days earlier! I got a little panicky, as you might imagine. Then I noticed that the site itself began to fail, likely because MySQL was failing at the host. After several hours, the host sent out a message to affected customers, myself included, letting us know that there were in fact problems with sites using MySQL. They estimated 12-18 hours to fix the problem. Given that time frame, I know from experience that the fix is usually a restore from backup (often from tape) of a known-good configuration.

Still somewhat stressed, I woke up around 1:30am to discover the site was working again. The host indicated everything was fixed. However, I was still missing 15 days' worth of data on the blog.

Fortunately, in addition to using IDrive to back up all our home computers to the cloud, I also use the IDrive plugin for WordPress to back up this site. The backup includes a dump of the database. So I've spent the last hour restoring the site from IDrive (a very simple procedure) and then going through the SQL dump file and picking out only those last 15 days' worth of posts, tags, comments, etc. to restore to the site. That part was a little trickier, but only because I was not doing a full restore; I was restoring only certain rows in the database. It took me about an hour, and as you can see, things are back to normal here.
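For anyone curious what a selective restore like this can look like, here is a minimal sketch in Python. It is not the exact procedure I followed. It assumes the backup dump has already been loaded into a scratch database, and it uses in-memory SQLite databases as stand-ins for MySQL so that it runs anywhere. The table is a trimmed-down version of WordPress's wp_posts; the function names, sample rows, and dates are made up for illustration.

```python
import sqlite3

# Trimmed-down stand-in for WordPress's wp_posts table (the real one has more columns).
SCHEMA = """
CREATE TABLE wp_posts (
    ID INTEGER PRIMARY KEY,
    post_date TEXT,
    post_title TEXT,
    post_content TEXT
)
"""


def make_db(rows):
    """Create an in-memory database holding the given sample posts (hypothetical helper)."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    db.executemany("INSERT INTO wp_posts VALUES (?, ?, ?, ?)", rows)
    db.commit()
    return db


def copy_recent_posts(backup, live, cutoff):
    """Copy posts dated on or after `cutoff` from backup to live.

    Rows whose ID already exists in the live database are left untouched.
    Returns the number of rows actually added.
    """
    before = live.execute("SELECT COUNT(*) FROM wp_posts").fetchone()[0]
    rows = backup.execute(
        "SELECT ID, post_date, post_title, post_content "
        "FROM wp_posts WHERE post_date >= ?",
        (cutoff,),
    ).fetchall()
    # SQLite's INSERT OR IGNORE skips rows that would collide on the primary key;
    # MySQL spells the same idea INSERT IGNORE.
    live.executemany(
        "INSERT OR IGNORE INTO wp_posts (ID, post_date, post_title, post_content) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    live.commit()
    after = live.execute("SELECT COUNT(*) FROM wp_posts").fetchone()[0]
    return after - before


if __name__ == "__main__":
    # Made-up sample data: the backup has three posts, the rolled-back site only one.
    backup_db = make_db([
        (1, "2020-06-10 09:00:00", "Older post", "..."),
        (2, "2020-06-20 09:00:00", "Missing post", "..."),
        (3, "2020-06-28 09:00:00", "Another missing post", "..."),
    ])
    live_db = make_db([(1, "2020-06-10 09:00:00", "Older post", "...")])
    restored = copy_recent_posts(backup_db, live_db, "2020-06-15 00:00:00")
    print("posts restored:", restored)
```

Comparing the dates as plain strings works here because they are in year-month-day order. A real restore would also have to repeat the same step for comments, tags, and the tables that link them to posts.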

Almost. I said that I think the restore was 99.9% successful. It is possible that very recent comments on the site have not been restored. These would have been comments made within 24 hours of the failure, and they would have been missing from the last backup. However, I am not certain that any comments are missing. If you happen to notice that one of your comments is gone from a day ago, let me know. I apologize for that.

Things happen. In my day job as an application developer, I see servers go down from time to time. Even in the best of circumstances, you can't have perfect reliability or uptime. My experience with this host has been pretty good overall. And I learned my lesson about backups a long time ago. I realize that many people do back up the data on their computers, but I wonder how many people back up their websites. Fortunately, I do, and I'd highly recommend IDrive for both home computers and WordPress websites. It really saved my bacon today.


