Going Paperless: Scanning to Evernote, Revisited

One of the sets of questions I get asked with a fair amount of regularity has to do with what settings I use to scan documents into Evernote. Do I prefer PDF or JPG? What resolution do I use? Do I prefer one page per note or a multi-page scan? So I thought I’d use this week’s post to talk about my own scanning preferences and settings, and to provide a little insight into how much scanning I actually do these days.

My home office scanner: Fujitsu ScanSnap s1300i

To set the baseline, my primary scanner is the Fujitsu ScanSnap s1300i, which I have been using ever since mid-2012. The scanner is connected to my iMac, and configured to scan directly into Evernote at the push of the scan button on the device. I have not looked at any other desktop scanners since getting this one because this one does everything that I require of a scanner. It performs duplex scanning in a single pass, scans in color and at high resolution, has excellent paper-feeding, and seamlessly integrates with Evernote. I’m sure there are lots of scanners out there that do the same. The Fujitsu just happened to be the most recent one that I tried and when I found that it did everything I needed, I felt no need to keep looking.

My scanning requirements

As I often do in these posts, I will list my requirements for scanning because they play an important role in the settings I use on my ScanSnap. I think it is an important exercise to consider your requirements before making these kinds of decisions because your needs help shape those decisions. In rough terms, here were my requirements for scanning:

  • Had a need to scan 10-20 pages per day, initially.
  • Did not want to have to feed pages individually: scanner must have a page feeder.
  • Did not want to have to re-feed pages often: scanner must have a reliable page feeder.
  • Needed to be able to scan both sides of a page.
  • Needed to be able to scan pages quickly.
  • Needed to be able to scan to PDF format.
  • Needed to be able to scan directly into Evernote.
  • Needed my scans to be searchable once they were in Evernote.

Obviously, the requirements should help drive the decision for the device you choose, but I’ve found that many of the scanners available today can perform most of these functions. Meeting these requirements is really more a matter of fine-tuning the settings of the scanner and the scanning software.

My scanning settings: some tips for scanning into Evernote

For the most part, I use the default settings that came with my Fujitsu scanner. I made only a few minor modifications to those defaults to meet my own requirements. The Fujitsu ScanSnap s1300i has a page feeder that can hold something like 12-15 pages at a time, and has never given me any trouble. The pages always feed smoothly and I can’t think of a single occasion upon which I have had to re-feed a page. It also does duplex scanning, and will skip blank pages, which is nice.

One thing I’ve noticed is that it sometimes scans a blank page anyway, either because of light marks that show up on the page, or because the paper is thin and the text from the printed side bleeds through. But this doesn’t really bother me. It doesn’t affect my searching of the document. I never print so it doesn’t waste paper. And if I really want to get rid of that extra page, I can open the PDF in a PDF editor, like Adobe Acrobat, remove the page, resave the document and add it back to my note in Evernote.

Let me walk through some of the settings of my ScanSnap so you can see for yourself how I configure things to meet my requirements.

First and foremost, my ScanSnap is set to scan directly to Evernote. This profile is tied to the button on the scanner so that when I hit that button and initiate a scan, the resulting scan goes into a note in Evernote:

Scanner1

The scanning software has the ability to make a searchable PDF. In other words, at the time I scan the document, the scanning software performs some OCR on the scan and embeds the search text within the PDF file. I have deliberately turned this option off:

scanner2

I have turned this off for two reasons:

  1. When this option is turned on, it takes a lot longer to complete the scan. This is because two separate operations have to be performed. First, the document has to be scanned. Then, the document has to be processed for searches and that latter operation can take a little while. My requirement is to scan as quickly as possible, so I turn this off.
  2. Evernote does this for me. When I scan a PDF to Evernote, it automatically makes it searchable, and it does so while it’s sitting on the Evernote servers, essentially performing the task somewhere other than my machine, so that I am free to move on to the next scan. It does mean that there is usually a lag of a few minutes before the search data is downloaded to my machine, but so far, I have never scanned in a document and then needed to perform a search on it that instant.

Note that this option was turned on by default on my scanner so I had to go into the settings (see image above) and turn it off.

Regarding the speed of the scan, the resolution at which a document is scanned can make a difference. I have found that I don’t really need high-resolution scans for my purposes. I am scanning so that I can get rid of paper, not produce more, so I virtually never print anything I scan. I have found that the default settings for resolution and DPI are fine for my needs, including the OCR that Evernote performs. So I have left these settings as the defaults:

scanner3

I almost never scan to JPG. 99.99% of everything I’ve scanned into Evernote has been in PDF. That is a personal preference, of course. A lot of people scan old photos, but I haven’t scanned many old photos. Most of what I scan are documents and I’ve found the PDF to be the most convenient format for documents. My settings for the document, therefore, look like this:

scanner4

Again, note that I have disabled OCR on the local scanning to help speed up the scanning. But once again, I rely on Evernote to make the PDF searchable, and usually within a few minutes, I see a notification alerting me that the search data has been downloaded for the PDF.

I scan a document at a time, regardless of how many pages that document is, and I like the results to be compiled as a single PDF per document. If I have a 10-page document, I want a single, 10-page PDF, not ten 1-page PDFs. So I’ve adjusted my settings accordingly:

scanner6

Here are some of the miscellaneous settings I use for my scanning:

scanner5

The effect of going paperless on my scanning habits

One side-effect of going paperless, for me at least, has been a general reduction in my need for scanning in the first place. Two years ago, when I started going paperless at home, I developed a daily process to ensure I did my scanning each day, usually after I picked up the day’s mail. Back then I estimate that I’d scan anywhere from 10-20 pages of new paper each day. (I still have not gone back to scan my file cabinet of old paper because I never once have needed it.)

Today, I find that if I scan 10-20 pages in a week, it is a lot. I suspect this is a result of becoming more selective about what I scan, based on what I use and I need access to within Evernote. I also suspect it is a result of my deliberate efforts to reduce incoming paper by signing up for electronic statements, and doing a lot more through email and other electronic means. I still get paper, but not nearly as much as I received two years ago.

The result is that I spend substantially less time scanning than I used to, which tells me that I am doing something right.

If you have tips for your own scanning, settings, suggestions, hints, let’s hear them in the comments.


If you have a suggestion for a future Going Paperless post, let me know. Send it to me at feedback [at] jamietoddrubin.com. As always, this post and all of my Going Paperless posts are also available on Pinterest.

25 comments

  1. Jamie,
    This is good stuff. I have been using the Doxie Go scanner and because one of my key requirements is minimal space, it can be tucked away in a drawer until I need it for some scanning.

    My question for you stems more from what do you do with the paper AFTER you have scanned it. Do you automatically shred/recycle/dispose of it immediately? Do you keep it for some period of time before throwing it away? Do you scan everything then make a second determination on what to keep (i.e. Tax info, legal docs, etc.)?

    Any insights “post-scan” would be really helpful.

    I’d also be interested in your thoughts on going paperless as part of a larger family unit. For example, how does your wife participate (or not) in your paperless lifestyle? How do you manage any differences between you two (if there are differences). Also sounds like your kids are on the younger side, but an interesting discussion would be on small ways to get kids thinking about the preservation/elimination of paper.

    Sorry for the long response, but I enjoy reading your posts and thought I would offer up some possible material for future content.

    Tom

    1. Tom, thanks for the kind words. With very few exceptions, I shred and recycle everything that gets scanned in. In some instances, I’ll save the paper for my kids to use for coloring. Mostly, it gets shredded and recycled. I can probably count on one hand the things I actually keep in hard copy, and I am always looking for reasons to get rid of those as well. A post elaborating on this in more detail is probably worthwhile, though. I can provide more specific examples.

      Regarding the larger family unit–that is a great question, and I’ve added it to my list to cover in a future post.

  2. What do you do about tagging? Do you not tag at all when you scan, letting Evernote search? It seems like that would slow down finding documents — say if you are looking for your receipts for a particular expense report, or for a particular credit card statement.

    1. Bill, I am not a big tagger (see my much more detailed thoughts on the subject) on documents I scan. I do rely heavily on Evernote searches. Maybe I am lucky but I have yet to have Evernote fail me in this respect. Even with untagged, scanned documents, I can usually use Evernote’s native search capabilities to find what I am looking for within a few seconds.

      One key to this is a habit that I have of modifying the create date of a document I scan to the date on the document itself. So if I scan a story contract on April 30 and the date on the contract is April 15, I will change the create date of the note to April 15. When I am talking with a magazine and they say, “Oh, that was in the contract dated April 15,” I can pull it up easily by jumping to the notes for that day. Same is true for receipts and other statements. Most scanned documents have a date and I find the dates much more useful than tagging.

  3. Hi Jamie,

    I really enjoy your paperless posts and your blog as well!

    Quick question about your digital file organization outside of Evernote… once you have scanned paper to PDF and imported to Evernote, do you keep the PDF files in a local desktop folder? Do you rename them after scanning? How do you organize them?

    I use Evernote a lot, following a lot of your scanning tips, but I also often upload to Evernote from my phone or iPad or via emailing attachments into Evernote, so the ‘origination files’ are often scattered all over and poorly organized, not often reliably backed up. Should I worry about it or not really since it’s all in Evernote (and my hard drive, including the Evernote databases) are backed up regularly with Crashplan. Thanks for your help!

  4. Jamie, I think I’ve become an Evernote enthusiast in part due to your work. I recently sought your posts out again as part of a GTD refresh. These tips are really helpful. I’ve now got my scanner to save files to a scanned Dropbox folder and love that they get imported into EN automatically. Thank you for sharing your process!

  5. Hello Jamie

    how would you apply this if you wanted to scan a page in a book? The scanner is nice and compact but how would it do a page or so out of a book that I wanted for a note.

    1. Stan, I think I’d need a different scanner if I was scanning books. So far, in my experience, I haven’t run into the need for scanning from a book. I may want to grab a page here or there, and in that case I just use the Evernote app on my iPhone to take a photo.

  6. When you have your bills come electronically, just curious….do you let the billing company keep the invoice on their website (which is what I’ve been doing) or do you download each invoice into your Evernote? In my business we deal with a lot of utility bills and I’ve just left them on the utility’s website but wonder what you do. Thanks! Totally enjoy your posts!

    1. Jilly, thanks for the kind words. Yes, most banks and other companies maintain the electronic versions on their sites and that works for me. For some bills, I may go in once a quarter and download them. Evernote has a partnership with a product called FileThis Fetch that automates this process. I haven’t used it because I haven’t really had the need, but I’ve heard good things about it.

  7. I recently started scanning a bunch of stuff. There is a setting in HP Solution scanner, for most HP scanners, to scan to a particular directory. Then I have Evernote grab whatever is in that directory and add it to Evernote, then delete the file. This works pretty well: it’s direct to Evernote with the HP Multifunction printer. The only drawbacks are that all scans are called scan0001.pdf until I rename them in Evernote, and my HP scanner doesn’t do duplex scanning (but little of what I scan is double sided). I use FileThis Fetch to automatically load bills and brokerage and bank statements. That’s as beneficial to me as scanning, and only $20 per year, so well worth it.

    It was a little convoluted to get the HP scanner working, but I didn’t want to get another scanner, but I really wanted direct scanning to Evernote, which I do have now, so that’s good.

    I too am a little confused if I should keep the hard copies or not. I probably keep more than I should. I’ll sometimes add tags, but the search function is good enough usually.

    1. My Brother scanner works the same way. I am curious if ScanSnap or other direct-to-Evernote scanners do better at naming the files when they are imported, or if they, too, have unusable file names.
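
For watch-folder setups like the two described above, one possible workaround for the generic scan0001.pdf names is to rename each file before Evernote imports it. The following is a minimal Python sketch, not a feature of any scanner software. It assumes the scanner saves into a separate staging folder and that Evernote watches a second folder; both folder paths, the `scan` prefix, and the function name `stage_to_watch` are hypothetical. Each file is named after its modification time, which is only a stand-in for the scan time.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import List


def stage_to_watch(staging: Path, watch: Path, prefix: str = "scan") -> List[Path]:
    """Move prefix*.pdf files from staging into watch, renamed by timestamp.

    Both folders are assumptions for illustration: the scanner is assumed to
    write into `staging`, and Evernote is assumed to import from `watch`.
    """
    moved = []
    # sorted() materializes the listing so renames cannot affect iteration.
    for pdf in sorted(staging.glob(f"{prefix}*.pdf")):
        mtime = datetime.fromtimestamp(pdf.stat().st_mtime)
        stamp = mtime.strftime("%Y-%m-%d_%H%M%S")
        target = watch / f"{stamp}.pdf"
        counter = 1
        # Avoid overwriting an earlier scan that has the same timestamp.
        while target.exists():
            target = watch / f"{stamp}_{counter}.pdf"
            counter += 1
        shutil.move(str(pdf), str(target))
        moved.append(target)
    return moved
```

Moving files in from a staging folder, rather than renaming them in place, is meant to avoid racing with the import, since the watched folder is emptied as soon as Evernote picks a file up.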

  8. I take a different position on searchable PDF’s. I always let ScanSnap do the OCR for all my scanned documents. The extra time is not a problem for me. And there are some benefits that outweigh the extra time.

    1.) Exported PDFs (I don’t do often, but if Evernote goes out of business…):
    ScanSnap: The PDF document remains OCR’d if I export it from Evernote.
    Evernote: The PDF document loses its OCR if I export it from Evernote.

    2.) Consistency:
    ScanSnap: The search results are consistent in Evernote, whether I view them from my desktop client or the Evernote web.
    Evernote: The search results are not consistent because Evernote uses different OCR software depending on the platform.

    3.) 100% OCR:
    ScanSnap: Works on notes that are stored in my local non-sync’d Evernote notebooks.
    Evernote: Evernote cannot see my notes on my local non-sync’d notebooks, so the PDF’s cannot be OCR’d.

    4.) No rules:
    ScanSnap: OCR’s all my PDF’s – no rules and I know it is done.
    Evernote: Evernote has 5 complex technical rules to follow and no warning if the document fails to meet all the rules

  9. Not all documents will be OCR’d by Evernote and some of their rules are difficult to understand as shown below.

    The Evernote processor will reject the PDF if any of the following conditions are met:

    1. The PDF contains more than 100 pages

    2. The PDF file is more than 25MB

    3. The PDF does not contain at least one “scanned” page, defined as:

    * A “scanned” page contains at least 1025 pixels of image data
    * A “scanned” page contains no more than 512 characters of regular, searchable text (e.g. this is enough for a text-based fax header or similar). PDF files that have already been processed by a separate OCR system will not satisfy this condition and will be rejected.

    4. The PDF contains no more than one non-scanned page. (i.e. the doc may have one “cover” page without any image data, but if there’s more than one, then it’s not a real scan and we reject it.)

    5. The analysis crashes or fails for some technical reason, typically due to a malformed PDF from some crazy source, or if the PDF is password protected (encrypted).

    6. This analysis process takes more than 30 seconds to complete.
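
The checks listed above can be sketched as a small Python function. This is only an illustration of the rules as quoted in this comment, not Evernote's actual code. The `Page` class and `rejection_reason` function are hypothetical names, and the per-page pixel and character counts are assumed to be known already. Rule 4 is read the way its parenthetical explains it (reject when there is more than one non-scanned page). The crash case in rule 5 and the 30-second limit in rule 6 are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    """Hypothetical per-page summary; not an Evernote data structure."""
    image_pixels: int  # pixels of image data on the page
    text_chars: int    # characters of regular, searchable text


def is_scanned(page: Page) -> bool:
    # Rule 3: at least 1025 pixels of image data and
    # no more than 512 characters of searchable text.
    return page.image_pixels >= 1025 and page.text_chars <= 512


def rejection_reason(pages: List[Page], size_mb: float,
                     encrypted: bool = False) -> Optional[str]:
    """Return why a PDF would be skipped for OCR, or None if it passes."""
    if encrypted:
        return "password protected"              # part of rule 5
    if len(pages) > 100:
        return "more than 100 pages"             # rule 1
    if size_mb > 25:
        return "larger than 25MB"                # rule 2
    scanned = sum(1 for p in pages if is_scanned(p))
    if scanned == 0:
        return "no scanned pages"                # rule 3
    if len(pages) - scanned > 1:
        return "more than one non-scanned page"  # rule 4, per its parenthetical
    return None


# A plain 3-page scan passes every check.
print(rejection_reason([Page(2_000_000, 0)] * 3, size_mb=1.2))     # None
# The same scan with embedded OCR text no longer counts as "scanned".
print(rejection_reason([Page(2_000_000, 1800)] * 3, size_mb=1.4))  # no scanned pages
```

The second call shows the case described under rule 3: a PDF that a separate OCR system has already processed carries more than 512 characters of text per page, so none of its pages qualify as scanned.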

  10. Thanks for the great tips. I use the same scanner and love it. I’ve been using Evernote for 18 months now. The only documents I’ve hesitated to keep in Evernote are banking, investment and tax documents. I’m curious as to what others think/do about this. I think I’m ready…but would appreciate advice first.

  11. Jamie, great post! I work for doo, and I wanted to tell you about the additional ways it might help your document organization. Although your solution seems really effective, have you ever thought about trying doo as well? Scanning documents directly into doo is easy, especially with a Fujitsu ScanSnap (see http://docs.doo.net/scanguide.pdf). And when your docs are scanned into doo they are also OCRed.

    The difference is that doo automatically tags every document you connect to it with the thoroughgoing analysis of our own intelligent algorithms. This technology is so smart that it can identify each document’s relevant information – things like companies, document type, file format, people, places or individual labels – and create smart tags based upon it. Because they are tagged so well, you can search and find any document in no time flat by just entering a few keywords.

    But it’s about more than scanning: doo is a holistic document solution. In addition to all that you scan, connect your local sources (HD, second internal drive, external drive, etc) as well as those in the cloud (Dropbox, Google Drive, Skydrive and email). All of your documents stay right where they are: doo neither moves nor alters any of your documents. What it does do is grant you access to all the documents you have strewn across the cloud or your local devices. Then it gives you the power to find anything by just entering a few keywords – all in one place.

    Again, I really like your solution, and it seems like a great way to get and stay paperless. But if you’re looking to stay paperless, plus manage ever increasing numbers of digital documents, doo may be a nice solution for you.

  12. Jamie, my question on this subject is not answered in your article, and that is the resolution of the PDFs. You put the ScanSnap on automatic, but my HP printer/scanner does not have such a feature.
    I scan at 300 dpi, which seems sufficient. What’s your idea on this?
    Thanks.

  13. I use the same ScanSnap model and find it really easy to use on my own personal paperless journey. I added a video showing how to configure it to scan directly into Evernote a year ago if anyone wants the walk through. http://www.youtube.com/watch?v=DZArRBtPHvU

    The info in one of the comments about using the ScanSnap OCR was very interesting. I haven’t tested it as much as jbenson2 has but I’ll have to turn it on if it isn’t already on. Better safe than sorry and I’m usually not scanning large amounts at one time so the extra time cost doesn’t matter too much.

  14. Regarding how to scan bulky or fixed items like books etc;
    I simply take a photo either with a regular digital camera or smart phone – works fine.
    If from a smartphone you can send or attach straight into evernote with comments etc

  15. Jamie,

    Your paperless posts have inspired me. I purchased a Scansnap ix500 and I am about to embark on decluttering my house, starting with my desk. One problem, though – reading your post, it seems you feed your documents one at a time into the scanner. I was hoping to take a large stack of papers (i.e. multiple docs) and scan them all at once. Then (hopefully) quickly reorganize them into separate docs in Evernote. I was under the impression this would be an easy task using the ScanSnap organizer software, but the only way I have found (so far) is to use Acrobat’s kludgy “split apart” function that requires a tedious process of first bookmarking each document, then going through the function’s dialog to pick where and how everything will be saved. If that’s the only way to do it, then single docs is definitely the better option. Thoughts?
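
For the stack-scanning approach described above, the bookkeeping part can at least be sketched in Python: given the page on which each document begins, work out the page range for every document. This is only a sketch under assumptions. The function name `split_ranges` is hypothetical, the start pages still have to be identified by hand, and the actual splitting would need a PDF tool that is not shown here.

```python
from typing import List, Tuple


def split_ranges(total_pages: int, first_pages: List[int]) -> List[Tuple[int, int]]:
    """Return inclusive, 1-based (start, end) page ranges, one per document.

    `first_pages` lists the page on which each document in the stack begins.
    """
    starts = sorted(set(first_pages))
    if not starts or starts[0] != 1:
        raise ValueError("the first document must start on page 1")
    if starts[-1] > total_pages:
        raise ValueError("a start page is past the end of the scan")
    # Each document ends on the page before the next one starts;
    # the last document runs to the end of the scan.
    ends = [s - 1 for s in starts[1:]] + [total_pages]
    return list(zip(starts, ends))


# A 12-page stack holding three documents that begin on pages 1, 4 and 10.
print(split_ranges(12, [1, 4, 10]))  # [(1, 3), (4, 9), (10, 12)]
```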

  16. Jamie,

    I truly appreciate your guidance and excellent documentation. I am embarking upon moving financial documents (brokerage statements) to Evernote. I have about 20 years of year-end statements and transactions. As you are most likely familiar, many of these documents have needed data in duplex; however, some pages are packed with repetitive disclosures.

    How can I scan in 10-15 pages in duplex PDF and remove the pages with the unneeded disclosures? I’ve tried right-clicking on the images, but the preview disappears as soon as the scanning is finished.

    I appreciate your willingness to assist.

    Gina

    1. Gina, over the years I’ve found it useful to invest in a good PDF editor like Adobe Acrobat. I sometimes use such an editor to easily remove unwanted pages from PDF scans.

  17. All extremely interesting stuff and I’m now starting on this course of action using the same scanner you are using. I wonder if you can advise on an issue I’ve encountered early on which can’t have escaped you. It concerns using the s1300i when scanning some A4 documents which have been printed in landscape mode. I find that with autorotate turned off, obviously the PDF files are the wrong way round for reading, and with it turned on the pages are not always auto-rotated correctly. So this gives me a problem: how do you deal with multipage A4 landscape printed material which you want as one PDF file, which does not autorotate correctly? Many thanks!
