Scan resolution (FAQ8)
Resolution directly affects file size. In the U.S., resolution is usually measured in "dots per inch" (d.p.i.). Tests on samples of Cyrillic text scanned at 300 d.p.i. and 600 d.p.i., two of the standard resolution settings available on most scanners, showed only a marginal difference in the accuracy rate of the test OCR software between the two settings.
For a Russian-language text, a test page scanned bi-tonally at 300 d.p.i. yielded an image file averaging 0.48 megabytes, while the same page scanned as 12-bit grayscale at 600 d.p.i. yielded an image file of 14.49 megabytes. Storing the image files in the Tagged Image File Format (TIFF) also allows the use of ITU (formerly CCITT) Group 4 compression, which applies to bi-tonal images, to decrease the file size further.
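The way resolution and bit depth combine to drive file size can be illustrated with a little arithmetic. The following is a minimal sketch, not a description of how the test files above were measured: the 8.5 x 11 inch page size is an assumption (the test page dimensions are not given here), the function name is hypothetical, and the calculation ignores file headers, row padding, and any compression.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw (uncompressed) pixel data size in bytes for a scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to a whole number of bytes


# Assumed page size: 8.5 x 11 inches (hypothetical, for illustration only).
PAGE_W_IN, PAGE_H_IN = 8.5, 11.0

bitonal_300 = uncompressed_size_bytes(PAGE_W_IN, PAGE_H_IN, 300, 1)
gray12_600 = uncompressed_size_bytes(PAGE_W_IN, PAGE_H_IN, 600, 12)

print(f"300 d.p.i. bi-tonal:          {bitonal_300 / 1_000_000:.2f} MB")
print(f"600 d.p.i. 12-bit grayscale:  {gray12_600 / 1_000_000:.2f} MB")
print(f"ratio: {gray12_600 / bitonal_300:.0f}x")
```

Doubling the resolution quadruples the pixel count, and moving from 1 bit to 12 bits per pixel multiplies the data by a further 12, so under these assumptions the raw data grows by a factor of 48. Actual file sizes, such as those reported above, will differ with page dimensions, storage format, and compression.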
If digital text files are the only result needed, 300 d.p.i. is probably adequate and has the advantage of smaller image files. The time needed to scan each original also depends on the resolution, so in large projects that create many images, scanning at a higher resolution can significantly increase labor costs. If the scanning is outsourced, some vendors price digitization according to the resolution required. Financial considerations therefore enter once again, since 300 d.p.i. may also be cheaper than 600 d.p.i. on a per-page or per-image basis.
However, if the digital page images may someday be used to produce new paper replica copies of the source, it may be better to scan at the higher resolution of 600 d.p.i. so that higher-quality images are available for printing directly. Even at the higher resolution, ITU Group 4 compression combined with bi-tonal color depth can still achieve file sizes on the order of 100 kilobytes for an average page image of text.
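To put the 100-kilobyte figure in perspective, one can estimate the compression ratio it implies. This is only a rough sketch under stated assumptions: an 8.5 x 11 inch page (not specified in this FAQ), 1 kilobyte taken as 1,000 bytes, and a hypothetical helper function. Real ratios depend heavily on how much text and white space a page contains.

```python
def implied_compression_ratio(width_in, height_in, dpi, compressed_bytes):
    """Ratio of raw bi-tonal (1 bit per pixel) data to a given compressed size."""
    raw_bytes = round(width_in * dpi) * round(height_in * dpi) / 8
    return raw_bytes / compressed_bytes


# Assumed: 8.5 x 11 inch page at 600 d.p.i., compressed to about 100 kilobytes.
ratio = implied_compression_ratio(8.5, 11.0, 600, 100_000)
print(f"implied compression ratio: about {ratio:.0f}:1")
```

Under these assumptions the raw bi-tonal data is roughly 4.2 megabytes, so a 100-kilobyte file corresponds to a reduction of about 42 to 1.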