Color Depth (FAQ9)
One technical question to settle before digitization is the color depth of the digital images to be made. For photographic sources, the decision is governed largely by the type of source material, such as whether the originals are black and white or color.
If the source is textual, scanning the originals as grayscale images can make yellowed paper visible in the background of the image. This is a disadvantage because it can reduce the accuracy of the Optical Character Recognition (OCR) software used to convert the digital images back into text files. When extremely yellowed paper is scanned as grayscale, the background is sometimes dark enough for the OCR software to interpret it as spurious faint text. Scanning as bi-tonal images, on the other hand, may occasionally lose very faint characters, but it sometimes yields better overall OCR accuracy because it removes the "background noise" of the yellowed paper.
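The trade-off described above can be illustrated with a minimal sketch of bi-tonal conversion by fixed threshold. The function name `to_bitonal`, the threshold of 128, and the sample pixel values are all assumptions made for illustration. Scanner software may use other, more adaptive methods.

```python
def to_bitonal(gray_rows, threshold=128):
    """Map 8-bit grayscale pixels (0 = black, 255 = white) to 1-bit values.

    Pixels darker than `threshold` become 0 (ink); all others become 1 (paper).
    The fixed threshold is an illustrative assumption, not a recommended setting.
    """
    return [[0 if pixel < threshold else 1 for pixel in row] for row in gray_rows]


# Hypothetical 3x6 sample: yellowed paper reads as light gray (about 170-190),
# printed ink as dark gray (about 30-45), and one faint stroke as 140.
sample = [
    [180, 175, 40, 35, 185, 178],
    [172, 140, 45, 30, 190, 181],
    [183, 179, 38, 42, 176, 174],
]

bitonal = to_bitonal(sample, threshold=128)

# Show ink as "#" and paper as "." for a quick visual check.
for row in bitonal:
    print("".join("#" if p == 0 else "." for p in row))
```

In this sample the yellowed background disappears entirely, but the faint stroke with value 140 is also mapped to paper. This is the occasional loss of very faint characters noted above.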
Image file size also depends on the choice of color depth. A grayscale image results in a larger file than a bi-tonal image of the same original scanned at the same resolution, because each pixel is stored with more bits. Resolution is the other factor that affects file size. In the U.S., resolution is usually measured in "dots per inch" (d.p.i.). Tests on samples of Cyrillic text scanned at 300 d.p.i. and 600 d.p.i., two of the standard resolution settings available on most scanners, showed only a marginal difference in the accuracy rate of the test OCR software between the two resolutions.
In testing, a page scanned bi-tonally at 300 d.p.i. yielded an image file averaging 0.48 megabytes, while the same page scanned as 12-bit grayscale at 600 d.p.i. yielded an image file of 14.49 megabytes. Storing bi-tonal images in the Tagged Image File Format (TIFF) also allows the use of ITU (formerly CCITT) Group 4 compression to further decrease the file size.
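The way resolution and color depth combine to determine file size can be sketched with a rough estimate of uncompressed image data. The helper `uncompressed_size_bytes` and the 6 x 9 inch page are hypothetical. The dimensions of the test page are not stated here, so these estimates are not expected to reproduce the measured figures above, and they ignore file headers and compression.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate raw image data size, ignoring file headers and compression."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Hypothetical page dimensions in inches (an assumption for illustration).
PAGE_WIDTH_IN = 6
PAGE_HEIGHT_IN = 9

# (label, d.p.i., bits per pixel) combinations to compare.
settings = [
    ("bi-tonal, 300 d.p.i.", 300, 1),
    ("bi-tonal, 600 d.p.i.", 600, 1),
    ("8-bit grayscale, 300 d.p.i.", 300, 8),
    ("12-bit grayscale, 600 d.p.i.", 600, 12),
]

for label, dpi, depth in settings:
    size = uncompressed_size_bytes(PAGE_WIDTH_IN, PAGE_HEIGHT_IN, dpi, depth)
    print(f"{label}: {size / 1_000_000:.2f} megabytes")
```

Doubling the resolution quadruples the pixel count, and the file size then scales in proportion to the number of bits stored per pixel. This is why a high-resolution grayscale scan is so much larger than a 300 d.p.i. bi-tonal scan of the same page.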