Digital Library of Appalachia Project
SCANNING GUIDELINES
Scanning is used to create a digital
surrogate of most two-dimensional materials, including printed
documents, manuscripts, drawings, and photographs.
In selecting the appropriate scanning procedure, the
archivist must decide whether it is important
to maintain the look and feel of the original, and whether use
of the material may be improved through various
modifications. In some cases, the library
may decide to make multiple surrogates of a single item to
serve various purposes. Included here are
procedures for (A) creating text files,
(B) creating files that preserve the original format of
a document, (C) creating files that enhance or manipulate the
original format, and (D) creating image files.
These guidelines are meant to suggest a
process a library could follow in scanning resources for
inclusion in the Digital Library of Appalachia.
Individual users will have to take time to acquaint
themselves with the features of project software so that they
may respond to particular problems.
However, the following procedures should be adequate
for the majority of scanning work undertaken by ACA libraries,
and may be adapted for local use.
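The choice among the four procedures can be summarized as a small decision function. The following Python sketch is only an illustration: the function name `choose_procedure` and its three yes/no parameters are assumptions introduced here, not part of the project software, and borderline items still call for the archivist's judgment.

```python
def choose_procedure(chiefly_image: bool,
                     format_significant: bool,
                     needs_manipulation: bool) -> str:
    """Return the letter (A-D) of the procedure these guidelines suggest.

    All three parameters are hypothetical yes/no judgments by the archivist.
    """
    if chiefly_image:
        # Photographs, illustrations, maps: image files (.tif and .jpg).
        return "D"
    if needs_manipulation:
        # Text whose page images must be sharpened, cleaned, or adjusted.
        return "C"
    if format_significant:
        # Manuscripts and documents that mix pictures with text.
        return "B"
    # Plain text where formatting carries no editorial meaning: OCR.
    return "A"


print(choose_procedure(False, False, False))  # a plain typescript
print(choose_procedure(False, True, False))   # a holograph manuscript
```

The order of the checks reflects one reading of the guidelines: items that are chiefly images go to procedure D regardless of the other answers, and a need for touch-up takes precedence over simple format preservation.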
A. For text documents
where original format is not significant.
This procedure creates a text file of
documents through optical character recognition
(OCR). The pilot project provided
TextBridge Pro OCR software. Text files are
smaller than image files, and the full text becomes keyword
searchable. Use this scanning procedure for
documents without pictures, and for documents where the
formatting of the text does not add significant editorial
meaning.
1. Make sure computer
and scanner are properly connected and
operational. Place first page of document
in the scanner. Open TextBridge
software. TextBridge provides a
scanning wizard program under the Process menu.
The wizard is recommended for most applications.
2. Use the default
settings for layout (any page) and type (any type) as offered
by the wizard. Do not retain pictures (for
example advertisements on a magazine page that is otherwise
just text) or formatting. Do not ask the
program to proofread (it is easier to do so in a word
processor). When the wizard is finished,
TextBridge will initiate a pre-scan of the document in the
scanner.
3. Check settings for
the scanner. Image type should be set to
Text (Background Removal) and Destination should be set to
OCR. Resolution should be set to 300
dpi.
4. Once settings are
established, select Preview from the scanner
program. The scanner will once again
pre-scan the document. After this
preliminary scan, click and drag with the mouse to select the
page outline. Select Scan from the scanner
program.
5. After scanning of one
page is complete, the program provides opportunity to scan
additional pages. Click on the Next Page
button and repeat step 4 until the document is entirely
scanned. After the final page, select No
More Pages.
6. TextBridge saves the
scanned document as a Rich Text Format (.rtf) file by
default. Give the document an appropriate
name, and save to a temporary location.
Close TextBridge.
7. Open the saved .rtf
document in a word processor such as Microsoft
Word. While the OCR program will have
converted most of the text into a usable document, some
proofreading and clean-up work is typically
required. This is especially true when the
program encounters special formatting such as drop caps at the
beginning of paragraphs. Most word
processing programs will help identify misspelled words and
other problems that occur in the OCR process.
When cleanup is complete, save the revised file as
text.
8. Optional. The text file
can be converted to a .pdf file if desired, and if Adobe
Acrobat is installed and operational. This
allows for a universal delivery platform, and is the preferred
format in the Digital Library of Appalachia.
To create a .pdf file from text, select Print from the
File menu of the word processor and choose Acrobat Distiller
as the printer.
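Some of the clean-up described in step 7 is mechanical and could be done before proofreading begins. The following Python sketch assumes the OCR output has been saved as plain text; the function name `tidy_ocr_text` and the sample string are hypothetical, and the script is not a feature of TextBridge or of any word processor.

```python
import re


def tidy_ocr_text(text: str) -> str:
    """Apply a few mechanical clean-ups to OCR output before proofreading."""
    # Rejoin words hyphenated across a line break: "docu-\nment" -> "document".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Protect paragraph breaks (blank lines) with a placeholder character.
    text = re.sub(r"\n\s*\n", "\x00", text)
    # Remaining single line breaks are only line ends: turn them into spaces.
    text = text.replace("\n", " ")
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Restore paragraph breaks and trim each paragraph.
    paragraphs = [p.strip() for p in text.split("\x00")]
    return "\n\n".join(p for p in paragraphs if p)


sample = "The  pilot pro-\nject provided\nOCR software.\n\nText files are\nsmaller."
print(tidy_ocr_text(sample))
```

Rejoining hyphens this way will also close up genuine compounds that happen to break at a line end (for example "well-" followed by "known"), so the result still needs a human proofreader.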
B. For
multi-page text documents where the original format is
important, and where images are mixed with
text.
This procedure
creates portable document format (.pdf) files of documents
with Adobe Acrobat software provided in the pilot
project. These files can become quite
large, and keyword searching may be cumbersome.
Use this scanning procedure for holograph manuscripts,
documents with pictures, and for documents where the
formatting of the text does add significant editorial
meaning. Single-page documents may also be
scanned as image files, as described in guideline D
below.
1. Make sure computer
and scanner are properly connected and
operational. Place first page of document
in the scanner. Open Adobe Acrobat
software. Check the settings by
selecting Tools from the top menu bar, then
Distiller. The job options should be set to
ebook. Click on the Settings tab, and under
Job Options verify that the box for Optimize for fast web view
is checked.
2. Start the scanning
process from the File menu by selecting Import, then
Scan.
3. Select the device
(your scanner), the format (single sided works for most
documents), and the destination (new .pdf document) from the
Acrobat scan window. Click on Scan.
4. Acrobat will
initiate a pre-scan of the document. Adjust
the settings for the scanner. The image
type will most often be Color Document or Black & White
Document. You may wish to scan a color
document in black & white to reduce file
sizes. Change resolution to 400 dpi for
pages with photo images. Pages with text
and line drawings may be scanned at 350 dpi.
Avoid skew by placing the originals squarely on the
scanner. Rescan a skewed image rather than rotating it
after scanning.
5. Once scanner settings
are established, click and drag with the mouse over the
preview image to select the page outline.
Select Scan from the scanner program.
6. After scanning of one
page is complete, the program provides opportunity to scan
additional pages. Click on the Next button
and repeat step 5 until the document is entirely
scanned. After the final page, select
Done.
7. The document will
display in the Acrobat window on the screen.
Page through the document to verify that all pages are
present and properly scanned. Pages may be
deleted, inserted, or otherwise modified for correction using
the options on the Document menu from the top menu
bar.
8. Compress the file
with the Distiller. From the File menu,
select Print, and select Acrobat Distiller as the
printer. Assign an appropriate name and
location to the file when prompted.
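To see why these files can become large, and why scanning in black & white reduces file size, it helps to estimate the uncompressed size of one page at the resolutions given in step 4. The Python sketch below is a rough estimate under stated assumptions: a US letter page (8.5 x 11 inches), 24 bits per pixel for color, and 1 bit per pixel for black & white. The actual bit depth depends on the scanner driver, and the finished .pdf will be smaller after compression.

```python
import math


def uncompressed_bytes(width_in: float, height_in: float,
                       dpi: int, bits_per_pixel: int) -> int:
    """Estimate the uncompressed size of one scanned page in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return math.ceil(width_px * height_px * bits_per_pixel / 8)


# Assumed page size: US letter, 8.5 x 11 inches.
color_photo_page = uncompressed_bytes(8.5, 11, 400, 24)  # 24-bit color, 400 dpi
text_page = uncompressed_bytes(8.5, 11, 350, 1)          # 1-bit b&w, 350 dpi
print(f"{color_photo_page / 1_000_000:.1f} MB, {text_page / 1_000_000:.1f} MB")
# prints "44.9 MB, 1.4 MB"
```

Under these assumptions a single color page holds roughly thirty times as much raw data as a black & white text page, which is why a long color document grows so quickly.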
C. For text documents
where some manipulation of the image is
important.
This procedure creates portable document
format (.pdf) files of documents with Adobe Photoshop and
Acrobat software provided in the pilot project.
These files can become quite large, and keyword
searching may be cumbersome. Use this scanning
procedure for documents that require some
manipulation. For example, newspaper text
may need to have images sharpened or contrast adjusted to
improve legibility. Some pages may have
foxing or staining, and require additional
touch-up.
1. Make sure computer
and scanner are properly connected and
operational. Place first page of document
in the scanner. Open Adobe Photoshop
software. Note:
At least one library preferred Adobe ImageReady to
Photoshop because of the accuracy of the cross-hairs in its
selection tool.
2. Start the scanning
process from the File menu by selecting Import, then Epson
Twain (or name of scanner in use).
3. Photoshop will
initiate a pre-scan of the document. Adjust
the settings for the scanner. The image
type will most often be Color Document or Black & White
Document. Change resolution to 400 dpi for
pages with photo images. Pages with text
and line drawings may be scanned at 350 dpi.
Avoid skew by placing the originals squarely on the
scanner. Rescan a skewed image rather than rotating it
after scanning.
4. Once scanner settings
are established, click and drag with the
mouse over the preview image to select the page
outline. Click on Scan from the scanner
program.
5. When page has been
scanned, close scanner software. Use
Photoshop tools to manipulate image as needed.
6. To scan additional
pages, repeat steps 2-5. All pages will
remain visible in Photoshop as separate documents.
7. Begin creating
Acrobat file from Photoshop by selecting image window with the
first page of the document. From the File
menu, print this page to Acrobat Distiller.
Assign an appropriate name and location to save the
file as prompted.
8. Close the first page
image in Photoshop. Do not save changes
here; they are saved in Acrobat. Select the
second page image in Photoshop and print this page to Acrobat
Distiller as above. Assign it a temporary
name and save to the desktop. Repeat for
any additional pages.
9. Assemble pages into a
single file in Acrobat. Open the .pdf file
of the first page of the document as saved in step
7. From the Document menu on the top menu
bar, select Insert Pages. Select the file
for the second page temporarily saved to the desktop in step
8. Repeat for any additional pages.
10. Once all pages have
been added, save .pdf file with appropriate name and
location.
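Steps 7 through 9 produce one temporary .pdf file per page, and the pages must be inserted in the right order. One way to keep track of them is a consistent, zero-padded naming pattern. The Python sketch below only generates such names; the pattern, the function name `page_file_names`, and the base name `letter_1923` are hypothetical conventions suggested here, not requirements of Photoshop or Acrobat.

```python
def page_file_names(base_name: str, page_count: int) -> list[str]:
    """Return one .pdf file name per page, zero-padded to sort in page order."""
    # Pad to at least two digits so page 2 sorts before page 10.
    width = max(2, len(str(page_count)))
    return [f"{base_name}_p{page:0{width}d}.pdf"
            for page in range(1, page_count + 1)]


names = page_file_names("letter_1923", 12)
print(names[0], names[-1])
# prints "letter_1923_p01.pdf letter_1923_p12.pdf"
```

Because the page numbers are padded to the same width, an alphabetical file listing on the desktop shows the pages in the order they should be inserted.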
D. For photographs,
illustrations, maps, and other items that are chiefly
images.
This procedure creates image files (.jpg
and .tif) of items with Adobe Photoshop (or alternatively
ImageReady) software provided in the pilot
project. These files can become quite
large, and are usually prepared for web delivery in a reduced
.jpg format. Use this scanning procedure
for pictures that require some manipulation.
For example, photographs may need adjustments in
contrast, brightness, and color levels.
1. Make sure computer and scanner are
properly connected and operational. Place
document in the scanner. Open Adobe
Photoshop software. Note:
At least one library preferred Adobe ImageReady to
Photoshop because of the accuracy of the cross-hairs in its
selection tool.
2. Start the scanning
process from the File menu by selecting Import, then Epson
Twain (or name of scanner in use). [For
pictures taken with a digital camera, use the Open command on the
File menu and proceed to step 5.]
3. Photoshop will
initiate a pre-scan of the document. Adjust
the settings for the scanner. The image
type will most often be Color Photograph or Black & White
Photograph. Change resolution to 600 dpi.
Line drawings may be scanned at 400 dpi. We
start with a fairly high resolution scan, because it is always
possible to resample an image downward to decrease file size,
but resolution cannot be improved without
rescanning. Avoid skew by placing the
originals squarely on the scanner. Rescan a skewed image
rather than rotating it after scanning.
4. Once
scanner settings are established, click and
drag with the mouse to select the page outline.
Click on Scan from the scanner program.
5. When page has been
scanned, close scanner software. Use
Photoshop tools to manipulate image as needed.
Most frequently used options are available from the
Image menu on the top menu bar.
6. Save the file in
.tif format, with an appropriate name and location, for
archival purposes.
7. Reduce the file for
web delivery. From the Image menu, select
Image Size and click on the Auto option.
Choose Best quality for the image.
8. The image will appear
re-sized in Photoshop. From the File menu,
select Save for Web. This will bring up an
alternative view of the image in Photoshop as it will appear
in .jpg format. Settings for the image
should be set at JPEG High. Again, assign
an appropriate name and location. This is
the file to be used in the DLA
database.
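The reasoning in step 3, that an image can always be resampled downward but never meaningfully upward, can be illustrated with a little arithmetic. The Python sketch below computes the pixel dimensions of a 600 dpi scan and a reduced size for web delivery. The function names and the 1000-pixel limit are assumptions chosen for illustration; Photoshop's Auto option selects its own target size.

```python
def scan_pixels(width_in: float, height_in: float, dpi: int) -> tuple[int, int]:
    """Pixel dimensions produced by scanning an original at a given resolution."""
    return round(width_in * dpi), round(height_in * dpi)


def web_size(width_px: int, height_px: int, max_side: int) -> tuple[int, int]:
    """Shrink dimensions to fit max_side, keeping proportions; never enlarge."""
    longest = max(width_px, height_px)
    if longest <= max_side:
        # Enlarging would add pixels but no real detail, so leave it alone.
        return width_px, height_px
    return (max(1, round(width_px * max_side / longest)),
            max(1, round(height_px * max_side / longest)))


archival = scan_pixels(8, 10, 600)         # an 8 x 10 inch photograph
web = web_size(*archival, max_side=1000)   # hypothetical web limit
print(archival, web)
# prints "(4800, 6000) (800, 1000)"
```

The archival .tif keeps all 4800 x 6000 pixels, so a smaller or differently sized web copy can be regenerated at any time without rescanning the original.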