Scanner with automatic document feed (ADF) - Fujitsu ScanSnap FI-5110EOX2: Perfect scripting with scanadf/convert - generating least-size PDF/JPEG output - managing large documents

From openSUSE

Topic

Frontends like xsane or kooka are great for occasional scanning, i.e. if

  • the size of the output does not matter
    • with their default settings, both programs produce output about ten to thirty times bigger than necessary
  • You don't depend on particular output formats
  • You only have a small number of documents with few pages each
  • You feel better with a GUI ;-)

If any of the above does not apply, this article gives a worked example and a ready-to-run shell script that uses the command line to

  • fully control a document scanner such as a Fujitsu ScanSnap
  • produce anything between smallest-size and best-quality scans in a single document file in either PDF or JPEG format
  • manage large documents, that is, keep the pages in order

Prerequisites

Install from the SuSE DVD:

  • ImageMagick
  • sane-backends
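  • sane-frontends (not tested without it; in the upstream SANE distribution this package provides scanadf, so it is probably required rather than optional)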

Hint: If You don't have an ADF (automatic document feed) scanner, You can use scanimage instead of scanadf, which is the command used in the text below.

Provided You have

  • installed the above
  • connected Your scanner
  • determined its sane-backend-parameters with
    • scan[image|adf] -L
    • scan[image|adf] -h -d <technical name as given by the former command>
      • example: scanadf --help --device fujitsu:libusb:002:018
  • adapted the following script accordingly

You can now

  • very conveniently scan with all possible parameters via a single command-line-statement while
  • keeping the order of pages in large outputs
  • obtaining exactly the amount of size and quality You want and
  • freely controlling the output-format

with the following script:

Example calls (provided the script is named myscan):

  • myscan -d -n my_small_doubleside_grey_pdf
  • myscan -c 60 -C -n my_big_singleside_color_jpeg -j

The other parameters normally either need no adaptation or should be given appropriate default values inside Your script after the first test scans.
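Before adapting the defaults in the script, the device name has to be looked up as described in the prerequisites. The following is only a minimal sketch of that step. It assumes that the -L listing reports each device on a line of the form device `NAME' is a ..., and the helper name device_names is made up for this illustration. Note that the libusb part of a name such as fujitsu:libusb:002:018 usually changes when the scanner is reconnected, so the lookup may need to be repeated.

```bash
#!/bin/bash
# Hypothetical helper (not part of the original script).
# Reads "scanadf -L" / "scanimage -L" style output on stdin and prints
# only the technical device names, one per line.
# Assumption: devices are listed as   device `NAME' is a <description>
device_names() {
  sed -n "s/^device \`\([^']*\)' is a .*/\1/p"
}

# Possible use (commented out so that this block runs without a scanner):
#   dev=$(scanadf -L | device_names | head -n 1)
#   scanadf --help --device "$dev"
```

The option names and value ranges printed by the --help call are the ones to compare against the defaults (mode, source, resolution, page size) at the top of the script.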

#!/bin/bash

usage="Usage: $0 [-c <compress>] [-d(uplex)] [-C(olor)] [-x <resolution>] [-y <resolution>] [-n <name>] [-j(peg)]"  

#-------------------------------------------------------------------------------
# For the following parameters compare the sane-backend output of Your device as given by the command
# - scan[image|adf] -L 
# - scan[image|adf] -h -d <technical name as given by the former command>
# and adapt possible names and values accordingly 
#-------------------------------------------------------------------------------

xres=100
yres=100
name=noname
mode="Gray"
pagewidth="220"
pageheight="295"
ulx=5
uly=0
lrx=210
lry=295
compress=40
format="pdf"  # Default is "pdf"
source="ADF Front"

while getopts "Cc:dx:y:n:j" opt; do
  case $opt in
    C  ) mode="Color" ;;
    d  ) source="ADF Duplex" ;;
    c  ) compress=$OPTARG ;;
    x  ) xres=$OPTARG ;;
    y  ) yres=$OPTARG ;;
    n  ) name=$OPTARG ;;
    j  ) format="jpeg" ;;
    \? ) echo "$usage" >&2
         exit 1 ;;
  esac   
done

echo "Source $source"
echo "Mode $mode"
echo "Compression $compress"
echo "Horizontal resolution $xres"
echo "Vertical resolution $yres"
echo "Name $name"
echo "Format $format"
#-------------------------------------------------------------------------------
# Scan
#-------------------------------------------------------------------------------
scanadf -l "$ulx" -t "$uly" -x "$lrx" -y "$lry" \
        --source "$source" --mode "$mode" \
        --resolution "$xres" --y-resolution "$yres" \
        --pagewidth "$pagewidth" --pageheight "$pageheight" \
        -o scan%d
#-------------------------------------------------------------------------------
# Concatenation while keeping the order of pages!
#   a lexical sort would mess up the sequence (scan10 would come before scan2)
# 
# Hint: An alternative command for concatenating the default-output would be
# pnmcat -tb scan[0-9] > scanAllInOne
#-------------------------------------------------------------------------------
numScans=$(ls scan[0-9]* 2>/dev/null | wc -w)
if [ $numScans -gt 0 ]; then
  echo "Concatenating $numScans pages"
  if [ $numScans -lt 10 ]; then
    convert scan[0-9] -append scanAllInOne
  elif [ $numScans -lt 20 ]; then
    convert scan[0-9] scan1[0-9] -append scanAllInOne
  elif [ $numScans -lt 30 ]; then
    convert scan[0-9] scan1[0-9] scan2[0-9] -append scanAllInOne
  elif [ $numScans -lt 40 ]; then
    convert scan[0-9] scan1[0-9] scan2[0-9] scan3[0-9] -append scanAllInOne
  elif [ $numScans -lt 50 ]; then
    convert scan[0-9] scan1[0-9] scan2[0-9] scan3[0-9] scan4[0-9] -append scanAllInOne
  elif [ $numScans -lt 60 ]; then
    convert scan[0-9] scan1[0-9] scan2[0-9] scan3[0-9] scan4[0-9] scan5[0-9] -append scanAllInOne
  elif [ $numScans -lt 70 ]; then
    convert scan[0-9] scan1[0-9] scan2[0-9] scan3[0-9] scan4[0-9] scan5[0-9] scan6[0-9] -append scanAllInOne
  elif [ $numScans -lt 80 ]; then
    convert scan[0-9] scan1[0-9] scan2[0-9] scan3[0-9] scan4[0-9] scan5[0-9] scan6[0-9] scan7[0-9] -append scanAllInOne
  elif [ $numScans -lt 90 ]; then
    convert scan[0-9] scan1[0-9] scan2[0-9] scan3[0-9] scan4[0-9] scan5[0-9] scan6[0-9] scan7[0-9] scan8[0-9] -append scanAllInOne
  elif [ $numScans -lt 100 ]; then
    convert scan[0-9] scan1[0-9] scan2[0-9] scan3[0-9] scan4[0-9] scan5[0-9] scan6[0-9] scan7[0-9] scan8[0-9] scan9[0-9] -append scanAllInOne
  else
    echo "100 or more pages are not supported -> exit"
    exit 1
  fi
  #-------------------------------------------------------------------------------
  # compression, changing output-format, clean-up and verification of result
  #-------------------------------------------------------------------------------
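  if [ "$format" = "pdf" ]; then
    if convert scanAllInOne -compress jpeg -quality "${compress}%" "${name}.pdf"
    then
      # remove the intermediate files only, not everything starting with "scan"
      rm scan[0-9]* scanAllInOne
      xpdf "${name}.pdf"
    fi
  else
    if convert scanAllInOne -quality "${compress}%" "${name}.jpg"
    then
      rm scan[0-9]* scanAllInOne
      gwenview "${name}.jpg"
    fi
  fi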
else
  echo "found no scans"
fi