From: Alfie Costa (agcosta@gis.net)
Date: Fri Mar 10 2000 - 19:15:12 CET
Here's another 'wc' script.  It's based on the old mu-awk one. 
Advantages over the old 'wc': 1) It understands the various 'wc' command line 
switches, in most any order.  The output looks like GNU wc, with totals and 
hyphens.  There are four columns, from left to right: lines, words, chars, 
filename.  Lines, words and chars are output in the same columns every time, 
which is hopefully more intuitively obvious than letting them slide leftward.  
For programs calling 'wc' for output, it shouldn't matter as the columns are 
only whitespace. 
2) Takes standard IO or file lists, or both.  Standard IO can also be called 
with a hyphen. 
3) Better character counts.  Attached to this message is 'nullfile.gz', which 
is 170 bytes compressed, 128K of nulls uncompressed.  The old mu-awk 'wc' 
doesn't understand that kind of file.  
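For anyone without the attachment, a similar test file can be recreated locally. This is only a sketch: the temp path is an arbitrary choice, and it assumes a 'dd', a /dev/zero, and an 'ls -o' whose fourth column is the size (as the script itself assumes).

```shell
#!/bin/sh
# Recreate a 128K file of nulls, like the uncompressed attachment.
# The path below is a made-up name for illustration only.
f=/tmp/nullfile.$$
dd if=/dev/zero of="$f" bs=1024 count=128 2>/dev/null

# Same size lookup the script uses for named files.
size=`ls -o "$f" | awk '{ print $4 }'`
echo "$size"

rm -f "$f"
```

128 blocks of 1024 bytes should come out as 131072 bytes.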
Disadvantages: 1) Bigger, can't be helped, more features.  Chopping out the 
comments or abbreviating the variable names could reduce it some. 
2) Uglier, it's another ash script, and has too many if-thens.  There are weird 
kludges.  Any hints?  A getopts command would probably help.  (getopts might 
make an interesting script.) 
3) If you only want to count words and lines, it's as fast as the old one: it's 
the same code.  If you want to count chars from standard IO, it's slower 
because it writes a temp file, and does an 'ls -o', and gets the temp file size 
which is the same thing as counting the chars.  Is there a better way?  (In 
ash, that is.)  For named files, it's not too bad, as it doesn't need to make 
any temp files.  It still runs 'ls' once per file, which might be improved. 
4) Because the line count and word count both come from awk:  On a big binary 
file, awk may say it has only 1 word and 1 line, which seems unlikely.  Haven't 
tested this. 
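On the question in 3) about counting chars from standard IO without a temp file, one possible answer is to let 'dd' do the counting. This is a sketch, not a tested replacement: 'count_stdin_bytes' is a made-up name, and it assumes 'dd' reports "N+0 records in" on stderr. With bs=1 every byte is one full record, so N is the byte count, nulls included.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original script.
count_stdin_bytes()
{
    # bs=1: one byte per record, so the "records in" figure is the byte count.
    # LC_ALL=C keeps dd's report in the form awk is told to look for.
    LC_ALL=C dd bs=1 of=/dev/null 2>&1 | awk -F+ '/records in/ { print $1 }'
}

printf 'abc' | count_stdin_bytes
```

Two caveats: reading one byte at a time is slow on big inputs, and standard IO can only be read once, so this helps only when -c is the sole switch. If words or lines are wanted too, the temp file is still needed.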
Differences: 
1) GNU wc gives an error message if you ask it to count a directory.  When my 
script sees a directory, it simply ignores it. 
2) The command line options can be in any order.  Example: 
'ls -l | wc foo.txt t* -l - "/mnt/c/win/name with spaces.exe" ' 
This counts the lines (-l) of foo.txt, of everything matching t*, of standard 
IO (here the output of 'ls -l'), and finally of a vfat (or Win95) filename 
with spaces.  Then it gives a total. 
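For shells that do have a built-in 'getopts', the any-order behaviour can be sketched by restarting 'getopts' after every non-option argument. This assumes a POSIX-style 'getopts', which the ash this script was written for may lack. 'parse_wc_args' is a made-up name, and the function only prints what it found instead of counting anything.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original script.
parse_wc_args()
{
    checkC= checkW= checkL=
    while [ $# -gt 0 ]
    do
        OPTIND=1                    # restart getopts on what is left
        while getopts clwa opt
        do
            case "$opt" in
                c) checkC=0 ;;
                l) checkL=0 ;;
                w) checkW=0 ;;
                a) checkC=0 checkL=0 checkW=0 ;;
                *) return 2 ;;      # getopts has already complained
            esac
        done
        shift `expr $OPTIND - 1`    # drop the switches just parsed
        if [ $# -gt 0 ]             # next argument is a filename or "-"
        then
            printf 'file: %s\n' "$1"
            shift
        fi
    done
    printf 'flags: c=%s w=%s l=%s\n' "$checkC" "$checkW" "$checkL"
}

parse_wc_args foo.txt -l - "name with spaces.exe" -w
```

Grouped switches such as -lw are handled by 'getopts' itself, so the separate cases for -cw, -wc and so on would not be needed.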
The Code: 
Various kludges, tricks, or whatever may be worth mentioning... 
There are some functions which are supposed to make things more compact and 
easier to read.  Not sure if they really do. 
The command line parsing is complicated.  There are two 'for' loops.  The first 
one looks for switches with hyphens and sets variables.  It checks each and 
every argument on the command line.  This loop also checks for one error (more 
than one stdIO hyphen), as well as for the help switch.  After the first loop, 
the script checks whether any of the flags are set; if none are, it sets all 
three.  The second loop uses the 'case' statement to look for anything that's 
more than one character and begins with a hyphen.  Those switches all get 
dumped; everything else is assumed to be a filename and is kept.  There are 
quotes within quotes to preserve vfat filenames, which can have spaces in them. 
Here's the quotes voodoo: 
Z*) opts="$opts"" "\""$b"\" ;; 
This is what decodes it: 
eval set dummyoption $opts 
shift 
I couldn't get it to work without the 'eval', but there may be a way.  The 
'dummyoption' and 'shift' are a kludge.  The first option may happen to be a 
hyphen ('-'), which has a special meaning to 'set', but only when the hyphen 
is the first thing 'set' sees on the command line.  The 'dummyoption' nullifies 
that special meaning. 
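One way that might avoid the 'eval' is to rotate the positional parameters: take each argument off the front and put it back on the end only if it isn't a switch. This is a sketch under the assumption that 'set' treats everything after its first operand as plain arguments. 'KeepFiles' is a made-up name, and it only prints the survivors.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original script.
KeepFiles()
{
    n=$#                            # go round the argument list exactly once
    while [ "$n" -gt 0 ]
    do
        b=$1
        shift
        case "Z$b" in
            Z-?*) ;;                # a switch: drop it
            Z*) set dummyoption "$@" "$b"   # a filename or "-": append it
                shift ;;            # remove dummyoption again
        esac
        n=`expr "$n" - 1`
    done
    for f in "$@"
    do
        echo "[$f]"
    done
}

KeepFiles foo.txt -l - "name with spaces.exe" -w
```

Because "$@" keeps each argument as one word, the spaces survive without any extra quoting, and filenames containing quotes or dollar signs are never re-parsed the way they are under 'eval'.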
Here's another odd bit: 
if [ -z "$checkC$checkW$checkL" ] 
This checks if all three variables are nulls.  It's shorter than having 3 sets 
of '-z's and '||'s. 
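A small stand-alone demonstration of the concatenation test, using the same variable names as the script. The messages are made up for illustration.

```shell
#!/bin/sh
checkC= checkW= checkL=

# All three are null, so the concatenation is the empty string.
if [ -z "$checkC$checkW$checkL" ]
then
    echo "no switches given"
fi

checkW=0

# One non-null variable is enough to make the test fail.
if [ -z "$checkC$checkW$checkL" ]
then
    echo "no switches given"
else
    echo "at least one switch given"
fi
```

The trick only asks "is anything set at all"; it cannot tell which one was set.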
This line is interesting: 
filename=${NoFiles:-"-"} 
It's for when there's standard IO input.  If there's a hyphen on the command 
line, a "-" should show up in the filename column.  If there's no hyphen, then 
the filename column should be blank.  $NoFiles is set earlier to a space if 
there's no hyphen (and no files), otherwise it's a null.  The above line is the 
same thing as: 
if [ -z "$NoFiles" ]	# if NoFiles is a null 
then filename="-" 
else filename=" " 
fi 
...only it's one line long, but harder to read of course. 
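Both cases can be tried on their own. This is a minimal sketch. The brackets in the output are only there to make the single space visible.

```shell
#!/bin/sh
# Case 1: no files and no hyphen, so NoFiles was set to a space.
NoFiles=" "
filename=${NoFiles:-"-"}
echo "[$filename]"
blank=$filename

# Case 2: a hyphen was given, so NoFiles is a null and the default kicks in.
NoFiles=
filename=${NoFiles:-"-"}
echo "[$filename]"
```

The first echo shows a space between the brackets, the second shows a hyphen.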
Bugs: 
Impossible!  Maybe! 
Attachments: 
  C:\LINUX\ROOT\Crudewc 
  C:\LINUX\ROOT\Nullfile.gz 
#!/bin/ash
# rustique wc (3/8/00 by A. Costa)
# writes a temp file to count stdIO chars, uses awk and ls...
# (NB: Currently formatted to 4 spaces per tab.)
# Functions
Help()
{
echo "Usage: wc [-clw | -a] [filename]"
exit
}
CleanUp()	# get rid of temp files if necessary
{ 
[ -w "$stdIOfile" ]  &&  rm "$stdIOfile" 2>/dev/null 
return 0	# a missing temp file shouldn't become the script's exit status
}
Bail()
{
CleanUp
exit 2
}
ShowLine()	# syntax: ShowLine lines# words# chars# filename
{ 
echo $1:$2:$3:"$4" | awk -F: '{printf "%7s%10s%12s   %s\n", $1, $2, $3, $4}'
}
CheckHyphen()
{
if [ $hyphen ]	# Chastise user?...
then
    echo "error: only one stdIO hyphen allowed." >& 2
    Bail
fi
hyphen=0
}
# Start the counters at zero so 'expr' never sees an empty operand.
n=0 cSum=0 wSum=0 linesSum=0
#Parse options...
for b in "$@"	# Pass 1, get options, wherever they are...
do
    case "Z$b" in
        Z-) CheckHyphen;;
        Z-d) set -x;;			# debug mode
        Z-a) checkC=0 checkL=0 checkW=0;;
        Z-c) checkC=0;;
        Z-w) checkW=0;;
        Z-l) checkL=0;;
        Z-cw|Z-wc) checkC=0 checkW=0;;
        Z-cl|Z-lc) checkC=0 checkL=0;;
        Z-lw|Z-wl) checkL=0 checkW=0;;
        Z-h|Z-?*) Help ;;
        Z*) ;;
    esac
done
if [ -z "$checkC$checkW$checkL" ]	# no options?
then
    checkC=0 checkL=0 checkW=0	# the default
fi 
for b in "$@"	# Pass 2, remove all options from command line...
do
    case "Z$b" in
        Z-?*) ;;
        Z*) opts="$opts"" "\""$b"\" ;;	# for vfat filenames with spaces
    esac
done
                                # 'eval' is needed to parse $opts
eval set dummyoption $opts 	# new commandline has no switches...
shift				# remove dummyoption
if [ "$1" = "" ]	 	# no filenames?  Use standard I/O
then
    NoFiles=" "	 		# the filename of no file
    set dummyoption - ; shift
fi
for f in "$@"
do
    if [ "$f" = "-" ] 		# stdIO?
    then
        trap 'CleanUp' 1 2 3 15
        stdIOfile=/tmp/$$RusticWc.tmp
        if  cat > $stdIOfile
        then			# all's well
            f=$stdIOfile
        else
            Bail
        fi
        filename=${NoFiles:-"-"}	# display a hyphen or not?
    else
        if [ ! -r "$f" ]		# is the file readable?
        then
            echo "error: can't read \"$f\" " >& 2
            Bail		# also removes any stdIO temp file made earlier
        else			# skip any directories...
            [ -d "$f" ] && continue	
        fi
        filename="$f"
    fi
    # get how many chars it is..
    if [ "$checkC" = "0" ]
    then
        c=`ls -o "$f" | awk '{ print $4 }'`
        cSum=`expr "$cSum" + $c`
    fi
    # check words and lines.
    # this first "if-then" is a wrapper, so awk is only called once per file.
    if [ -n "$checkW$checkL" ]	# -w or -l (or both) requested?
    then   
        TmpWL=`cat "$f" | awk 'BEGIN { w=0 } { w+=NF } END  { print NR, w }'`
        if [ "$checkW" = "0" ] 
        then
            w=`echo $TmpWL | awk '{ print $2 }'`
            wSum=`expr "$wSum" + $w`
        fi
        if [ "$checkL" = "0" ] 
        then
            lines=`echo $TmpWL | awk '{ print $1 }'`
            linesSum=`expr "$linesSum" + $lines`
        fi
    fi 
 
    ShowLine "$lines" "$w" "$c" "$filename"
    n=`expr "$n" + 1`
done
[ "$n" -gt "1" ]  &&  ShowLine "$linesSum" "$wSum" "$cSum" total
CleanUp
   ---- File information -----------
     File:  Nullfile.gz
     Date:  5 Mar 2000, 12:41
     Size:  170 bytes.
     Type:  Unknown