The expression $(($OPTIND - 1)) in the last example gives a clue as to how the shell can do integer arithmetic. As you might guess, the shell interprets words surrounded by $(( and )) as arithmetic expressions. Variables in arithmetic expressions do not need to be preceded by dollar signs. It is OK to supply the dollar sign, except when assigning a value to a variable.
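The dollar-sign rule can be sketched as follows. The variable names are illustrative only, and printf is used in place of print merely so that the lines also run in other shells that share the $((...)) syntax:

```shell
count=5

# Inside $(( )), a variable may be named with or without the dollar sign.
without_dollar=$(( count + 1 ))
with_dollar=$(( $count + 1 ))

# When assigning inside the expression, the name must be bare:
# $(( $count = 10 )) would expand to $(( 5 = 10 )), which is an error.
assigned=$(( count = 10 ))

printf '%s %s %s %s\n' "$without_dollar" "$with_dollar" "$assigned" "$count"
```

Both readings of count yield 6, and the assignment form both stores 10 in count and produces 10 as the value of the expression.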
Arithmetic expressions are evaluated inside double quotes, like variables and command substitutions. We're finally in a position to state the definitive rule about quoting strings: When in doubt, enclose a string in single quotes, unless it contains any expression involving a dollar sign, in which case you should use double quotes.
For example, the date(1) command on modern versions of Unix accepts arguments that tell it how to format its output. The argument +%j tells it to print the day of the year, i.e., the number of days since December 31st of the previous year.
We can use +%j to print a little holiday anticipation message:
print "Only $(( (365-$(date +%j)) / 7 )) weeks until the New Year!"
We'll show where this fits in the overall scheme of command-line processing in Chapter 7.
The arithmetic expression feature is built in to the Korn shell's syntax, and it was available in the Bourne shell (most versions) only through the external command expr(1). Thus it is yet another example of a desirable feature provided by an external command (i.e., a syntactic kludge) being better integrated into the shell. [[...]] and getopts are also examples of this design trend.
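To make the contrast concrete, here is a minimal sketch of the same multiplication done both ways. The variable names old_style and new_style are made up for the illustration:

```shell
# Bourne shell style: expr is an external command, so each operand is a
# separate argument and the asterisk must be escaped from the shell.
old_style=$(expr 6 \* 7)

# Korn shell style: the expression is part of the shell's own syntax,
# so no escaping is needed and no extra process is started.
new_style=$(( 6 * 7 ))

printf '%s %s\n' "$old_style" "$new_style"
```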
While expr and ksh88 were limited to integer arithmetic, ksh93 supports floating-point arithmetic. As we'll see shortly, you can do just about any calculation in the Korn shell that you could do in C or most other programming languages.
Korn shell arithmetic operators are equivalent to their counterparts in the C language. Precedence and associativity are the same as in C. (More details on the Korn shell's compatibility with the C language may be found in Appendix B; said details are of interest mostly to people already familiar with C.) Table 6-2 shows the arithmetic operators that are supported, in order from highest precedence to lowest. Although some of these are (or contain) special characters, there is no need to backslash-escape them, because they are within the $((...)) syntax.
Operator | Meaning | Associativity |
---|---|---|
++ -- | Increment and decrement, prefix and postfix | Left to right |
+ - ! ~ | Unary plus and minus; logical and bitwise negation | Right to left |
** | Exponentiation[84] | Right to left |
* / % | Multiplication, division, and remainder | Left to right |
+ - | Addition and subtraction | Left to right |
<< >> | Bit-shift left and right | Left to right |
< <= > >= | Comparisons | Left to right |
== != | Equal and not equal | Left to right |
& | Bitwise and | Left to right |
^ | Bitwise exclusive-or | Left to right |
| | Bitwise or | Left to right |
&& | Logical and (short circuit) | Left to right |
|| | Logical or (short circuit) | Left to right |
?: | Conditional expression | Right to left |
= += -= *= /= %= &= ^= <<= >>= | Assignment operators | Right to left |
, | Sequential evaluation | Left to right |
[84] ksh93m and newer. The ** operator is not in the C language.
Parentheses can be used to group subexpressions. As in C, relational and logical operators in an arithmetic expression produce "truth values": 1 for true and 0 for false.
For example, $((3 > 2)) has the value 1; $(( (3 > 2) || (4 <= 1) )) also has the value 1, since at least one of the two subexpressions is true.
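A few more cases, as a sketch of how precedence, grouping, and truth values combine (the single-letter variable names are arbitrary):

```shell
# Multiplication binds more tightly than addition, as in C.
a=$(( 2 + 3 * 5 ))
b=$(( (2 + 3) * 5 ))

# Relational and logical operators yield 1 (true) or 0 (false).
c=$(( 3 > 2 ))
d=$(( (3 > 2) && (4 <= 1) ))

# The conditional operator selects one of two values.
e=$(( 7 > 4 ? 7 : 4 ))

printf '%s %s %s %s %s\n' "$a" "$b" "$c" "$d" "$e"
```

This prints 17 25 1 0 7.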
If you're familiar with C, C++ or Java, the operators listed in Table 6-2 will be familiar. But if you're not, some of them warrant a little explanation.
The assignment forms of the regular operators are a convenient shorthand for the more conventional way of updating a variable. For example, in Pascal or Fortran you might write x = x + 2 to add 2 to x. The += lets you do that more compactly: $((x += 2)) adds 2 to x and stores the result back in x. (Compare this to the recent addition of the += operator to ksh93 for string concatenation.)
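The other assignment operators follow the same pattern. A short sketch, with illustrative variable names, that also shows each expression yielding the value it stored:

```shell
x=10

first=$(( x += 2 ))     # x becomes 12; the expression also yields 12
second=$(( x *= 3 ))    # x becomes 36
third=$(( x %= 10 ))    # x becomes 6 (remainder of 36 / 10)

printf '%s %s %s %s\n' "$first" "$second" "$third" "$x"
```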
Since adding and subtracting 1 are such frequent operations, the ++ and -- operators provide an even more abbreviated way to do them. As you might guess, ++ adds 1, while -- subtracts 1. These are unary operators. Let's take a quick look at how they work.
    $ i=5
    $ print $((i++)) $i
    5 6
    $ print $((++i)) $i
    7 7
What's going on here? In both cases, the value of i is increased by one. But the value returned by the operator depends upon its placement relative to the variable being operated upon. A postfix operator (one that occurs after the variable) returns the variable's old value as the result of the expression and then increments the variable. By contrast, a prefix operator, which comes in front of the variable, increments the variable first and then returns the new value. The -- operator works the same as ++, but it decrements the variable by one, instead of incrementing it.
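The same experiment with --, written as a small script rather than an interactive session (the variable names are illustrative):

```shell
i=5

post=$(( i-- ))     # yields the old value 5, then i becomes 4
after_post=$i

pre=$(( --i ))      # i becomes 3 first, then the expression yields 3
after_pre=$i

printf '%s %s %s %s\n' "$post" "$after_post" "$pre" "$after_pre"
```

This prints 5 4 3 3.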
The shell also supports base N numbers, where N can be up to 64. The notation B#N means "N base B." Of course, if you omit the B#, the base defaults to 10. The digits are 0-9, a-z (10-35), A-Z (36-61), @ (62), and _ (63). (When the base is less than or equal to 36, you may use mixed case letters.) For example:
    $ print the ksh number 43#G is $((43#G))
    the ksh number 43#G is 42
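More familiar bases work the same way. The following sketch writes the number 42 in several bases; the variable names are made up for the example:

```shell
binary=$(( 2#101010 ))     # 32 + 8 + 2
octal=$(( 8#52 ))          # 5*8 + 2
hex=$(( 16#2a ))           # 2*16 + 10
hex_upper=$(( 16#2A ))     # case does not matter for bases up to 36
base64=$(( 64#G ))         # in base 64, G is the digit with value 42

printf '%s %s %s %s %s\n' "$binary" "$octal" "$hex" "$hex_upper" "$base64"
```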
Interestingly enough, you can use shell variables to contain subexpressions, and the shell substitutes the value of the variable when doing arithmetic. For example:
    $ almost_the_answer=10+20
    $ print $almost_the_answer
    10+20
    $ print $(( almost_the_answer + 12 ))
    42
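One consequence worth seeing is that naming the variable and expanding it with a dollar sign are not always interchangeable when it holds a subexpression. The sketch below assumes the usual behavior in which a variable referenced by name is evaluated as its own expression, while a dollar-sign expansion is substituted as text before the arithmetic is parsed:

```shell
almost_the_answer=10+20

# Referenced by name: the value is evaluated first, giving (10+20) * 2.
by_name=$(( almost_the_answer * 2 ))

# Expanded with a dollar sign: the text is pasted in, giving 10+20 * 2,
# and normal precedence applies.
by_dollar=$(( $almost_the_answer * 2 ))

printf '%s %s\n' "$by_name" "$by_dollar"
```

Under that assumption the two results are 60 and 50, so parenthesize a stored subexpression if it will be expanded with a dollar sign.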
The shell provides a number of built-in arithmetic and trigonometric functions for use with $((...)). They are called using C function call syntax. The trigonometric functions expect arguments to be in radians, not in degrees. (There are 2π radians in a circle.) For example, remembering way back to high-school days, recall that 45 degrees is π divided by 4. Let's say we need the cosine of 45 degrees:
    $ pi=3.1415927                    # Approximate value for pi
    $ print the cosine of pi / 4 is $(( cos(pi / 4) ))
    the cosine of pi / 4 is 0.707106772982
A better approximation of π may be obtained using the atan function:
    pi=$(( 4. * atan(1.) ))           # A better value for pi
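For readers who want the arithmetic behind these two examples spelled out, the relationships are summarized below. This is standard trigonometry rather than anything specific to the shell:

```latex
\[
\theta_{\text{rad}} = \theta_{\text{deg}} \cdot \frac{\pi}{180},
\qquad
45^\circ = \frac{\pi}{4},
\qquad
\cos\frac{\pi}{4} = \frac{\sqrt{2}}{2} \approx 0.7071
\]
\[
\tan\frac{\pi}{4} = 1
\;\Longrightarrow\;
\arctan(1) = \frac{\pi}{4}
\;\Longrightarrow\;
\pi = 4 \arctan(1)
\]
```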
Table 6-3 lists the built-in arithmetic functions.
Function | Returns | Function | Returns |
---|---|---|---|
abs | Absolute value | hypot[85] | Euclidean distance |
acos | Arc cosine | int | Integer part |
asin | Arc sine | log | Natural logarithm |
atan | Arc tangent | pow[85] | Exponentiation (x raised to the power y) |
atan2[85] | Arc tangent of two variables | sin | Sine |
cos | Cosine | sinh | Hyperbolic sine |
cosh | Hyperbolic cosine | sqrt | Square root |
exp | Exponential (e raised to the power x) | tan | Tangent |
fmod[85] | Floating-point remainder | tanh | Hyperbolic tangent |
[85] Added in ksh93e.
Another construct, closely related to $(( ...)), is ((...)) (without the leading dollar sign). We use this for evaluating arithmetic condition tests, just as [[...]] is used for string, file attribute, and other types of tests.
((...)) is almost identical to $((...)). However, it was designed for use in if and while constructs. Instead of producing a textual result, it just sets its exit status according to the truth of the expression: 0 if true, 1 otherwise. So, for example, ((3 > 2)) produces exit status 0, as does (( (3 > 2) || (4 <= 1) )), but (( (3 > 2) && (4 <= 1) )) has exit status 1 since the second subexpression isn't true.
You can also use numerical values for truth values within this construct. It's like the analogous concept in C: a value of 0 means false (i.e., returns exit status 1), and a non-zero value means true (returns exit status 0), e.g., (( 14 )) is true. See the code for the kshdb debugger in Chapter 9 for more examples of this.
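As a minimal sketch of both uses, the loop below sums the even numbers from 1 through 10 and then tests the total as a bare numeric value. The variable names are illustrative:

```shell
n=1
total=0

# (( ... )) as a loop condition and as an if condition.
while (( n <= 10 )); do
    if (( n % 2 == 0 )); then
        total=$(( total + n ))
    fi
    n=$(( n + 1 ))
done

# A bare numeric value also works as a test: non-zero is true.
if (( total )); then
    verdict="non-zero"
else
    verdict="zero"
fi

printf '%s %s\n' "$total" "$verdict"
```

This prints 30 non-zero.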
The ((...)) construct can also be used to define numeric variables and assign values to them. The statement:
(( var=expression ))
creates the numeric variable var (if it doesn't already exist) and assigns to it the result of expression.
The double-parentheses syntax is what's recommended. However, if you prefer to use a command for doing arithmetic, the shell provides one: the built-in command let. The syntax is:
let var=expression
It is not necessary (because it's actually redundant) to surround the expression with $(( and )) in a let statement. When not using quotes, there must not be any space on either side of the equal sign (=). However, it is good practice to surround expressions with quotes, since many characters are treated as special by the shell (e.g., *, #, and parentheses); furthermore, you must quote expressions that include whitespace (spaces or TABs). See Table 6-4 for examples. Once you have quotes, you can use spaces:
let "x = 3.1415927" "y = 1.41421"
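The different spellings can be compared side by side. This sketch sticks to integer values and uses arbitrary single-letter variable names:

```shell
# Unquoted: no spaces are allowed around the equal sign.
let a=40+2

# Quoted: spaces and special characters such as * are safe.
let "b = 6 * 7"

# The equivalent double-parentheses form needs no quotes at all.
(( c = (5 + 1) * 7 ))

# let accepts several expressions, evaluated from left to right.
let "d = 2" "e = d * 21"

printf '%s %s %s %s %s\n' "$a" "$b" "$c" "$d" "$e"
```

This prints 42 42 42 2 42.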
While ksh88 only allowed you to use integer variables, ksh93 no longer has this restriction, and variables may be floating point as well. (An integer is what was called a "whole number" in school, a number that doesn't have a fractional part, such as 17 or 42. Floating-point numbers, in contrast, can have fractional parts, such as 3.1415927.) The shell looks for a decimal point in order to determine that a value is floating point. Without one, values are treated as integers. This is primarily an issue for division: integer division truncates any fractional part. The % operator requires an integer divisor.
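Integer truncation is easy to trip over when scaling values, so here is a short sketch of the effect. Only integer operands are used, and the variable names are made up for the example:

```shell
quotient=$(( 17 / 3 ))       # fractional part is discarded
remainder=$(( 17 % 3 ))      # what the division left over

# Order matters with integers: dividing first throws the fraction away.
scaled_late=$(( 2 / 3 * 100 ))
scaled_early=$(( 100 * 2 / 3 ))

printf '%s %s %s %s\n' "$quotient" "$remainder" "$scaled_late" "$scaled_early"
```

This prints 5 2 0 66. Writing one operand with a decimal point, as in 17 / 3.0, is what asks ksh93 for a floating-point result instead.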
The shell provides two built-in aliases for declaring numeric variables: integer for integer variables and float for floating point variables. (These are both aliases for the typeset command with different options. More details are provided in Section 6.5.3, later in this chapter.)
Finally, all assignments to both integer and floating-point variables are automatically evaluated as arithmetic expressions. This means that you don't need to use the let command:
    $ integer the_answer
    $ the_answer=12+30
    $ print $the_answer
    42
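The difference the declaration makes can be sketched by assigning the same text to a declared and an undeclared variable. This example spells the declaration as typeset -i, on the assumption that it gives the variable the integer attribute just as the integer alias does; the name plain is illustrative:

```shell
typeset -i the_answer
the_answer=12+30        # integer attribute: the text is evaluated

plain=12+30             # no attribute: the text is stored as is

printf '%s %s\n' "$the_answer" "$plain"
```

This prints 42 12+30.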
Assignment | Value |
---|---|
let x= | $x |
x=1+4 | 5 |
'x = 1 + 4' | 5 |
'x = 1.234 + 3' | 4.234 |
'x = (2+3) * 5' | 25 |
'x = 2 + 3 * 5' | 17 |
'x = 17 / 3' | 5 |
'x = 17 / 3.0' | 5.66666666667 |
'17 % 3' | 2 |
'1 << 4' | 16 |
'48 >> 3' | 6 |
'17 & 3' | 1 |
'17 | 3' | 19 |
'17 ^ 3' | 18 |
Task 6-1 is a small task that makes use of arithmetic.
We'll make our option -N, a la head. The syntax for this single option is so simple that we need not bother with getopts. Here is the code:
    if [[ $1 == -+([0-9]) ]]; then
        (( page_lines = ${1#-} ))
        shift
    else
        (( page_lines = 66 ))
    fi
    let file_lines="$(wc -l < $1)"

    (( pages = file_lines / page_lines ))
    if (( file_lines % page_lines > 0 )); then
        (( pages++ ))
    fi

    print "$1 has $pages pages of text."
Note that we use the arithmetical conditional (( file_lines % page_lines > 0 )) rather than the [[...]] form.
At the heart of this code is the Unix utility wc(1), which counts the number of lines, words, and characters (bytes) in its input. By default, its output looks something like this:
8 34 161 bob
wc's output means that the file bob has 8 lines, 34 words, and 161 characters. wc recognizes the options -l, -w, and -c, which tell it to print only the number of lines, words, or characters, respectively.
wc normally prints the name of its input file (given as argument). Since we want only the number of lines, we have to do two things. First, we give it input from file redirection instead, as in wc -l < bob instead of wc -l bob. This produces the number of lines preceded by one or more spaces.
Unfortunately, that space complicates matters: the statement let file_lines=$(wc -l < $1) becomes let file_lines= N after command substitution; the space after the equal sign is an error. That leads to the second modification, the quotes around the command substitution expression. The statement let file_lines=" N" is perfectly legal, and let knows how to remove the leading space.
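The quoting trick can be tried on a throwaway file. This is a sketch only: the temporary file, the variable names, and the use of mktemp (which is widely available but not guaranteed everywhere) are assumptions made for the demonstration:

```shell
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"

# Depending on the system, wc may pad its output with leading spaces.
raw=$(wc -l < "$tmpfile")

# Quoted, the padding is harmless: let skips the leading whitespace.
let file_lines="$raw"

rm -f "$tmpfile"
printf '%s\n' "$file_lines"
```

Whether or not the local wc pads its output, file_lines ends up holding the plain number 3.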
The first if clause in the pages script checks to see if the first command line argument is an option. If so, it strips the dash (-) off and assigns the result to the variable page_lines. wc in the command substitution expression returns the number of lines in the file whose name is given as argument.
The next group of lines calculates the number of pages and, if there is a remainder after the division, adds 1. Finally, the appropriate message is printed.
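The divide-then-adjust calculation can also be folded into a single expression by adding page_lines - 1 before dividing. The following is only a sketch of that alternative; the function name page_count and its two arguments are hypothetical and not part of the pages script:

```shell
function page_count {
    # $1 = number of lines in the file, $2 = lines per page
    typeset lines=$1 per_page=$2

    # Adding per_page - 1 rounds any partial page up to a whole one.
    printf '%s\n' $(( (lines + per_page - 1) / per_page ))
}

page_count 66 66     # exactly one full page
page_count 67 66     # one line spills onto a second page
page_count 8 66      # a short file still needs one page
```

The three calls print 1, 2, and 1. The two-step version in the script is arguably easier to read, which may matter more than saving a line.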
As a bigger example of arithmetic, we now complete our version of the C shell's pushd and popd functions (Task 4-7). Remember that these functions operate on DIRSTACK, a stack of directories represented as a string with the directory names separated by spaces. The C shell's pushd and popd take additional types of arguments:
pushd +n takes the nth directory in the stack (starting with 0), rotates it to the top, and cds to it.
pushd without arguments doesn't complain; instead, it swaps the two top directories on the stack and cds to the new top.
popd +n takes the nth directory in the stack and just deletes it.
The most useful of these features is the ability to get at the nth directory in the stack. Here are the latest versions of both functions:
    function pushd {    # push current directory onto stack
        dirname=$1
        if [[ -d $dirname && -x $dirname ]]; then
            cd $dirname
            DIRSTACK="$dirname $DIRSTACK"
            print "$DIRSTACK"
        else
            print "still in $PWD."
            return 1
        fi
    }

    function popd {     # pop directory off the stack, cd there
        if [[ -n $DIRSTACK ]]; then
            top=${DIRSTACK%% *}
            DIRSTACK=${DIRSTACK#* }
            cd $top
            print "$PWD"
        else
            print "stack empty, still in $PWD."
            return 1
        fi
    }
To get at the nth directory, we use a while loop that transfers the top directory to a temporary copy of the stack n times. We'll put the loop into a function called getNdirs that looks like this:
    function getNdirs {
        stackfront=''
        let count=0
        while (( count < $1 )); do
            stackfront="$stackfront ${DIRSTACK%% *}"
            DIRSTACK=${DIRSTACK#* }
            let count++
        done
    }
The argument passed to getNdirs is the n in question. The variable stackfront is the temporary copy that contains the first n directories when the loop is done. stackfront starts as null; count, which counts the number of loop iterations, starts as 0.
The first line of the loop body appends the top of the stack (${DIRSTACK%% *}) to stackfront; the second line deletes the top from the stack. The last line increments the counter for the next iteration. The entire loop executes n times, for values of count from 0 to n-1.
When the loop finishes, the last directory in $stackfront is the nth directory. The expression ${stackfront##* } extracts this directory. Furthermore, DIRSTACK now contains the "back" of the stack, i.e., the stack without the first n directories. With this in mind, we can now write the code for the improved versions of pushd and popd:
    function pushd {
        if [[ $1 == ++([0-9]) ]]; then
            # case of pushd +n: rotate n-th directory to top
            num=${1#+}
            getNdirs $num

            newtop=${stackfront##* }
            stackfront=${stackfront%$newtop}

            DIRSTACK="$newtop $stackfront $DIRSTACK"
            cd $newtop

        elif [[ -z $1 ]]; then
            # case of pushd without args; swap top two directories
            firstdir=${DIRSTACK%% *}
            DIRSTACK=${DIRSTACK#* }
            seconddir=${DIRSTACK%% *}
            DIRSTACK=${DIRSTACK#* }
            DIRSTACK="$seconddir $firstdir $DIRSTACK"
            cd $seconddir

        else
            # normal case of pushd dirname
            dirname=$1
            if [[ -d $dirname && -x $dirname ]]; then
                cd $dirname
                DIRSTACK="$dirname $DIRSTACK"
                print "$DIRSTACK"
            else
                print still in "$PWD."
                return 1
            fi
        fi
    }

    function popd {     # pop directory off the stack, cd to new top
        if [[ $1 == ++([0-9]) ]]; then
            # case of popd +n: delete n-th directory from stack
            num=${1#+}
            getNdirs $num
            stackfront=${stackfront% *}
            DIRSTACK="$stackfront $DIRSTACK"

        else
            # normal case of popd without argument
            if [[ -n $DIRSTACK ]]; then
                top=${DIRSTACK%% *}
                DIRSTACK=${DIRSTACK#* }
                cd $top
                print "$PWD"
            else
                print "stack empty, still in $PWD."
                return 1
            fi
        fi
    }
These functions have grown rather large; let's look at them in turn. The if at the beginning of pushd checks if the first argument is an option of the form +N. If so, the first block of code is run. The first statement simply strips the plus sign (+) from the argument and assigns the result -- as an integer -- to the variable num. This, in turn, is passed to the getNdirs function.
The next two assignment statements set newtop to the nth directory -- i.e., the last directory in $stackfront -- and delete that directory from stackfront. The final two lines in this part of pushd put the stack back together again in the appropriate order and cd to the new top directory.
The elif clause tests for no argument, in which case pushd should swap the top two directories on the stack. The first four lines of this clause assign the top two directories to firstdir and seconddir and delete these from the stack. Then, as above, the code puts the stack back together in the new order and cds to the new top directory.
The else clause corresponds to the usual case, where the user supplies a directory name as argument.
popd works similarly. The if clause checks for the +N option, which in this case means delete the Nth directory. num receives the integer count; the getNdirs function puts the first N directories into stackfront. Then the line stackfront=${stackfront% *} deletes the last directory (the Nth directory) from stackfront. Finally, the stack is put back together with the Nth directory missing.
The else clause covers the usual case, where the user doesn't supply an argument.
Before we leave this subject, here are a few exercises that should test your understanding of this code:
Add code to pushd that exits with an error message if the user supplies no argument and the stack contains fewer than two directories.
Verify that when the user specifies +N and N exceeds the number of directories in the stack, both pushd and popd use the last directory as the Nth directory.
Modify the getNdirs function so that it checks for the above condition and exits with an appropriate error message if true.
Change getNdirs so that it uses cut (with command substitution), instead of the while loop, to extract the first N directories. This uses less code but runs more slowly because of the extra processes generated.
Copyright © 2003 O'Reilly & Associates. All rights reserved.