Command-Line Processing
We've seen how the shell uses read to process input lines: it deals with single quotes (' '), double quotes (" "), and backslashes (\); it separates lines into words, according to delimiters in the environment variable IFS; and it assigns the words to shell variables. We can think of this process as a subset of the things the shell does when processing command lines.
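As a quick reminder of that behavior, here is a minimal sketch of read splitting a line on IFS and honoring a backslash. The sample lines and the variable names (user, shell_path, first, rest) are made up for illustration.

```shell
# Fields are split on ':' because IFS is set for this one read only.
IFS=: read user shell_path <<'EOF'
alice:/bin/bash
EOF

# With the default IFS, a backslash keeps the following space inside a word.
read first rest <<'EOF'
two\ words third
EOF

printf '%s|%s\n' "$user" "$shell_path"   # prints alice|/bin/bash
printf '%s|%s\n' "$first" "$rest"        # prints two words|third
```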
We've touched upon command-line processing throughout this book; now is a good time to make the whole thing explicit. Each line that the shell reads from the standard input or a script is called a pipeline; it contains one or more commands separated by zero or more pipe characters (|). For each pipeline it reads, the shell breaks it up into commands, sets up the I/O for the pipeline, then does the following for each command (Figure 7-1):
Figure 7-1. Steps in command-line processing
1. Splits the command into tokens that are separated by the fixed set of metacharacters: SPACE, TAB, NEWLINE, ;, (, ), <, >, |, and &. Types of tokens include words, keywords, I/O redirectors, and semicolons.
2. Checks the first token of each command to see if it is a keyword with no quotes or backslashes. If it's an opening keyword, such as if and other control-structure openers, function, {, or (, then the command is actually a compound command. The shell sets things up internally for the compound command, reads the next command, and starts the process again. If the keyword isn't a compound command opener (e.g., is a control-structure "middle" like then, else, or do, an "end" like fi or done, or a logical operator), the shell signals a syntax error.
3. Checks the first word of each command against the list of aliases. If a match is found, it substitutes the alias's definition and goes back to Step 1; otherwise, it goes on to Step 4. This scheme allows recursive aliases (see Chapter 3). It also allows aliases for keywords to be defined, e.g., alias aslongas=while or alias procedure=function.
4. Performs brace expansion. For example, a{b,c} becomes ab ac.
5. Substitutes the user's home directory ($HOME) for tilde if it is at the beginning of a word. Substitutes user's home directory for ~user.
6. Performs parameter (variable) substitution for any expression that starts with a dollar sign ($).
7. Does command substitution for any expression of the form $(string).
8. Evaluates arithmetic expressions of the form $((string)).
9. Takes the parts of the line that resulted from parameter, command, and arithmetic substitution and splits them into words again. This time it uses the characters in $IFS as delimiters instead of the set of metacharacters in Step 1.
10. Performs pathname expansion, a.k.a. wildcard expansion, for any occurrences of *, ?, and [ ] pairs.
11. Uses the first word as a command by looking up its source according to the rest of the list in Chapter 4, i.e., as a function command, then as a built-in, then as a file in any of the directories in $PATH.
12. Runs the command after setting up I/O redirection and other such things.
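The order of these steps has visible consequences. The sketch below, which uses made-up variable names (dir, names, and the three result variables), shows two of them: tilde expansion (Step 5) has already happened by the time a variable is substituted (Step 6), and the result of a substitution is split into words again using $IFS (Step 9). It assumes IFS has its default value.

```shell
# Step 5 runs before Step 6, so a tilde that arrives through a
# variable is never expanded to a home directory.
dir='~'
set -- $dir
tilde_result=$1          # still the literal character ~

# Step 9 splits the substituted text into words using $IFS.
names='one two three'
set -- $names
split_count=$#           # three separate words

# Double-quoting the substitution suppresses that splitting.
set -- "$names"
quoted_count=$#          # a single word

echo "$tilde_result $split_count $quoted_count"   # prints ~ 3 1
```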
That's a lot of steps—and it's not even the whole story! But before we go on, an example should make this process clearer. Assume that the following command has been run:
alias ll="ls -l"
Further assume that a file exists called .hist537 in user alice's home directory, which is /home/alice, and that there is a double-dollar-sign variable $$ whose value is 2537 (we'll see what this special variable is in the next chapter).
Now let's see how the shell processes the following command:
ll $(type -path cc) ~alice/.*$(($$%1000))
Here is what happens to this line:
1. Splits the input into words:
    ll $(type -path cc) ~alice/.*$(($$%1000))
2. ll is not a keyword, so Step 2 does nothing.
3. Substitutes ls -l for its alias "ll". The shell then repeats Steps 1 through 3; Step 1 splits the ls -l into two words:
    ls -l $(type -path cc) ~alice/.*$(($$%1000))
4. Does nothing, since there are no braces to expand:
    ls -l $(type -path cc) ~alice/.*$(($$%1000))
5. Expands ~alice into /home/alice:
    ls -l $(type -path cc) /home/alice/.*$(($$%1000))
6. Substitutes 2537 for $$:
    ls -l $(type -path cc) /home/alice/.*$((2537%1000))
7. Does command substitution on "type -path cc":
    ls -l /usr/bin/cc /home/alice/.*$((2537%1000))
8. Evaluates the arithmetic expression 2537%1000:
    ls -l /usr/bin/cc /home/alice/.*537
9. Does nothing, since the substituted text contains no $IFS characters to split on:
    ls -l /usr/bin/cc /home/alice/.*537
10. Substitutes the filename for the wildcard expression .*537:
    ls -l /usr/bin/cc /home/alice/.hist537
11. The command ls is found in /usr/bin.
12. /usr/bin/ls is run with the option -l and the two arguments.
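The last argument in this walkthrough can be tried out without a user named alice. The following is a rough sketch under stated assumptions: a scratch directory created with mktemp stands in for /home/alice, the variable pid stands in for $$ (whose real value changes from run to run), and the extra file .hist999 is added only to show that the wildcard is selective.

```shell
# Scratch directory standing in for /home/alice (hypothetical setup).
home_dir=$(mktemp -d)
touch "$home_dir/.hist537" "$home_dir/.hist999"

pid=2537    # stand-in for $$

# Parameter and arithmetic substitution produce the text ".*537";
# pathname expansion then matches that pattern against real files.
set -- "$home_dir"/.*$((pid % 1000))
match=${1##*/}           # strip the directory part of the first match
echo "$match"            # prints .hist537

rm -r "$home_dir"
```

Only .hist537 matches, because the wildcard is expanded after the arithmetic expression has already been reduced to 537.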
Although this list of steps is fairly straightforward, it is not the whole story. There are still five ways to modify the process: quoting; using command, builtin, or enable; and using the advanced command eval.
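As a small preview of three of these (quoting, command, and eval), here is a hedged sketch. The variable names and the pwd wrapper function are invented for illustration, and the example relies only on the standard echo, tr, and pwd commands.

```shell
# Single quotes keep every character literal: no substitution happens.
literal='$((1+2)) *'

# Double quotes still allow $ substitutions, but the result is not
# split into words again or treated as a wildcard pattern.
computed="$((1+2)) *"

# A pipe character that arrives through a variable is just an argument...
pipeline='echo one | tr a-z A-Z'
plain=$($pipeline)

# ...unless eval sends the text through command-line processing again.
again=$(eval "$pipeline")

# command skips function lookup, so the real pwd runs instead of
# this hypothetical wrapper.
pwd() { echo "wrapped"; }
via_function=$(pwd)
via_command=$(command pwd)
unset -f pwd

printf '%s\n' "$literal" "$computed" "$plain" "$again" "$via_function"
```

Here plain holds the text one | tr a-z A-Z because the | was never seen as a metacharacter, while again holds ONE because eval made the shell tokenize the string from scratch.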