Command-Line Processing

We've seen how the shell uses read to process input lines: it deals with single quotes (''), double quotes (""), and backslashes (\); it separates lines into words, according to delimiters in the environment variable IFS; and it assigns the words to shell variables. We can think of this process as a subset of the things the shell does when processing command lines.
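As a quick reminder of that read behavior, here is a minimal sketch. The record and the variable names (record, name, placeholder, home_dir) are made up for illustration, and the -r option is used so that backslashes are kept literal rather than processed.

```shell
# Hypothetical colon-separated record, loosely in the style of /etc/passwd.
record='alice:x:/home/alice'

# IFS is changed for this one read only; -r keeps backslashes literal.
IFS=: read -r name placeholder home_dir <<EOF
$record
EOF

echo "name=$name home=$home_dir"
```

Because the IFS assignment is attached to the read command itself, the shell's normal word splitting is unchanged for the rest of the script.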

We've touched upon command-line processing throughout this book; now is a good time to make the whole thing explicit. Each line that the shell reads from the standard input or a script is called a pipeline; it contains one or more commands separated by zero or more pipe characters (|). For each pipeline it reads, the shell breaks it up into commands, sets up the I/O for the pipeline, then does the following for each command (Figure 7-1):

Figure 7-1. Steps in command-line processing

 

 
  1. Splits the command into tokens that are separated by the fixed set of metacharacters: SPACE, TAB, NEWLINE, ;, (, ), <, >, |, and &. Types of tokens include words, keywords, I/O redirectors, and semicolons.
  2. Checks the first token of each command to see if it is a keyword with no quotes or backslashes. If it's an opening keyword, such as if and other control-structure openers, function, {, or (, then the command is actually a compound command. The shell sets things up internally for the compound command, reads the next command, and starts the process again. If the keyword isn't a compound command opener (e.g., is a control-structure "middle" like then, else, or do, an "end" like fi or done, or a logical operator), the shell signals a syntax error.
  3. Checks the first word of each command against the list of aliases. If a match is found, it substitutes the alias's definition and goes back to Step 1; otherwise, it goes on to Step 4. This scheme allows recursive aliases (see Chapter 3). It also allows aliases for keywords to be defined, e.g., alias aslongas=while or alias procedure=function.
  4. Performs brace expansion. For example, a{b,c} becomes ab ac.
  5. Substitutes the user's home directory ($HOME) for tilde if it is at the beginning of a word. Substitutes the named user's home directory for ~user.[7]
  6. Performs parameter (variable) substitution for any expression that starts with a dollar sign ($).
  7. Does command substitution for any expression of the form $(string).
  8. Evaluates arithmetic expressions of the form $((string)).
  9. Takes the parts of the line that resulted from parameter, command, and arithmetic substitution and splits them into words again. This time it uses the characters in $IFS as delimiters instead of the set of metacharacters in Step 1.
  10. Performs pathname expansion, a.k.a. wildcard expansion, for any occurrences of *, ?, and [ ] pairs.
  11. Uses the first word as a command by looking up its source according to the rest of the list in Chapter 4, i.e., as a function command, then as a built-in, then as a file in any of the directories in $PATH.
  12. Runs the command after setting up I/O redirection and other such things.
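The ordering of these steps has visible consequences. The following sketch, which assumes bash, probes two of them: brace expansion (Step 4) happens before parameter substitution (Step 6), and the re-splitting in Step 9 uses $IFS while the tokenizing in Step 1 does not. The helper count_args and the variable names are hypothetical, introduced only for this demonstration.

```shell
# Helper (hypothetical name): report how many arguments it received.
count_args() { echo "$#"; }

n=3
# Step 4 runs first and sees the text "$n", not 3, so no sequence is generated;
# Step 6 then substitutes the value, leaving the braces in place.
brace_result=$(echo {1..$n})
echo "$brace_result"

old_ifs=$IFS
IFS=:
list='a:b c:d'
split_count=$(count_args $list)  # Step 9 splits the expansion on ":" only
token_count=$(count_args x y)    # Step 1 still tokenizes the typed line on the space
IFS=$old_ifs

echo "$split_count $token_count"
```

With IFS set to a colon, the expansion of $list yields three words (a, "b c", and d), yet the literal x y on the command line is still two tokens, because Step 1 uses the fixed metacharacter set.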

That's a lot of steps—and it's not even the whole story! But before we go on, an example should make this process clearer. Assume that the following command has been run:

alias ll="ls -l"

Further assume that a file exists called .hist537 in user alice's home directory, which is /home/alice, and that there is a double-dollar-sign variable $$ whose value is 2537 (we'll see what this special variable is in the next chapter).

Now let's see how the shell processes the following command:

ll $(type -path cc) ~alice/.*$(($$%1000))

Here is what happens to this line:

 

 
  1. ll $(type -path cc) ~alice/.*$(($$%1000)) splits the input into words.
  2. ll is not a keyword, so Step 2 does nothing.
  3. ls -l $(type -path cc) ~alice/.*$(($$%1000)) substitutes ls -l for its alias "ll". The shell then repeats Steps 1 through 3; Step 1 splits the ls -l into two words.
  4. ls -l $(type -path cc) ~alice/.*$(($$%1000)) does nothing.
  5. ls -l $(type -path cc) /home/alice/.*$(($$%1000)) expands ~alice into /home/alice.
  6. ls -l $(type -path cc) /home/alice/.*$((2537%1000)) substitutes 2537 for $$.
  7. ls -l /usr/bin/cc /home/alice/.*$((2537%1000)) does command substitution on "type -path cc".
  8. ls -l /usr/bin/cc /home/alice/.*537 evaluates the arithmetic expression 2537%1000.
  9. ls -l /usr/bin/cc /home/alice/.*537 does nothing.
  10. ls -l /usr/bin/cc /home/alice/.hist537 substitutes the filename for the wildcard expression .*537.
  11. The command ls is found in /usr/bin.
  12. /usr/bin/ls is run with the option -l and the two arguments.
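Steps 6 through 10 of this walk-through can be approximated on any system. In the sketch below, a scratch directory created with mktemp stands in for /home/alice and an ordinary variable pid stands in for $$; both are assumptions made so that the example does not depend on a user named alice or on a particular process ID. The extra file .hist999 is there only to show that the wildcard does not match it.

```shell
# Scratch directory stands in for /home/alice; pid stands in for $$ (both assumptions).
dir=$(mktemp -d)
touch "$dir/.hist537" "$dir/.hist999"
pid=2537

# Parameter substitution, then arithmetic evaluation, then pathname expansion:
# .*$((pid % 1000)) becomes .*537, which then matches .hist537 only.
set -- "$dir"/.*$((pid % 1000))
matched=$1
match_count=$#
echo "${matched##*/}"

rm -r "$dir"
```

The wildcard can match only after the arithmetic result has been substituted, which is why pathname expansion comes so late in the list.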

Although this list of steps is fairly straightforward, it is not the whole story. There are still five ways to modify the process: quoting; using command, builtin, or enable; and using the advanced command eval.
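As a brief preview of some of these, the sketch below assumes bash and shows quoting, command, builtin, and eval each changing what the steps above would otherwise do. The function that shadows echo and the variable names are hypothetical and exist only for this demonstration; enable is left out because it changes the shell's built-in table.

```shell
# Quoting: double quotes keep the value from being split or wildcard-expanded.
star='*'
quoted=$(echo "$star")

# A function that shadows a built-in (defined only for this demo).
echo() { printf 'shadowed: %s\n' "$*"; }
via_function=$(echo hi)          # Step 11 finds the function first
via_command=$(command echo hi)   # command skips the function lookup
via_builtin=$(builtin echo hi)   # builtin looks only at built-ins
unset -f echo

# eval: the | in a variable's value arrives after Step 1, so it is just a word
# unless the line is processed a second time.
pipeline='echo abc | tr a-c A-C'
without_eval=$($pipeline)
with_eval=$(eval "$pipeline")

echo "$quoted"
echo "$via_function / $via_command / $via_builtin"
echo "$without_eval"
echo "$with_eval"
```

Without eval, echo simply receives |, tr, and the rest as ordinary arguments and prints them; with eval, the resulting text goes through all the steps again and the pipe is recognized.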