read
The other half of the shell's string I/O facilities is the read command, which allows you to read values into shell variables. The basic syntax is:
read var1 var2...
This statement takes a line from the standard input and breaks it down into words delimited by any of the characters in the value of the environment variable IFS (see Chapter 4; these are usually a space, a TAB, and NEWLINE). The words are assigned to variables var1, var2, etc. For example:
$ read character1 character2
alice duchess
$ echo $character1
alice
$ echo $character2
duchess
If there are more words than variables, then excess words are assigned to the last variable. If you omit the variables altogether, the entire line of input is assigned to the variable REPLY.
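Both rules are easy to check. In this sketch the input is fed from here-documents rather than the keyboard so that it can run unattended; the words themselves are arbitrary:

```shell
read character1 character2 <<EOF
alice duchess dodo
EOF
echo $character2        # excess words go to the last variable: duchess dodo

read <<EOF
down the rabbit hole
EOF
echo $REPLY             # with no variables, the whole line lands in REPLY
```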
You may have identified this as the "missing ingredient" in the shell programming capabilities we have seen thus far. It resembles input statements in conventional languages, like its namesake in Pascal. So why did we wait this long to introduce it?
Actually, read is sort of an "escape hatch" from traditional shell programming philosophy, which dictates that the most important unit of data to process is a text file, and that UNIX utilities such as cut, grep, sort, etc., should be used as building blocks for writing programs.
read, on the other hand, implies line-by-line processing. You could use it to write a shell script that does what a pipeline of utilities would normally do, but such a script would inevitably look like:
while (read a line) do
process the line
print the processed line
end
This type of script is usually much slower than a pipeline; furthermore, it has the same form as a program someone might write in C (or some similar language) that does the same thing much faster. In other words, if you are going to write it in this line-by-line way, there is little point in writing a shell script.
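To make the contrast concrete, here is a hypothetical line-by-line script next to the one-line pipeline it reimplements (somefile stands in for any text file); the two produce identical output, but the loop pays for an extra process on every line:

```shell
# Line-by-line version: an extra tr process for every line of input
cat somefile | while read line; do
    echo "$line" | tr a-z A-Z
done

# Pipeline version: the same result with a single tr process
tr a-z A-Z < somefile
```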
Reading lines from files
Nevertheless, shell scripts with read are useful for certain kinds of tasks. One is when you are reading data from a file small enough so that efficiency isn't a concern (say a few hundred lines or less), and it's really necessary to get bits of input into shell variables.
Consider the case of a UNIX machine that has terminals that are hardwired to the terminal lines of the machine. It would be nice if the TERM environment variable was set to the correct terminal type when a user logged in.
One way to do this would be to have some code that sets the terminal information when a user logs in. This code would presumably reside in /etc/profile, the system-wide initialization file that bash runs before running a user's .bash_profile. If the terminals on the system change over time—as surely they must—then the code would have to be changed. It would be better to store the information in a file and change just the file instead.
Assume we put the information in a file whose format is typical of such UNIX "system configuration" files: each line contains a device name, a TAB, and a TERM value.
We'll call the file /etc/terms, and it would typically look something like this:
console console
tty01 wy60
tty03 vt100
tty04 vt100
tty07 wy85
tty08 vt100
The values on the left are terminal lines and those on the right are the terminal types that TERM can be set to. The terminals connected to this system are a Wyse 60 (wy60), three VT100s (vt100), and a Wyse 85 (wy85). The machine's master terminal is the console, which has a TERM value of console.
We can use read to get the data from this file, but first we need to know how to test for the end-of-file condition. Simple: read's exit status is 1 (i.e., non-zero) when there is nothing to read. This leads to a clean while loop:
TERM=vt100             # assume this as a default
line=$(tty)
while read dev termtype; do
    if [ $dev = $line ]; then
        TERM=$termtype
        echo "TERM set to $TERM."
        break
    fi
done
The while loop reads each line of the input into the variables dev and termtype. In each pass through the loop, the if looks for a match between $dev and the user's tty ($line, obtained by command substitution from the tty command). If a match is found, TERM is set, a message is printed, and the loop exits; otherwise TERM remains at the default setting of vt100.
We are not quite done, though: this code reads from the standard input, not from /etc/terms! We need to know how to redirect input to multiple commands. It turns out that there are a few ways of doing this.
I/O redirection and multiple commands
One way to solve the problem is with a subshell, as we'll see in the next chapter. This involves creating a separate process to do the reading. However, it is usually more efficient to do it in the same process; bash gives us four ways of doing this.
The first, which we have seen already, is with a function:
findterm () {
    TERM=vt100             # assume this as a default
    line=$(tty)
    while read dev termtype; do
        if [ $dev = $line ]; then
            TERM=$termtype
            echo "TERM set to $TERM."
            break
        fi
    done
}

findterm < /etc/terms
A function acts like a script in that it has its own set of standard I/O descriptors, which can be redirected in the line of code that calls the function. In other words, you can think of this code as if findterm were a script and you typed findterm < /etc/terms on the command line. The read statement takes input from /etc/terms a line at a time, and the function runs correctly.
The second way is to simplify this slightly by placing the redirection at the end of the function:
findterm () {
    TERM=vt100             # assume this as a default
    line=$(tty)
    while read dev termtype; do
        if [ $dev = $line ]; then
            TERM=$termtype
            echo "TERM set to $TERM."
            break
        fi
    done
} < /etc/terms
Whenever findterm is called, it takes its input from /etc/terms.
The third way is by putting the I/O redirector at the end of the loop, like this:
TERM=vt100             # assume this as a default
line=$(tty)
while read dev termtype; do
    if [ $dev = $line ]; then
        TERM=$termtype
        echo "TERM set to $TERM."
        break
    fi
done < /etc/terms
You can use this technique with any flow-control construct, including if...fi, case...esac, select...done, and until...done. This makes sense because these are all compound statements that the shell treats as single commands for these purposes. This technique works fine—the read command reads a line at a time—as long as all of the input is done within the compound statement.
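For example, an if...fi accepts the same redirector; the read inside it sees /etc/terms as its standard input and grabs just the first line:

```shell
# Report the first entry in /etc/terms; the redirector on fi
# applies to the whole if statement
if read firstdev firsttype; then
    echo "first entry: $firstdev is a $firsttype"
fi < /etc/terms
```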
Command blocks
But if you want to redirect I/O to or from an arbitrary group of commands without creating a separate process, you need to use a construct that we haven't seen yet. If you surround some code with { and }, the code will behave like a function that has no name. This is another type of compound statement. In accordance with the equivalent concept in the C language, we'll call this a command block.
What good is a block? In this case, it means that the code within the curly brackets ({}) will take standard I/O descriptors just as we described in the last block of code. This construct is appropriate for the current example because the code needs to be called only once, and the entire script is not really large enough to merit breaking down into functions. Here is how we use a block in the example:
{
    TERM=vt100             # assume this as a default
    line=$(tty)
    while read dev termtype; do
        if [ $dev = $line ]; then
            TERM=$termtype
            echo "TERM set to $TERM."
            break
        fi
    done
} < /etc/terms
To help you understand how this works, think of the curly brackets and the code inside them as if they were one command, i.e.:
{ TERM=vt100; line=$(tty); while ... } < /etc/terms;
Configuration files for system administration tasks like this one are actually fairly common; a prominent example is /etc/hosts, which lists machines that are accessible in a TCP/IP network. We can make /etc/terms more like these standard files by allowing comment lines in the file that start with #, just as in shell scripts. This way /etc/terms can look like this:
#
# System Console is console
console console
#
# Cameron's line has a Wyse 60
tty01 wy60
...
We can handle comment lines by modifying the while loop so that it ignores lines beginning with #. We can place a grep in the test:
if [ -z "$(echo $dev | grep ^#)" ] && [ $dev = $line ]; then
...
As we saw in Chapter 5, the && combines the two conditions so that both must be true for the entire condition to be true.
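The grep works, but it costs an extra process on every line. As a sketch of an alternative that stays entirely within the shell, a case pattern (covered in Chapter 5) can skip comment lines just as well:

```shell
TERM=vt100             # assume this as a default
line=$(tty)
while read dev termtype; do
    case $dev in
        \#*) continue;;        # skip comment lines
    esac
    if [ $dev = $line ]; then
        TERM=$termtype
        echo "TERM set to $TERM."
        break
    fi
done < /etc/terms
```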
As another example of command blocks, consider the case of creating a standard algebraic notation frontend to the dc command. dc is a UNIX utility that simulates a Reverse Polish Notation (RPN) calculator:[5]
{
    while read line; do
        echo "$(alg2rpn $line)"
    done
} | dc
We'll assume that the actual conversion from one notation to the other is handled by a function called alg2rpn. It takes a line of standard algebraic notation as an argument and prints the RPN equivalent on the standard output. The while loop reads lines and passes them through the conversion function, until an EOF is typed. Everything is executed inside the command block and the output is piped to the dc command for evaluation.
Reading user input
The other type of task to which read is suited is prompting a user for input. Think about it: we have hardly seen any such scripts so far in this book. In fact, the only ones were the modified solutions to Task 5-4, which involved select.
As you've probably figured out, read can be used to get user input into shell variables.
We can use echo to prompt the user, like this:
echo -n 'terminal? '
read TERM
echo "TERM is $TERM"
Here is what this looks like when it runs:
terminal? wy60
TERM is wy60
However, shell convention dictates that prompts should go to standard error, not standard output. (Recall that select prompts to standard error.) We could just use file descriptor 2 with the output redirector we saw earlier in this chapter:
echo -n 'terminal? ' >&2
read TERM
echo TERM is $TERM
We'll now look at a more complex example by showing how Task 5-5 would be done if select didn't exist. Compare this with the code in Chapter 5:
echo 'Select a directory:'
done=false
while [ $done = false ]; do
    num=1
    for direc in $DIR_STACK; do
        echo "$num) $direc"
        num=$((num+1))
    done
    echo -n 'directory? '
    read REPLY
    if [ $REPLY -lt $num ] && [ $REPLY -gt 0 ]; then
        set - $DIR_STACK
        # statements that manipulate the stack...
        break
    else
        echo 'invalid selection.'
    fi
done
The while loop is necessary so that the code repeats if the user makes an invalid choice. select includes the ability to construct multicolumn menus if there are many choices, and better handling of null user input.
Before leaving read, we should note that it has eight options: -a, -d, -e, -n, -p, -r, -t, and -s.[6] The first of these options allows you to read values into an array. Each successive item read in is assigned to the given array starting at index 0. For example:
$ read -a people
alice duchess dodo
$ echo ${people[2]}
dodo
$
In this case, the array people now contains the items alice, duchess, and dodo.
A delimiter can be specified with the -d option; read then takes input up to the first character of the delimiter string, rather than up to a newline. For example:
$ read -d stop aline
alice duches$
$ echo $aline
alice duche
$
The option -e can be used only with scripts run from interactive shells. It causes readline to be used to gather the input line, which means that you can use any of the readline editing features that we looked at in Chapter 2.
The -n option specifies how many characters read will accept. For example, if we tell it to read only ten characters, it returns as soon as that many have been typed:
$ read -n 10 aline
abcdefghij$
$ echo $aline
abcdefghij
$
The -p option followed by a string argument prints the string before reading input. We could have used this in the earlier examples of read, where we printed out a prompt before doing the read. For example, the directory selection script could have used read -p 'directory? ' REPLY.
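With -p, the two-line echo-and-read prompt from earlier collapses into a single command (like select, read -p sends its prompt to standard error, and only when input is coming from a terminal):

```shell
read -p 'terminal? ' TERM
echo "TERM is $TERM"
```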
read lets you input lines that are longer than the width of your display by providing a backslash (\) as a continuation character, just as in shell scripts. The -r option overrides this, in case your script reads from a file that may contain lines that happen to end in backslashes. read -r also preserves any other escape sequences the input might contain. For example, if the file hatter contains this line:
A line with a\n escape sequence
Then read -r aline will include the backslash in the variable aline, whereas without the -r, read will "eat" the backslash. As a result:
$ read -r aline < hatter
$ echo -e "$aline"
A line with a
escape sequence
$
However:
$ read aline < hatter
$ echo -e "$aline"
A line with an escape sequence
$
The -s option forces read to not echo the characters that are typed to the terminal. This can be useful in cases where a shell may want to take single keystroke commands without displaying the typed characters on the terminal (e.g., moving something around with the arrow keys). In this case it could be combined with the -n option to read a single character each time in a loop: read -s -n1 key
The last option, -t, allows a time in seconds to be specified. read will wait the specified time for input and then finish. This is useful if you want a script to wait for input but continue processing if nothing is supplied.
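Here is a sketch of one common use, combining -t with -p; the five-second timeout and the prompt text are arbitrary choices. read returns a non-zero exit status when the timer expires, so its result can drive an if:

```shell
# Wait up to five seconds for confirmation, then fall back
# to a default answer
if read -t 5 -p 'Proceed? (y/n) ' answer; then
    echo "answer: $answer"
else
    echo                       # move past the unanswered prompt
    echo 'no answer within five seconds; proceeding anyway'
fi
```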
[2] You must use a double backslash if you don't surround the string that contains them with quotes; otherwise, the shell itself "steals" a backslash before passing the arguments to echo.
[4] printf is not available in versions of bash prior to version 2.02.
[5] If you have ever owned a Hewlett-Packard calculator you will be familiar with RPN. We'll discuss RPN further in one of the exercises at the end of this chapter.
[6] -a, -d, -e, -n, -p, -t and -s are not available in versions of bash prior to 2.0.