Using Files

Once you make a file object with open, you can call its methods to read from or write to the associated external file. In all cases, file text takes the form of strings in Python programs; reading a file returns its text in strings, and text is passed to the write methods as strings. Reading and writing methods come in multiple flavors; Table 9-2 lists the most common. Here are a few fundamental usage notes:

File iterators are best for reading lines

广告:个人专属 VPN,独立 IP,无限流量,多机房切换,还可以屏蔽广告和恶意软件,每月最低仅 5 美元

Though the reading and writing methods in the table are common, keep in mind that probably the best way to read lines from a text file today is to not read the file at all—as we’ll see in Chapter 14, files also have an iterator that automatically reads one line at a time in a for loop, list comprehension, or other iteration context.

Content is strings, not objects

Notice in Table 9-2 that data read from a file always comes back to your script as a string, so you’ll have to convert it to a different type of Python object if a string is not what you need. Similarly, unlike with the print operation, Python does not add any formatting and does not convert objects to strings automatically when you write data to a file—you must send an already formatted string. Because of this, the tools we have already met to convert objects to and from strings (e.g., int, float, str, and the string formatting expression and method) come in handy when dealing with files. Python also includes advanced standard library tools for handling generic object storage (such as the pickle module) and for dealing with packed binary data in files (such as the struct module). We’ll see both of these at work later in this chapter.

close is usually optional

Calling the file close method terminates your connection to the external file. As discussed in Chapter 6, in Python an object’s memory space is automatically reclaimed as soon as the object is no longer referenced anywhere in the program. When file objects are reclaimed, Python also automatically closes the files if they are still open (this also happens when a program shuts down). This means you don’t always need to manually close your files, especially in simple scripts that don’t run for long. On the other hand, including manual close calls can’t hurt and is usually a good idea in larger systems. Also, strictly speaking, this auto-close-on-collection feature of files is not part of the language definition, and it may change over time. Consequently, manually issuing file close method calls is a good habit to form. (For an alternative way to guarantee automatic file closes, also see this section’s later discussion of the file object’s context manager, used with the new with/as statement in Python 2.6 and 3.0.)

Files are buffered and seekable.

The prior paragraph’s notes about closing files are important, because closing both frees up operating system resources and flushes output buffers. By default, output files are always buffered, which means that text you write may not be transferred from memory to disk immediately—closing a file, or running its flush method, forces the buffered data to disk. You can avoid buffering with extra open arguments, but it may impede performance. Python files are also random-access on a byte offset basis—their seek method allows your scripts to jump around to read and write at specific locations.