
File objects are Python code’s main interface to external files on your computer. Files are a core type, but they’re something of an oddball—there is no specific literal syntax for creating them. Rather, to create a file object, you call the built-in open function, passing in an external filename and a processing mode as strings. For example, to create a text output file, you would pass in its name and the 'w' processing mode string to write data:

>>> f = open('data.txt', 'w')      # Make a new file in output mode
>>> f.write('Hello\n')             # Write strings of characters to it
6
>>> f.write('world\n')             # Returns number of characters written in Python 3.0
6
>>> f.close()                      # Close to flush output buffers to disk

This creates a file in the current directory and writes text to it (the filename can be a full directory path if you need to access a file elsewhere on your computer). To read back what you just wrote, reopen the file in 'r' processing mode, for reading text input—this is the default if you omit the mode in the call. Then read the file’s content into a string, and display it. A file’s contents are always a string in your script, regardless of the type of data the file contains:

>>> f = open('data.txt')           # 'r' is the default processing mode
>>> text = f.read()                # Read entire file into a string
>>> text
'Hello\nworld\n'
>>> print(text)                    # print interprets control characters
Hello
world

>>> text.split()                   # File content is always a string
['Hello', 'world']
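
The same write-then-read round trip can also be sketched as a standalone script. This version is an illustration rather than part of the session above: it assumes the with statement (not introduced in this section) to close each file automatically, and it uses a temporary directory from the standard tempfile module so that no existing data.txt is overwritten.

```python
import os
import tempfile

# Work in a throwaway directory so no existing file is touched.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'data.txt')

    # The file is flushed and closed automatically when the block exits.
    with open(path, 'w') as f:
        count = f.write('Hello\n')      # write returns the characters written
        count += f.write('world\n')

    # 'r' (text input) is the default mode.
    with open(path) as f:
        text = f.read()                 # entire file as one string

words = text.split()
print(count, repr(text), words)
```

The explicit close call in the interactive session does the same job as leaving the with block here; which form to use is largely a matter of style in short scripts.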

Other file object methods support additional features we don’t have time to cover here. For instance, file objects provide more ways of reading and writing (read accepts an optional maximum size, counted in characters for text files and in bytes for binary files; readline reads one line at a time; and so on), as well as other tools (seek moves to a new file position). As we’ll see later, though, the best way to read a file today is to not read it at all: files provide an iterator that automatically reads line by line in for loops and other contexts.
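
As a rough preview of those tools, the following sketch exercises read with a size, readline, seek, and the file iterator on a small file. The file contents match the earlier example; the temporary directory and the variable names are assumptions made only for this illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'data.txt')
    with open(path, 'w') as f:
        f.write('Hello\nworld\n')

    with open(path) as f:
        first_two = f.read(2)           # read at most 2 characters
        rest_of_line = f.readline()     # read up to and including the next '\n'
        f.seek(0)                       # move back to the start of the file
        # The file object is its own line iterator: one line per loop pass.
        lines = [line.rstrip('\n') for line in f]

print(repr(first_two), repr(rest_of_line), lines)
```

Iterating over the file reads one line at a time instead of loading everything into memory at once, which is why it tends to be preferred for large files.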

We’ll meet the full set of file methods later in this book, but if you want a quick preview now, run a dir call on any open file and a help on any of the method names that come back:

>>> dir(f)
[ ...many names omitted...
'buffer', 'close', 'closed', 'encoding', 'errors', 'fileno', 'flush', 'isatty',
'line_buffering', 'mode', 'name', 'newlines', 'read', 'readable', 'readline',
'readlines', 'seek', 'seekable', 'tell', 'truncate', 'writable', 'write',
'writelines']

>>> help(f.seek)
...try it and see...

Later in the book, we’ll also see that files in Python 3.0 draw a sharp distinction between text and binary data. Text files represent content as strings and perform Unicode encoding and decoding automatically, while binary files represent content as a special bytes string type and allow you to access file content unaltered (the following partial example assumes there is already a binary file in your current directory):

>>> data = open('data.bin', 'rb').read()       # Open binary file
>>> data                                       # bytes string holds binary data
>>> data[4:8]

Although you won’t generally need to care about this distinction if you deal only with ASCII text, Python 3.0’s strings and files are an asset if you deal with internationalized applications or byte-oriented data.
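
To make the text/binary distinction concrete, here is a minimal sketch that creates its own files rather than relying on an existing data.bin. The sample bytes and the non-ASCII string are made up for illustration and are not the contents of the file used above; the explicit encoding='utf-8' argument is likewise a choice made for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Binary mode: content is a bytes object, passed through unaltered.
    bin_path = os.path.join(tmpdir, 'data.bin')
    with open(bin_path, 'wb') as f:
        f.write(b'\x00\x00\x00\x07spam\x00\x08')   # made-up sample bytes
    with open(bin_path, 'rb') as f:
        data = f.read()
    chunk = data[4:8]                   # slicing bytes yields bytes

    # Text mode: str is encoded on output and decoded on input.
    txt_path = os.path.join(tmpdir, 'unicode.txt')
    with open(txt_path, 'w', encoding='utf-8') as f:
        f.write('sp\xe4m')              # 4 characters, one of them non-ASCII
    with open(txt_path, 'rb') as f:
        raw = f.read()                  # the encoded bytes as stored on disk
    with open(txt_path, encoding='utf-8') as f:
        decoded = f.read()              # decoded back into a 4-character str

print(chunk, raw, decoded, len(raw), len(decoded))
```

Reading the same text file in both modes shows the difference: the binary read returns five encoded bytes, while the text read returns the original four-character string.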