Every open file has an implicit pointer which indicates where data will be read and written. Normally this defaults to the start of the file, but if you use a mode of "a" (append) then it defaults to the end of the file. It's also worth noting that the "w" mode will truncate your file (i.e. delete all the contents) even if you add "+" to the mode.
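If you want to check those starting positions yourself, here is a minimal sketch. It assumes Python 3 and a throwaway file in a temporary directory (the modes_demo.txt name is made up for illustration); tell() reports where the pointer currently sits.

```python
import os
import tempfile

# Hypothetical scratch file, not one from the examples in this answer.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(path, "w") as fd:
    fd.write("0123456789")          # ten characters to measure against

with open(path, "r+") as fd:
    start_pos = fd.tell()           # read/update mode: pointer at the start

with open(path, "a") as fd:
    append_pos = fd.tell()          # append mode: pointer at the end

with open(path, "w+") as fd:
    after_truncate = fd.read()      # "w+" emptied the file when it was opened

print(start_pos, append_pos, repr(after_truncate))
```

The last open() is the one to watch out for: the "+" adds read access, but it does not stop "w" from wiping the file first.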
Whenever you read or write N characters, the read/write pointer will move forward that amount within the file. I find it helps to think of this like an old cassette tape, if you remember those. So, if you executed the following code:
fd = open("testfile.txt", "w+")
fd.write("This is a test file.\n")
fd.close()
fd = open("testfile.txt", "r+")
print fd.read(4)
fd.write(" IS")
fd.close()
... it should end up printing "This" and then leaving the file content as "This IS a test file." This is because the initial read(4) returns the first 4 characters of the file, because the pointer is at the start of the file. It leaves the pointer at the space character just after "This", so the following write(" IS") overwrites the next three characters with a space (the same as is already there) followed by "IS", replacing the existing "is".
You can use the seek() method of the file to jump to a specific point. After the example above, if you executed the following:
fd = open("testfile.txt", "r+")
fd.seek(10)
fd.write("TEST")
fd.close()
... then you'll find that the file now contains "This IS a TEST file."
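To watch the pointer move, you can pair seek() with tell(). This is only a sketch, assuming Python 3 and a scratch file in a temporary directory (seek_demo.txt is a made-up name), so it does not touch the testfile.txt used above.

```python
import os
import tempfile

# Hypothetical scratch file so the real testfile.txt is left alone.
path = os.path.join(tempfile.mkdtemp(), "seek_demo.txt")

with open(path, "w+") as fd:
    fd.write("This IS a test file.\n")
    fd.seek(10)                     # jump to the "t" of "test"
    fd.write("TEST")                # overwrite four characters in place
    pos_after_write = fd.tell()     # the write moved the pointer forward by 4
    fd.seek(0)                      # rewind before reading everything back
    content = fd.read()

print(pos_after_write)
print(content)
```

Note that the write replaces characters rather than inserting them, so the file stays the same length.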
All this applies on Unix systems, and you can test those examples to make sure. However, I've had problems mixing read() and write() on Windows systems. For example, when I execute that first example on my Windows machine it correctly prints "This", but when I check the file afterwards the write() has been completely ignored. However, the second example (using seek()) seems to work fine on Windows.
In summary, if you want to read/write from the middle of a file in Windows I'd suggest always using an explicit seek() instead of relying on the position of the read/write pointer. If you're doing only reads or only writes, relying on the implicit pointer is pretty safe.
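As a rough illustration of that advice, here is the first example again with an explicit seek() between the read and the write. It is a sketch under my own assumptions: Python 3, a scratch file in a temporary directory (explicit_seek_demo.txt is a made-up name), and seek(fd.tell()) as one way of pinning the pointer to the spot the read logically reached.

```python
import os
import tempfile

# Hypothetical scratch file, created fresh for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "explicit_seek_demo.txt")

with open(path, "w+") as fd:
    fd.write("This is a test file.\n")

with open(path, "r+") as fd:
    first_word = fd.read(4)         # reads "This"
    fd.seek(fd.tell())              # explicitly reposition before switching to writing
    fd.write(" IS")                 # overwrites " is" with " IS"

with open(path) as fd:
    content = fd.read()

print(first_word)
print(content)
```

The seek() looks redundant, but it means the write no longer depends on whatever state the preceding read left behind in the file's buffers.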
One final point - if you're specifying paths on Windows as literal strings, remember to escape your backslashes:
fd = open("C:\\Users\\johndoe\\Desktop\\testfile.txt", "r+")
Or you can use raw strings by putting an r at the start:
fd = open(r"C:\Users\johndoe\Desktop\testfile.txt", "r+")
Or the most portable option is to use os.path.join():
import os
fd = open(os.path.join("C:\\", "Users", "johndoe", "Desktop", "testfile.txt"), "r+")
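If you'd like to convince yourself that the three spellings give the same path, here is a small sketch. It uses ntpath, the Windows flavour of os.path from the standard library, purely so the comparison can also be run on a non-Windows machine; that substitution is my choice for the demonstration, and on Windows plain os.path.join() should behave the same way.

```python
import ntpath  # Windows implementation of os.path, importable on any platform

escaped = "C:\\Users\\johndoe\\Desktop\\testfile.txt"
raw = r"C:\Users\johndoe\Desktop\testfile.txt"
joined = ntpath.join("C:\\", "Users", "johndoe", "Desktop", "testfile.txt")

# All three should be the same string, with single backslashes as separators.
print(escaped == raw == joined)
```

Nothing is opened here, so the johndoe path does not need to exist for the comparison to run.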
You can find more information about file IO in the official Python docs.