Working with text files in Python

Creating a text file in Python

Use the %%writefile magic command on the first line of the cell, before the text:

%%writefile test.txt
Hello, this is a new file create using python ide.
This the second line of the file.
Writing test.txt
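%%writefile is a notebook magic, so it is not available in a plain Python script. As a rough sketch, the same file can be created with open() in write mode. The scratch directory and the variable names (scratch_dir, path, text) are assumptions made for this illustration so that no real file is touched.

```python
import os
import tempfile

# Hypothetical scratch directory so the example does not overwrite real files.
scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, "test.txt")

# Same two lines as the notebook cell above, typos included.
text = (
    "Hello, this is a new file create using python ide.\n"
    "This the second line of the file.\n"
)

# 'w' creates the file, or overwrites it if it already exists.
with open(path, "w", encoding="utf-8") as f:
    chars_written = f.write(text)  # write() returns the number of characters written

print(chars_written)
```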

Understanding the location of the file

pwd gives the location of the present working directory:

pwd()
'C:\\Users\\Vicky.Crasto\\OneDrive - Unilever\\Work_file_082021\\05_Other_learning\\NLP\\01_Udemy_JoseP'
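Outside a notebook, the working directory can be looked up with the standard library. The sketch below shows two equivalent ways; the names cwd_os, cwd_path and target are only illustrative.

```python
import os
from pathlib import Path

# Two standard-library ways to get the present working directory.
cwd_os = os.getcwd()      # returns a str
cwd_path = Path.cwd()     # returns a Path object

print(cwd_os)

# A relative name such as "test.txt" is resolved against this directory.
target = cwd_path / "test.txt"
print(target)
```

Knowing this directory matters because open("test.txt") looks for the file relative to it.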

Opening the file

myfile = open("test.txt")
myfile
<_io.TextIOWrapper name='test.txt' mode='r' encoding='cp1252'>

This is the file object that Python holds in memory for the open file. Its representation shows the file name, the mode, and the encoding.

Using .read() and .seek()

myfile.read()
'Hello, this is a new file create using python ide.\nThis the second line of the file.\n'
myfile.read()
''

The second call returns an empty string because the cursor has already reached the end of the file, so there is nothing more to read. To read the contents again, we need to reset the cursor to the start.

Resetting the cursor

myfile.seek(0)
0
myfile.read()
'Hello, this is a new file create using python ide.\nThis the second line of the file.\n'
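The cursor behaviour can be checked in a small self-contained sketch. It writes its own hypothetical demo.txt into a temporary directory (an assumption for this example, not the notebook's test.txt) and uses tell() to show where the cursor is.

```python
import os
import tempfile

# Hypothetical demo file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\nsecond line\n")

with open(path, encoding="utf-8") as f:
    start = f.tell()         # cursor position right after opening: 0
    first_read = f.read()    # reads everything and moves the cursor to the end
    second_read = f.read()   # nothing left to read, so this is ''
    f.seek(0)                # move the cursor back to the start
    third_read = f.read()    # the full contents again

print(repr(second_read))
print(third_read == first_read)
```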

Using .readlines()

readlines() reads the whole file and returns its lines as a list of strings. Note: all the data is held in memory, so large files need to be handled carefully.

myfile.seek(0)
myfile.readlines()
['Hello, this is a new file create using python ide.\n',
 'This the second line of the file.\n']
myfile.close()
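For large files, one option is to pull a single line at a time with readline() instead of loading everything with readlines(). The following is a minimal sketch on a hypothetical three-line file; the file name and variable names are assumptions for the example.

```python
import os
import tempfile

# Hypothetical demo file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("line one\nline two\nline three\n")

with open(path, encoding="utf-8") as f:
    first = f.readline()       # only the first line is read
    second = f.readline()      # the cursor has advanced, so this is the second line
    remaining = f.readlines()  # whatever is left, as a list
    at_end = f.readline()      # '' signals the end of the file

print(repr(first))
print(remaining)
```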

Writing a file - Understanding the mode

A file can be opened in different modes:

  • 'r' read only (the default)
  • 'w+' read and write; creates the file if it does not exist and overwrites an existing file
  • 'wb+' read and write in binary mode (for non-text files such as PDFs)

myfile = open("test.txt", mode='w+')
myfile.write("This is an additional file")
26
myfile.seek(0)
myfile.readlines()
['This is an additional file']

Hence the existing data is deleted and replaced by the new data.

myfile.close()
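The effect of the modes can be sketched on a throwaway file. The file name and contents below are assumptions made for the illustration; the final step reads the same file back in binary mode ('rb') to show that binary modes work with bytes rather than str.

```python
import os
import tempfile

# Hypothetical demo file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# 'w' creates the file and writes the initial content.
with open(path, "w", encoding="utf-8") as f:
    f.write("original content")

# 'w+' truncates the existing file, then allows both writing and reading.
with open(path, "w+", encoding="utf-8") as f:
    f.write("new content")
    f.seek(0)                  # go back to the start before reading
    after_overwrite = f.read()

# 'rb' reads raw bytes instead of decoded text.
with open(path, "rb") as f:
    raw = f.read()

print(after_overwrite)
print(raw)
```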

Appending a file

Passing the mode 'a' opens the file and puts the pointer at the end, so anything written is appended. The 'a+' mode used here also allows reading.

myfile = open("test.txt", 'a+')
myfile.write("\nAppending a new line to the existing line")
myfile.seek(0)
print(myfile.read())
This is an additional file
Appending a new line to the existing line
Appending a new line to the existing line
Appending a new line to the existing line
Appending a new line to the existing line
myfile.close()
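The repeated line in the output above is consistent with the cell having been run several times, since every run in append mode adds another copy. A small sketch of that behaviour, using a hypothetical file in a temporary directory and running the append step twice:

```python
import os
import tempfile

# Hypothetical demo file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "append.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("This is an additional file")

# Each pass opens the file in 'a+' mode and appends one more line.
for _ in range(2):
    with open(path, "a+", encoding="utf-8") as f:
        f.write("\nAppending a new line")  # always written at the end of the file
        f.seek(0)                          # move to the start so the file can be read
        contents = f.read()

lines = contents.split("\n")
print(lines)
```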

Aliases and context managers

You can assign temporary variable names as aliases, and manage the opening and closing of files automatically using a context manager:

with open('test.txt','r') as txt:
    first_line = txt.readlines()[0]
print(first_line)
This is an additional file

With this method, the file is opened, read, and then closed automatically by the context manager once the indented block finishes.

first_line
'This is an additional file\n'
txt.read()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
C:\Users\VICKY~1.CRA\AppData\Local\Temp/ipykernel_9856/1416744708.py in <module>
----> 1 txt.read()

ValueError: I/O operation on closed file.

Hence the extracted line remains in the variable first_line, but the file has been closed by the context manager.
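The closing can also be checked directly through the file object's closed attribute. The sketch below additionally raises a simulated error inside the with block, to illustrate that the file is still closed in that case. The file and the RuntimeError are made up for the example.

```python
import os
import tempfile

# Hypothetical demo file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "context.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("This is an additional file\nA second line\n")

try:
    with open(path, encoding="utf-8") as txt:
        first_line = txt.readlines()[0]
        closed_inside = txt.closed               # False while inside the block
        raise RuntimeError("simulated failure")  # leave the block through an error
except RuntimeError:
    pass

closed_after = txt.closed  # True: the context manager closed the file anyway

print(closed_inside, closed_after)
print(repr(first_line))
```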

Iterating through a file

with open('test.txt', 'r') as txt:
    for line in txt:
        print(line, end='$$$$')
This is an additional file
$$$$Appending a new line to the existing line
$$$$Appending a new line to the existing line
$$$$Appending a new line to the existing line
$$$$Appending a new line to the existing line$$$$
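In the output above, the '$$$$' marker lands at the start of the next line because each line read from the file still ends with its own newline character. A common pattern, sketched here on a hypothetical file, is to strip that newline and number the lines with enumerate():

```python
import os
import tempfile

# Hypothetical demo file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "iterate.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("This is an additional file\nAppending a new line\n")

numbered = []
with open(path, encoding="utf-8") as txt:
    # Iterating over the file object yields one line at a time,
    # so the whole file is never held in memory at once.
    for number, line in enumerate(txt, start=1):
        numbered.append(f"{number}: {line.rstrip()}")  # rstrip() drops the trailing newline

for entry in numbered:
    print(entry)
```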